Two examples from the computer science review and publication process
A few days ago, I posted about getting the Ph.D. in computer science. As part of that post, I mentioned that I'd publish & discuss some of the reviews my papers have received. These aren't so nasty that they're funny, but I decided to post the full submission/review/submission/review/final chain as a way to help show what the publication and revision process looks like. I probably owe a better explanation of the delta between the papers, but - hey, the SIGCOMM, USENIX ATC, and SEA (algorithms) deadlines are all this week. :)
I've deliberately picked two papers with very different initial reviews. Both were eventually accepted, and neither was accepted upon first submission. One received a best paper award. Both have over 100 citations. I've picked these not because they're representative -- I've written papers with low citation counts also -- but because they illustrate the wide disparity in review constructiveness and tone that even decent, publication-worthy papers can receive.
The full submitted and accepted papers and reviews are:
- Best-Path vs. Multi-Path Overlay Routing (Infocom 2003, then IMC 2003): Submitted 1 (Infocom) | Reviews 1 | Submitted 2 (IMC) | Reviews 2 | Final camera-ready
- FAWN: A Fast Array of Wimpy Nodes (OSDI 2008, then SOSP 2009): Submitted 1 (OSDI) | Reviews 1 | Submitted 2 (SOSP) | Reviews 2 | Final camera-ready
Let's start with the harsh review - review 1 for best-path vs multipath (initially titled "Loss-Optimized Routing in Overlay Networks").
All recommendations for this paper:
- 1 ("Definite Reject"),
- 2 ("Likely reject, top 60% but not top 40%")
- 4 ("Likely accept, top 20% but not top 10%")
- 2
- 3
Contribution: "The paper does not propose any new idea or methodology."
Weaknesses: "The paper is very superficial." [ouch]
"The data set studied is rather small." [This part of the review I object to: I agree with the reviewer that it would have been better to have a richer dataset, but at the time, it wasn't yet possible to obtain one. At that time, the RON testbed was one of the largest overlay network testbeds in the world; PlanetLab was just starting to emerge from being a gleam in Larry Peterson's eye. Soon after this, of course, PlanetLab took the crown and has tightly held on to it, and enabled order-of-magnitude larger experiments than those we were able to run. ]
"Presentation wise, I find the paper to bog down in repeated details."
I'll give this review credit on one front: it didn't go personal. Some have. I can't find it, but IIRC, Valerie Aurora once collected snippets of the harshest reviews at a major systems conference and presented them during the outrageous opinions session. They were harsh. I recall one (funny, but ouch if you were the student who got that review) that said:
"Weaknesses: this system attempts to achieve something extremely undesirable.
Strengths: It fails to achieve its undesirable goal."
Back to ours: "The paper does not present a significant advance, as stated by the authors them selves (first step);" (sic) (the paper says "This work is only a first step.") Ouch. I hate getting penalized for being honest. Yes, I removed that phrasing for the next version we submitted.
Was this reviewer wrong? Maybe, maybe not - but this is an example of the kind of pretty harsh, "this is basically a worthless paper" review you're likely to encounter from time to time.
For the IMC version, we changed the focus of the paper to emphasize the measurement aspect as the fundamental contribution. Because we had the time, we also added a new, much larger measurement dataset with 8x more samples than the largest of our previous two datasets. We had, during that time, added more nodes to the set - including some international hosts - which helped address some of the criticism from the Infocom reviews. And we added one of my favorite improvements to the dataset: time-separated packets on the same path. In all, the paper and data we submitted to IMC were substantially more interesting than what we sent to Infocom the previous year, and they enabled us to answer several more interesting questions than we'd tried to hit the first time. That first review was pretty harsh, but the rejection and feedback helped turn this into a better chunk of work.
Paper review 2: My favorite rejection yet
I'm serious about that - we deliberately acknowledged the help of the OSDI reviewers who rejected our paper in our SOSP camera-ready, because they gave us some of the most supportive and constructive critical feedback I've received. I've now had several such experiences, such as +Mike Dahlin's shepherding of our COPS paper at SOSP 2011, as well as many not-so-great reviewing experiences. The positive ones stick in your mind and, hopefully, create a good incentive to emulate them. As a reviewer, know that (sometimes) your review can and will result in substantial improvements to the work, even moving the research itself forward. Not always - sometimes you write what you think is a brilliantly great review and the authors will ignore it. But that's life.
Some things worth picking out of the reviews for OSDI:
"A detailed discussion about how to ensure consistency would enrich the paper;" and
"Your footnote about weak consistency is disappointing. Do the applications (large DNS zone, e-commerce catalog) OK with weak consistency? Please discuss why weak consistency is OK, or if it's not OK please implement sufficient consistency for your applications."
(The reviewers were absolutely right. Within the cluster setting, there was no need to abandon strong consistency. This led, in large part, to the last part of +Amar Phanishayee's thesis, in which he proved the correctness of his new method for chain replication on a consistent hashing ring.)
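If you haven't run into the idea before, here's a minimal sketch of what "chain replication on a consistent hashing ring" means in the abstract: keys hash to positions on a ring, the first node clockwise from a key owns it, and that node plus its next few successors form the replication chain. This is just the textbook flavor under simplifying assumptions (no failures, no joins, no virtual nodes), not Amar's actual protocol; the node names and parameters are invented for illustration.

```python
# Toy sketch of replica placement for chain replication on a consistent
# hashing ring. Purely illustrative -- not the actual FAWN-KV protocol,
# and it ignores failures, joins, and virtual nodes entirely.
import bisect
import hashlib

RING_BITS = 32

def ring_hash(s: str) -> int:
    """Map a string to a position on the [0, 2^RING_BITS) ring."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (1 << RING_BITS)

class Ring:
    def __init__(self, node_names):
        # Each node sits at the ring position given by hashing its name.
        self.points = sorted((ring_hash(n), n) for n in node_names)

    def chain_for(self, key: str, r: int = 3):
        """Return the r-node chain responsible for `key`: the first node
        clockwise from hash(key) is the chain head, and its r-1 successors
        on the ring are the rest of the chain. In chain replication, writes
        enter at the head and propagate down; reads are served by the tail."""
        pos = ring_hash(key)
        idx = bisect.bisect(self.points, (pos, ""))
        n = len(self.points)
        return [self.points[(idx + i) % n][1] for i in range(r)]

if __name__ == "__main__":
    ring = Ring(["node%d" % i for i in range(8)])
    print(ring.chain_for("some-key"))  # prints the 3 nodes in the chain for "some-key"
```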
"As it stands, it will need serious amounts of both rewriting and (more importantly) real technical work to make it into the truly great paper it should (and perhaps could) be"
If you have the chance to honestly say something like that in a paper review, instead of just being harsh, do so. Anyone submitting a paper is inviting criticism not just of the paper, but of the work, and when your paper is rejected, you always end up asking "is this work worth continuing?" It's very helpful as a reviewer to be clear about whether you're saying the _paper as submitted_ is bad or the idea in general is bad. Don't assume the authors are as robust as you - they may be first- or second-year Ph.D. students submitting their first paper. You probably freaked out a bit the first time you had a paper rejected -- I know I did. I think that one of my SIGCOMM rejections as a Ph.D. student involved a few tears.
Summary: "The PC liked much of this paper, but felt it was premature. Please focus on finishing the work itself (e.g., consistent remote update with failures), bringing out what was interesting/novel from the perspective of software architecture, and scaling to a larger number of (wimpy) nodes. The PC looks forward to seeing the improved/mature version of the work at the next major conference."
That was one review where we read it, and sprinted back to the lab to try to make the paper the best damn paper we could write.
The delta between the OSDI and SOSP submissions is pretty large. We took the cluster from 8 to 21 nodes, but most importantly, we implemented enough of the cluster partitioning and replication functionality to be able to do experiments such as observing the cluster-wide throughput while adding a new node. We drastically overhauled the way we thought about and talked about the TCO evaluation. We removed the handwavy discussion of the N log N cache size, because we hadn't proved that it was correct, and that ended up turning into a separate SOCC paper of its own. (Our number was right, but our thought about why it was right turned out to be wrong!) We added the protocol for chain replication on a ring, which turned into 1/3rd of a thesis for Amar. And we added some memory-saving tricks on the hash table, which became the precursor to SILT.
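To give a flavor of what I mean by a memory-saving hash table trick (this is my own toy paraphrase of the general approach, not the actual FAWN-DS or SILT data structure): keep only a small fragment of each key's hash, plus an offset into an append-only log, in memory, and verify the full key against the log entry on lookup. The bucket count, fragment size, and names below are all made up for illustration.

```python
# Toy sketch of a memory-frugal hash index: the in-memory table keeps only
# a small key fragment and an offset into an append-only log; the full key
# lives in the log and is verified on lookup. Illustrative only -- not the
# actual FAWN-DS or SILT layout.
import hashlib

FRAG_BITS = 15  # how many bits of the key hash to keep in memory

def key_hash(key: bytes) -> int:
    return int(hashlib.sha1(key).hexdigest(), 16)

class CompactStore:
    def __init__(self, num_buckets=1 << 16):
        self.num_buckets = num_buckets
        # Each bucket holds a list of (key_fragment, log_offset) pairs.
        self.index = [[] for _ in range(num_buckets)]
        self.log = []  # stands in for the append-only on-flash data log

    def put(self, key: bytes, value: bytes):
        h = key_hash(key)
        bucket = h % self.num_buckets
        frag = (h >> 64) & ((1 << FRAG_BITS) - 1)  # small fragment kept in DRAM
        offset = len(self.log)
        self.log.append((key, value))              # full key lives only in the log
        self.index[bucket].append((frag, offset))

    def get(self, key: bytes):
        h = key_hash(key)
        bucket = h % self.num_buckets
        frag = (h >> 64) & ((1 << FRAG_BITS) - 1)
        # The fragment can collide, so check candidates against the log;
        # that costs an extra read but keeps the in-memory index tiny.
        # Scan newest-first so the most recent write for a key wins.
        for f, offset in reversed(self.index[bucket]):
            if f == frag:
                stored_key, value = self.log[offset]
                if stored_key == key:
                    return value
        return None

if __name__ == "__main__":
    s = CompactStore()
    s.put(b"hello", b"world")
    print(s.get(b"hello"))  # b'world'
    print(s.get(b"nope"))   # None
```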
Further reading: "How Not To Review A Paper" (SIGMOD Record, 2008) - it was written for a reason.
Some caveats: Not all publication attempts happen like this. Papers get accepted the first time. Papers get rejected so many times that they eventually get killed. Sometimes the delta between submissions is way smaller than in either of these two papers. Your mileage will vary. The key is not to be surprised by anything you see in a review. Sometimes reviews really suck. Sometimes you want to hug the reviewer and make them a co-author. But the quality of the reviews you get is out of your control (mostly: if you submit a really bad paper, you're likely to get hastily written reviews); your task is to use everything you get back to your maximum advantage in improving your research, and to keep your spirits up through the process so that you've got the energy to do so.
Footnote: A great comment from +Michael Wood-Vasey on the g+ discussion for this post:
I try hard to teach the PhD students the general lesson that even though someone is an insensitive jerk with a personal vendetta against your advisor, no concern for human dignity and feelings, and acts with a primary agenda of promoting their own greatness, they still often have intellectually useful suggestions.
Or said more succinctly:
Just because they're a jerk doesn't mean they're wrong.
From the comments on the original post:

An anonymous commenter: "A plausible lesson from the second review to students (and perhaps advisors): please avoid submitting half-baked papers. You are just making the peer review process more miserable."

My reply: Hi, @Anon:1:17am - while my good sense suggests I shouldn't feed the trolls, I believe that your comment overlooks some deeper underlying tension in publication. For example, in the facebook discussion thread, David Brumley linked to a post by Merkle about the invention of public key crypto: http://www.merkle.com/1974/ . There's a strong tension between "publish fast" (to not get scooped) and "publish perfect", and, if you've ever served on a program committee, you'll have also seen the strong tension between "take interesting work!" and "take perfect work!". It's not at all clear that there's a perfect answer here, and there is therefore a lot of variance between PCs, and between individual reviewers, in their preferences. The pain of the peer review process is, I think, mostly a different topic from a few "potentially very good" papers getting submitted with work still to go. It's been covered a lot elsewhere, and I don't think I have much to contribute to it yet.

Another commenter: "Great blog post! High-quality, helpful reviews are an essential part of the scientific process. One of the reasons I stopped participating in program committees is that my career no longer allowed me the time to write quality reviews."

Valerie Aurora: "I don't think it was me who collected and showed paper reviews - although my memory *is* terrible, so if someone had a record of it happening I'd believe it. I hope the person who did gets credit!"