Tuesday, May 17, 2011

Thoughts on the Value of Peer Review in CS

This is a bit of a random first post, but it seemed like a blog was the best place to say something like this, so here it is. Stay tuned for future posts about goings on in my little corner of the CS/database research community.

There are lots of problems with the peer-review system as it is currently implemented for computer science conferences, and especially within the database community, where I do most of my publication. Jeff Naughton recently gave a keynote at ICDE [Link to PDF of his slides] describing the issues in detail, but briefly, the central issue is that review quality is low.  It's not uncommon for reviews to be frustratingly short, non-specific, or to demonstrate serious misunderstanding on the part of a reviewer.  Most CS researchers (not just in the database community) have review horror stories.

As an example, here's the entire content of the "weak points" and "detailed comments" sections of a review we recently received, from a reviewer I like to call "Dr. Specificity":

Major weak points: Various tradeoffs and heuristic choices are interesting, but the paper lacks novel conceptual ideas.

Detailed comments:  Interesting well-written paper on an important problem. I enjoyed reading it.

This sort of vague, wishy-washy review makes me think the Dr. is trying out his or her new random review generator, or maybe just cutting and pasting the same review into every paper.  (As an aside, I'd like to say that the vast majority of reviews I receive are not like this, especially in the top conferences, and are generally fair and thoughtful.)

When you only receive 3 reviews, one careless review like this can easily skew a paper's average score.  This effect is exacerbated by the lack of face-to-face program committee meetings in the database community, where such reviews could be discounted.  Instead, there's an "electronic meeting", where reviewers are allowed to read other reviews and are encouraged to discuss them with each other.  Unfortunately, some reviewers don't participate, and papers can end up accepted or rejected based simply on whether their average score is above or below some arbitrary threshold.
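To make the arithmetic concrete, here is a toy illustration (the scores and the accept threshold are invented, not from any real conference): with only three reviews, a single careless low score can drag an otherwise well-reviewed paper below the cutoff, while the same outlier among five reviews moves the mean much less.

```python
# Toy illustration (invented numbers): the effect of one careless
# review on a paper's mean score, on a 1-5 scale.

def mean(scores):
    return sum(scores) / len(scores)

threshold = 3.5                 # hypothetical accept cutoff

careful = [4, 4]                # two thoughtful, positive reviews
print(mean(careful))            # 4.0 -> comfortably above threshold

three = careful + [1]           # plus one careless "reject"
print(mean(three))              # 3.0 -> now below threshold

five = [4, 4, 4, 4, 1]          # the same outlier among five reviews
print(mean(five))               # 3.4 -> the outlier matters less
```

A median, or an in-person discussion that discounts the outlier, is far less sensitive to a single careless score, which is part of why face-to-face meetings help.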

There are lots of ways one might fix this system (add more reviews, have face-to-face meetings, hold reviewers accountable in some way, get better PC chairs, etc.).  Many of these require reviewers to do more work, so the net result has been grumbling and minor tweaks that haven't fixed the problem (which people have been complaining about since I published my first paper 10+ years ago).

In his keynote, one possible solution Jeff proposed was to eliminate peer review altogether.  Several young researchers also came out in favor of this idea on Twitter (@daniel_abadi and @sudiptdas).  The argument goes something as follows:  now that we're unencumbered by actual printed proceedings, there's no limit on the number of papers that can be accepted, so why not just accept them all?  No more crappy reviews, and no more grief and consternation about having your paper rejected!  Put everything up on a website, and let The Interwebs sort it out.

I believe this is a terrible idea, as I'll try to describe.

First something that most people who've never served on a program committee don't know: most submissions really aren't very good (except, of course, my rejected papers, which are all under-appreciated works of groundbreaking Science!).  For a database conference, I typically rate about 1 in 6 or 7 papers a weak accept or higher, even when I'm trying hard to give authors the benefit of the doubt and not be overly critical (I'm generally a more positive than average reviewer.)  It's not unusual for a PC chair from a conference with 400-500 submissions to struggle to fill a program of 60-70 papers.  Weeding out these lower quality papers is the primary role of the committee, and even broken reviewers like Dr. Specificity usually agree about them.  Winnowing down the papers in this way allows the community to pay attention to the ideas that are more complete, well formed, or have the potential for impact.  

Second, conferences only have so many slots for talks.  Someone has to decide who gets to talk, which appears to require the equivalent of a program committee.

Third, even an imperfect filter like the system we have now causes researchers to work extremely hard to write up their ideas in the clearest, most cogent way possible.  Eliminating the review system would discourage people from working hard on their papers and inhibit the exchange of ideas in the community.  Additionally, rejected papers are often revised and resubmitted in much better shape than they were before.  This is an important part of the process, and it would be lost by accepting everything.

Fourth, having publications in top conferences is one of the important quantifiable ways in which students and faculty are evaluated.  It's an imperfect metric -- I'm certainly not a fan of "counting papers" and most reasonable schools don't do this -- but I firmly believe a good researcher should consistently be able to get his or her best ideas published in a top place.   Accepting everything belittles the efforts of those doing the best work.  (Also, though this perhaps shouldn't be a primary consideration, in academia we've fought very hard to educate tenure committees on the value of conference publications in computer systems;  we'll look pretty silly if we suddenly declare conference publications aren't worth anything!)

Fifth, for academics, publication provides a form of validation, and a way to measure progress and success.  I often find that after a student has a couple of papers under their belt, they are more confident and assertive, which makes them better researchers.   If papers have no value, this important psychological benefit will be lost.

In summary:  we need to improve review quality in CS.  It can be frustrating, even devastating, to be rejected at the hands of inadequate reviews. But that doesn't mean we should discard peer review altogether, as it provides a number of important benefits.


  1. This comment has been removed by the author.

  2. Welcome to blog-o-land!

    In addition to your points supporting the conference submission model, I think it's also important for researchers to have deadlines that force them to produce a completed, well-written paper. In my limited experience, a project gets clearer once you put it on paper, and the agreement and confidence you and your co-authors feel about the arguments increase as you approach a deadline.

    If we want to keep the conference model, we might be able to improve it. Here are some ideas from someone who has never been on a committee:

    - We can reduce the amount of work each reviewer/PC member has to do. David Karger argues with data that we can likely make do with less than three reviewers per paper [1].

    - Every submitter that gets a bad review seems to want to rate their review. It's cheap and easy, so why not collect the data on what happens if you allow this? Heck, there are papers [2] and libraries [3] to detect biased reviewers for us.

    - We can offer more granularity in accept/reject decisions. I've heard that SIGGRAPH treats all accepts equally with respect to placement in the proceedings, but selects different papers for different forms of presentation at the conference. This might ruffle feathers, but you could likely accept more papers if some could be presented in a demo session rather than a 20-minute talk.

    - Going a bit deeper into reconstructing how paper reviews are done, we could consider Jens Dittrich's idea for paper bricks [4]. If a section really bothers a reviewer so much, what if they could eliminate it without killing off a great idea?

    [1] http://groups.csail.mit.edu/haystack/blog/2010/04/13/chi-do-we-really-need-three-reviewers-for-every-paper/

    [2] http://pages.stern.nyu.edu/~panos/publications/hcomp2010.pdf

    [3] http://code.google.com/p/get-another-label/

    [4] http://www.sigmod.org/publications/sigmod-record/1012/pdfs/06.forum.dittrich.pdf
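    The review-rating idea above can be sketched with a toy statistic (my own illustration, using made-up data; this is not the method from the linked papers or the get-another-label library): compare each reviewer's score on a paper with the mean of the other reviews of that same paper, and average the signed deviations per reviewer. Consistently large negative values flag a possible harsh outlier.

```python
# Hypothetical sketch of flagging outlier reviewers: for each review,
# compute the reviewer's deviation from the mean of the *other*
# reviews of the same paper, then average per reviewer.
from collections import defaultdict

# reviews[paper] = {reviewer: score on a 1-5 scale}  (made-up data)
reviews = {
    "paper_a": {"r1": 4, "r2": 4, "r3": 1},
    "paper_b": {"r1": 3, "r2": 4, "r3": 1},
    "paper_c": {"r1": 5, "r2": 4, "r3": 2},
}

def bias(reviews):
    """Mean signed deviation of each reviewer from their co-reviewers."""
    dev = defaultdict(list)
    for scores in reviews.values():
        for reviewer, s in scores.items():
            others = [v for r, v in scores.items() if r != reviewer]
            dev[reviewer].append(s - sum(others) / len(others))
    return {r: sum(d) / len(d) for r, d in dev.items()}

print(bias(reviews))  # r3's strongly negative average stands out
```

    A real system would need to handle reviewers with few assignments and papers that genuinely deserve low scores, but even this crude statistic shows the kind of signal that rated reviews could surface cheaply.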

  3. Let the flame war begin: http://dbmsmusings.blogspot.com/2011/05/why-sam-madden-is-wrong-about-peer.html

  4. I guess the main motive behind A+ conferences like SIGMOD and VLDB is to introduce ground-breaking research to the outside world. Many of the ideas in today's database systems are the result of some of the wonderful papers at those conferences. That motive is undermined if there is no peer review: people will be confused about what to read and what to pick up from a large number of papers (the majority of which might be meaningless without peer review), good work won't get sufficient recognition or popularity, and I am sure this degrades the value of those conferences.

  5. Nature ran an open peer review trial back in 2006. It sounds like a good idea, but for it to be sustainable, I think open peer review may have to be mandatory for all submissions and PC members rather than optional one way or the other.


  6. Sam,

    What is the societal cost of truthiness permeating computer science research due to lack of peer review? I can see some fields where it's potentially catastrophic (genetics, evolutionary psychology, etc.). In most branches of engineering and computer science, it seems to me that in the worst case we get suboptimal systems built in ignorance of lessons learned in the past.

    Given that, shouldn't we go to the YouTube model where people publish everything and come up with innovative means for discovery and validation? That's how most books are deemed successful (Dan Brown?). Why should papers be different? The only issue to me is whether the persistence of false claims in non-peer-reviewed academic papers causes lasting damage. Frankly, I don't see that being the case in the areas of CS/EE that I'm most familiar with.

  7. @ Vasanth -- One concern with eliminating peer review is that it turns publication into a popularity contest. I'm concerned famous people will have their papers read, and non-famous people will be ignored. It's critical for innovation for the less well-known people to have their ideas seen, and peer-review is a key way to identify those ideas. That's what surprises me about more junior researchers advocating eliminating peer review -- they stand to suffer the most!

    Looking at the publications from SIGMOD 2009 (http://db.csail.mit.edu/sigmod09dist.pdf) at least one of the top 5 most cited papers was by an author who had never published in SIGMOD before, which I think argues that peer review works!

    Also, no one is stopping people from posting their work on the web now -- in fact, many researchers do put their pre-prints on their websites and talk about them in their blogs already (but I bet most of those pre-prints are essentially ignored!)

  8. I can only agree; I am pro peer review because in most cases it helps, and it maintains a certain standard by which publications can be judged. There are just too many conferences where you can get anything published. Top conferences help you separate your ideas from the "noise".

    Of course reviews cause grief, because it always seems that the paper was "almost" accepted and was only rejected because some minor point was missing. But this could be countered by publishing the reviews along with the submitted version.

    @Adam: I don't like the paper bricks idea, because every paper should tell its own story, and this includes introduction, related work, evaluation, etc. Wasn't it the original idea of conference workshops to be more narrow or even "easier", and to help you get your idea right incrementally?

  9. @grundprinzip: one of the major problems with the current peer review system is exactly that papers are expected to contain a full story. This is a huge drawback and obstacle; it is NOT an advantage!

    A full story may be very hard to come up with in the first place - especially for young PhD students who are unaware of all the hidden rules of paper writing. Another drawback of full-story papers is that the 8-12 page paper is judged as a whole. So even if some parts of the paper are great and others are substandard, it is likely that the substandard pieces will kill the entire paper.

    Paper bricks fixes this. With paper bricks you can concentrate on the actual contributions and not spend your time selling them with a nice story, putting them into another context, or inventing yet another niche. You are back to spending more time on the actual technical contribution rather than on storytelling.

    @Sam: I agree that eliminating peer review is not going to work. Who is going to sift through the flood of papers? I do not see this happening.

    A major principle of reviewing is _editing_. The _editing_ process has to be improved to remove the variance and randomness in acceptance decisions. How come the quality of accepted works at top conferences varies that much? This has to be fixed.

    I strongly believe that this can be fixed easily. This is how:


  10. Jens -- I think it's an intriguing model, and I'd support trying it at something like CIDR. I do feel like it's hard to write, e.g., a P paper without a lot of supporting text describing the idea, and I think our community has been cultured not to value P enough in the overall process.

    A related way to address the problem of our community requiring full papers is to increase the quality of venues that publish partial work of the sort you advocate. CIDR was supposed to be an I or an I + PS conference, but has become increasingly focused on systems-oriented complete papers. Most of the workshops (where one would normally publish a HLSI or D paper) are not high enough quality for top researchers to consider submitting to.
