Anonymity of peer review reports ‘definitely’ enables egregious behavior

Added, from Jenny Saul: “Those who want more will want to look at Carole Lee and Christian Schunn’s paper on philosophy review practices. A key point that comes out there is how much nastier philosophers are than other reviewers studied.”

In the last couple years, I have presided over or assisted in peer-review processes for journal issues, anthologies, and conferences in Philosophy, with one consistently repeated shock across all venues, at least in my limited experience so far: anonymized peer-review seemed to bring out something vindictive in almost half of referees. Everyone who’s had an infamous “Reviewer #2” experience may be nodding right now, but I did not expect this. (I’ve gotten my own wee share of mean reviews, yes. But I am still surprised.) It caused me to seriously question whether doubly anonymous peer review is proven to be effective and good. I also thought that perhaps my impression was idiosyncratic.

I went looking for research to reaffirm the worth of peer-review, but I found little empirical verification that peer-review in journals achieves desired ends. I was relieved to find Hilda Bastian’s recent PLOS blog post, “Weighing Up Anonymity and Openness in Publication Peer Review,” in which she announced she had “taken a deep dive into this literature.” She makes only three unqualified statements, and the first of them is that there is not a lot of great data:

But first, what evidence do we have that masking the identities of authors and peer reviewers achieves what it is meant to?

Well, it’s complicated. Which means it really needs a solid, up-to-date systematic review… We don’t have an overwhelming evidence basis for anything.

Ouch. That gets me right in the justified true belief. Her second firm finding confirmed something I’ve always longed to resist when students and colleagues allege it, just because it’s rather depressing (condensed below to avoid Bastian’s penchant for referring to anonymity as “blind”):

Institutionalizing anonymity [is] only partially successful at hiding authors’ identities, and mostly only when people in their field don’t know what authors have been working on.

Admittedly, she focuses on biomedical publications, but her review of the evidence includes non-biomed pubs, notably Budden’s (2008) comparative study suggesting that Behavioral Ecology saw more women published after changing to doubly anonymized peer review (which we have previously posted on here). She does not find that this study compellingly establishes that anonymizing authors reduces gender bias, although she notes evidence that at some science journals, “odds are stacked against women,” and there are “clear signs of other biases that have been shown at some journals,” notably status bias.

The only other really conclusive finding she offers is one that underlines the problem which sent me on my hunt:

On the other hand, the anonymity of peer review reports definitely enables negative, and even egregious, behavior.

Take heart, those of you with Reviewer #2 scars! You are not alone. Peer reviewers were more likely to be courteous when they, the reviewers, did not have anonymity:

Peer reviewers were more likely to substantiate the points they made when they knew they would be named. They were especially likely to provide extra substantiation if they were recommending an article be rejected, and they knew their report would be published if the article was accepted anyway.
In some studies, when the reviewers knew they would be named, they were likely to be more courteous or regarded as helpful by the authors.
There’s no support here for the concern that naming peer reviewers leads to systematically less critical reviews – and some support for improvement.
There was one large effect: many peer reviewers declined the invitation to peer review when they knew there was a chance they would be named – especially when they knew their colors would be nailed to the public mast if the article was published.

The results of Bastian’s investigations give me some hope that it is possible to gather evidence helpful to imagining better systems of quality-control and publication. I remain committed to anonymizing authors, since status bias seems no better to me than gender bias. But the porousness of author-identity masking, and the conduct of anonymized referees, gives me food for future thought.

25 thoughts on “Anonymity of peer review reports ‘definitely’ enables egregious behavior”

  1. I remain committed to anonymizing authors, since status bias seems no better to me than gender bias.

    I think that a similar thought cuts in the other direction, though. I know that I, a person of relatively low professional status, would be much less likely to accept refereeing requests if I knew that the authors, who may well be people I need the favor or good will of, or at least not the ill-will of, would know it was _me_ who said their paper needed more work, wasn’t original enough, or whatever. (There could be a temptation to suck up, too, I’m sure, though I’m less worried about that.) Unless we don’t want junior (or less) people to do any refereeing, it seems that there is pretty good reason to keep the referees anonymous as well.

  2. Such interesting and important issues…

    Those who want more will want to look at Lee and Schunn’s paper on philosophy review practices: http://faculty.washington.edu/c3/Lee_Schunn_2011.pdf. A key point that comes out there is how much nastier philosophers are than other reviewers studied.

    I worry a lot about non-anonymous reviewing for the reason Matt gives, and because it’s already really hard to find reviewers. A more direct intervention to produce more helpful reviews would be explicit editorial guidance regarding this *and* perhaps enforcement– e.g. asking reviewers to rewrite. The BPA/SWIP guidelines suggest an explicit editorial policy on reviews, and that one might want to model this policy on Cognition’s excellent one: http://bpa.ac.uk/uploads/Good%20Practice%20Scheme/Cognition%20reviewing%20policy.pdf. A sample:

    Reviewers have a responsibility both to the science and to the authors who are trying to advance that science. This responsibility includes helping the author better his/her paper and, if necessary, better his/her science. Too often, papers attract negative reviews that fail to provide constructive advice on how to better the presentation of the research, or how to better the research itself. Too often, reviewers confuse bad presentation with bad science. Sometimes, of course, it is very difficult to know how best to advise an author of a particularly weak paper. But in these cases, the tone of the review is as important as its content.
    We would ask you, as a reviewer, to please keep these points in mind, and remember that the role of the editors and the reviewers is as much to reach a consensus on how the author could improve the impact of their research as it is to reach a consensus on whether a paper should be accepted, sent back for revision, or rejected. Too often, we hear our colleagues (or even ourselves) refer to their experiences of the editorial process (across a range of journals) as ranging from unconstructive to confrontational. If the author can respect the editorial process as being cooperative and constructive, rather than confrontational, the journal as a whole, and its reviewers, will be held in greater respect.

  3. Matt, clarify: How would you know that the authors are people you need the favor or good-will of? I’m suggesting a system in which peer-reviewers’ identities are known but authors’ identities are kept from reviewers. If the driving concern is that one needs the favor or good-will of others, then wouldn’t this support the contention that signing one’s name to a referee report entails courtesy and respect in the course of rejection? I disagree that the risk of offending someone with courteous and respectful criticism is a sufficiently weighty counter-reason compared to the high likelihood of egregious and crummy behavior of anonymous referees.

    I agree that refereeing should be done by the tenured, anyway. I was hardly ever asked to referee when junior, and said no when I was. But consider my post in conjunction with Justin’s at Daily Nous on “Getting Credit for Peer Review” http://dailynous.com/2015/07/03/getting-credit-for-peer-review/
    and entertain a possible world in which refereeing was a worthy, credit-gaining, good thing to do.

  4. Thanks for the provision of the policy at Cognition’s website, Jenny! (Also, I should just add your reference to the Lee-Schunn paper to the post, will do.) Agreed that it is currently exceedingly difficult to find reviewers. I’ve been amazed how many ‘declines’ I can get on one article request, and mutter curses at the peer-review process when the number goes over 25. However, I don’t think the difficulty of finding reviewers is a good reason not to change the system. On the contrary, I think the difficulties that editors have would be greatly alleviated by improved systems including public and widely shared referee banks by name, publication of referees in conjunction with Justin’s idea at Daily Nous of credit or prestige for peer-review, and so on. I don’t believe we’ve done all we can, as a profession, to improve peer-review burden-sharing.

  5. Hi Kate, well, first I like to believe that I use (appropriate amounts of) courtesy and respect in all of my referee reports already, though of course many people don’t. But, as to your question, sometimes it’s easy to know who the author is, or have a good idea, especially in fairly small fields. (I’ve refereed for several very good journals that explicitly say that, even if you know or have a good idea who the author is, please don’t say ‘no’ just on that ground, as it’s too hard to get qualified referees.) More generally, even if I don’t know if the person whose paper I’m reading is someone I might need the good will or favor of at some point, it doesn’t seem unreasonable, especially very early in one’s career (or before one’s career has really even started) to not want to take risks in annoying “important” people by rejecting their papers or criticizing their arguments. I’m sure we all know people, often “important” ones, who don’t take criticism well, even when it is appropriately respectful. I could easily imagine myself thinking, “why risk it?”, and just saying no if it was going to be revealed that I was the referee. Maybe the risk is small for established people (though even there, I’m not sure), and maybe it comes out balanced properly for the profession as a whole, but I’m not sure it’s fair to ask particular individuals to take it on. I do know I’d be hesitant to take on the risk, and I suspect I’d not be alone.

  6. Hi Kate,

    I think that releasing the names of the peer reviewers might be useful, but only if it were done after the review was complete.

    If the names were released before the review was complete then the name of the reviewer, rather than the quality of the suggestions, might affect the uptake the author gives to the review. For example, if an author received a review from a reviewer whose work they really respect they might be overly inclined to accept the advice, even if some of it might not be that great in light of the author’s particular goals for the paper. In contrast, if the review came from someone who is not well known, or not well respected by the author (or in general) then the author might dismiss even very useful advice.

    Implicit biases and status biases might also play a role in which advice gets serious consideration and uptake and which advice is dismissed.

  7. I’m sympathetic with Matt’s concern. Even if information is only revealed of the referee to the author, wouldn’t there be anxiety about backlash for a negative referee report? I would think especially those from underrepresented groups would be reasonably anxious about a negative report.

  8. But Sally, junior scholars need peer-reviewed publications, as authors, much much more often than they are tasked to be the referees. I gotta say, as a member of an underrepresented group, I was a walking imposter syndrome on the tenure-track, and I was routinely floored into not even trying when I got a devastating referee report. This isn’t good. And I know I wasn’t alone. The possibility of backlash from a well-written, respectful, negative report by a tenured professor does not seem as weighty to me as the definitive finding that the current system enables and structurally supports egregious and crappy behavior by referees toward scholar-authors who are inequitably vulnerable to the egregious behavior.

    Re: Matt’s “it doesn’t seem unreasonable, especially very early in one’s career (or before one’s career has really even started) to not want to take risks in annoying “important” people by rejecting their papers or criticizing their arguments,” right, agreed, which is why tenured people can and should (and do) take on the vast majority of peer reviewing. Again, to use my parallel, I was somewhat hesitant to write a book-review as a junior scholar for that reason; my name was on a critical assessment of someone’s work, and I did so hoping it wouldn’t come back to bite me. You provide excellent reasons why the tenured should do most of the heavy lifting.

  9. While anonymity might enable egregious behavior, I wonder if the problem is even bigger than that, Kate. Specifically, I wonder if *philosophy* enables egregious behavior. I certainly can imagine some folks being perfectly fine attaching their names to nasty reviews–they may even pride themselves on the cleverness of their nasty comments.

    But perhaps not only releasing the author’s name *after* revisions are submitted as Bakka suggests, but also publishing the referee report *alongside* the article with the referee named as the author of the report would make some headway while simultaneously dealing with the concerns raised by Bakka, Matt, and Sally. Not knowing who wrote the report would take care of implicit bias on the part of the author and knowing the report would be published alongside the finished article might take some of the nastiness out of the reviewer (“will my clever nastiness appear so clever alongside a paper that was deemed worthy of publication?”). Also, if the name of the referee is only released after the revisions have been worked through, perhaps there would be less to fear in terms of backlash. I know I feel very differently about my referee reports after I have sat with them and worked through my paper again than when I first read them. An added bonus: those of us who sweat over writing our referee reports would get some credit for the work we put into them (it takes serious philosophical work to get inside someone else’s argument and figure out how to make *their* argument stronger!) AND publishing the reports with the paper would remind us all that thinking well is a collaborative endeavor (and not the work of a single individual).

  10. Bakka, the Lee & Schunn (2011) paper supports your speculation that philosophy, in particular, tends toward the vicious, yes! Your thoughts on the credit-for-labor involved in peer-reviewing resonate with mine, and I’ve wondered more than once if the answer isn’t — ultimately — the death of journals in favor of something like online professional workshopping mediated by a site-editor. Public vetting may do more to advance good philosophy than our semi-antiquated journal processes. It strikes me too, when I read Hilda Bastian’s discussion of the reasons for peer-review in biomed pubs, that maybe peer-review doesn’t even have the same applicability in Philosophy, I mean, maybe it’s just the wrong mechanism, depending on what it is we think we’re doing here.

  11. Restricting the job of reviewing to the tenured would exclude people who work in institutions which don’t have a tenure system. Outside the USA, that’s a lot of institutions.

  12. billwringe, agreed, which is why I wouldn’t (and didn’t) recommend its restriction.

  13. In an age of social audit and accountability posturing, it’s strange that arguments for masked review terrorism and secret voting anonymity are intimately held high, with weak justification. Recall the SOCIAL TEXT scandal for a more momentous trophy of a review or no review nonsense. With double or triple review procedures, how many pathbreaking, originally free and robust papers are coming out, and from where? Method doesn’t produce substance; method produces more methods. However, to democratize review procedures, it has to be fully open to be fair (the argument that interests may trump knowledge is an external argument and could be tempered only through public mechanisms of answerability); full justification and authentication have to be asked for, and the author has to be allowed to challenge the report if s/he is so willing (CSSH once used to offer this opportunity). The editors then have to decide on the force of the better argument, again with end-justification.

  14. The hope of eliminating biases is an important reason for supporting an anonymous review process, insofar as that actually leads to better outcomes. But here is an interesting thought to consider. What if the egregious behavior fostered by anonymous review, simply by preventing many people from otherwise justly publishing and making everyone miserable, actually ends up preventing more less-advantaged people from publishing than the implicit/status biases that risk occurring without it would? Moreover, knowing identifying information would help socially conscious journals improve representation, and would help the greater community measure more accurately when and how they fail to do so.

  15. Reblogged this on peakmemory and commented:
    “anonymized peer-review seemed to bring out something vindictive in almost half of referees”
    I certainly have had that experience. The peer review system is in need of serious reform. Another emerging issue is how peer review is counted as part of academic workload. If your university gives you little credit for serving as a peer reviewer, then you are essentially engaging in unpaid labor. The incentives for being a reviewer are declining. One possibility would be to create a network of certified peer reviewers who would be appropriately compensated and could ensure greater diversity.

  16. Thanks for the comment, jecgenovese. I don’t see the labor as unpaid, or rather, I’ve always thought of it as indirectly compensated by the salaries of myself and my colleagues, which we collect in part on the understanding that we will publish in journals and anthologies; in a salaried structure where peer-reviewed publication is expected, some refereeing must therefore be done. Hm. I’ll have to think about that. Maybe yours is the right attitude and mine is misguided. Or perhaps we’re both right simultaneously, which is entirely possible!

  17. Kate, thanks so much for raising this and being honest about your negative experiences. Of which I’ve had my share too, and I know all my colleagues have too. Things like a one-line rejection, papers being rejected with no explanation at all, papers rejected for not being on the right topic in the first place… I’ve seen lots of crappy behaviour from other referees too when I was a peer-reviewer and have got to see the other reviewers’ reports (I do an amount of interdisciplinary stuff and some journals, e.g. in politics, sociology, routinely use 4 or even 5 reviews). Reviewers failing to make a single constructive point, and saying the author needed to go off and discuss a bunch of authors’ work which wasn’t relevant to the topic at all, I’d say have been the two most common flaws. But not confined to philosophy, although I’m certainly convinced that philosophy could well be especially bad.
    I have some sympathy for the idea of releasing reviewers’ names to authors. For one thing, it would double up as a way of giving the reviewers a level of ‘credit’ so to speak if they do a good job, since word would presumably get around. It seems to me that there is mounting evidence from various quarters that the present system is seriously flawed. In biology, I believe, there’ve been studies finding that if errors are inserted into a paper the vast majority of referees don’t even notice it.

  18. Thanks for your thoughts, Alison. The overlooking of errors in the biology studies is interesting, because it suggests that to those referees, the purposes of refereeing are not perceived to be corrective.

    I’ve gotten messages online and offline from friends and colleagues strongly opposed to publication of referees’ names, and their arguments are often compelling. Coupled with Bastian’s findings, they move me to think refereeing, if it continues long into the future, should be a completely different sort of thing than it currently is.

  19. I think one of the worst features of peer review is the hostility that new or mildly unusual thoughts can get, whether they are anonymous or not. I initially found it impossible to publish anything in my major area of interest, metaphysics. So I turned to publishing textual footnotes from my DPhil thesis; these were in the history of philosophy and I could use textual points as something like facts. Even there, the sailing was rough. My first piece received the referee comment, “too implausible to publish.” The editor also said he thought it was implausible, but worth saying. He hoped it would be soon refuted. Instead it became something of a standard point in Hume scholarship for a short time, though not always attributed to me. It really hardly seems worth submitting to journals, at least for me. The last time I tried, my suggestion that Hume was heavily influenced by an Aristotelian-Thomistic picture of representation was accepted by one referee, met with hostile sarcasm from another. A third referee remarked “I don’t get it.” Really, who needs this?

  20. Ugh. I’m so sorry, Anne. I worry about exactly this problem–if what you are saying is innovative (or even just feminist) there can be an implicit bias against the article that cannot be overcome by anonymous review. The last time I did a review where “I didn’t get it” I wrote to the editor saying that they should take my review with a grain of salt because I felt I was missing the importance (or “hook”) of the overall point and perhaps someone more thoroughly immersed in the same literature as the author would be more adept at getting inside the argument. It really bugs me when philosophers assume there is something wrong with an essay or idea that they “don’t get” instead of truly wondering what it might be that they aren’t getting.

  21. There seem to be a number of different issues and concerns bubbling up in this discussion. Just to pick out a couple: receiving a one-line or zero-line (!) rejection is, of course, poor practice and unhelpful – not just to the author but to the editors. But not receiving constructive advice? Surely we’re not expecting journals to be the equivalent of ‘finishing schools’?! If a referee gives me some such advice, I always take that as a bonus but to insist on it seems to demand too much.

    And from an editor’s point of view – it’s hard enough finding referees (and yes, we use all kinds of search mechanisms and go out of our way to ask early career folk etc), without putting people off by revealing who they are (some of our refs don’t even want to be identified in the annual end-of-year thanks!).

    And to end on a more positive note, here’s a blogpost on how to be a decent referee, from the indefatigable assistant editor of the BJPS, Beth Hannon:

    http://thebjps.typepad.com/my-blog/2015/03/howtoreferee3.html

    (in particular what she says under point 3 might be useful).
    cheers,
    Steven

  22. Hi, Steven,

    Thanks for this link to Beth Hannon’s blogpost on how to be a decent referee! Lots of good recommendations there!

    I am curious, though, what things have been said here that warrant the description of “expecting journals to be the equivalent of finishing schools,” which I find to be an odd choice of metaphor, perhaps even (unintentionally?) a bit loaded given the association it makes between women and frivolity (again, not saying you meant to make this association, just pointing out how it resonates). People are objecting to: hostility, not actually engaging a paper’s argument, off the cuff evaluations with no reasoning in support, and unconstructive snarky comments. If an editor deems it worthwhile to send a submission to a referee (which is to say, the piece fits within the scope of the journal and is not egregiously under-prepared for review), it seems reasonable to expect that a referee can sum the argument the paper is attempting to make (as Hannon recommends) and point out places where the argument doesn’t work, could use further support, ought to engage relevant literature, etc. with reasons supporting these assessments and in a tone that is not super dismissive/snarky.

    Perhaps you are objecting primarily to the language in Cognition’s policy, suggesting referees “provide constructive advice on how to better the presentation of their research”? Since Cognition appears to be primarily a science journal (and indeed presents its policy in terms of forwarding good science), it’s hard for me to know whether this is going overboard (I am a philosopher, not a scientist), but it seems reasonable (from an outsider perspective) that if a set of authors have produced good scientific data but haven’t presented it as strongly as they could that it serves the journal and the scientific community to provide criticism to this effect without bashing the scientific work itself (which seems the point of the policy statement). In any case, that is the only place where I see anything in this thread (of over 20 comments) that appears to demand more than I would normally expect. But was there something else in this thread that you thought was expecting too much of referees?

  23. Thanks GP and apologies – I should’ve been more aware of the resonances associated with ‘finishing school’ (I was thinking more of the kind of place where I taught physics and maths during my PhD which was also called a ‘crammer’, designed to get the kids of (typically) the rich up to snuff for A- and what used to be O-levels). And you’re right, it was the excerpt from Cognition that I was reacting to. Although I agree this might be more appropriate for data-based papers, note that Jennifer was suggesting it might be used as a model for BPA/SWIP guidelines, and it’s in that context that my concern was raised. Basically – as Beth indicates in her blogpost – we (me, Michela, Beth) see the job of the referees as helping us, via the filter of the Associate Editor’s recommendation, in making our decision, and not, or at least not primarily, in helping the author to hone her arguments or otherwise improve the paper. If the referee can do that too, fantastic but we’re all so pressed for time these days and, again, it can be so hard to find referees sometimes, that we see such help as an optional extra and I would be uncomfortable myself with a set of guidelines that set it closer to the heart of what a referee is expected to do.

    But of course I absolutely agree that hostility, off the cuff comments, unconstructive snarkiness etc should be avoided – not least because they are utterly unhelpful to editors as well!

    cheers,
    Steven
