A new study published in Proceedings of the National Academy of Sciences reports that men and women are not equally receptive to experimental evidence of gender bias in STEM settings. Ian Handley and colleagues report the results of three experiments. In the first and second experiments, men and women read an actual article abstract from a peer-reviewed scientific journal, accompanied by the article’s publication information and the first author’s full name. In the first experiment, participants were M-Turk workers; in the second, they were male and female STEM and non-STEM faculty. The abstract used in experiments 1 and 2 was from Corinne Moss-Racusin and colleagues’ (2012) PNAS article reporting gender bias in science faculty’s hiring decisions. In the first experiment of the new Handley study, men were significantly more likely than women to evaluate the abstract negatively. In the second experiment, male faculty in STEM departments displayed the same pattern: they evaluated the Moss-Racusin et al. (2012) abstract more negatively than female faculty in STEM departments did. Among non-STEM faculty, men and women gave comparable evaluations. Finally, in the third experiment, Handley and colleagues replicated the main effect using a different abstract, from Knobloch-Westerwick et al. (2013), which reports gender bias in reviews of scientific conference submissions. However, when the authors altered the abstract to report no gender bias, they found that women evaluated it more negatively than men.
This study has some obvious implications. The authors focus on the worry that no amount of evidence attesting to pervasive gender biases will be sufficient to convince skeptics, if gender biases are affecting skeptics’ assessments of that evidence.* They also discuss potential mechanisms driving these effects, in particular the idea that male faculty in STEM departments might find evidence of gender bias (perhaps implicitly) threatening (in accord with “Social Identity Theory”). More research on this is clearly needed.
What I want to consider briefly is the notion of “bias” at work in this study, and in coverage of it. David Miller, for example, describes the third experiment as showing that “women have their own biases” (here). Commenters have made similar points on Facebook. This is understandable, and is certainly true as a general point, since all human beings have biases, and women are human beings! Handley and colleagues saw a clear reversal in evaluations; when the abstract(s) reported gender bias, men were harsher, and when the abstract(s) reported no gender bias, women were harsher. The authors themselves point out that “individuals [not just men] are likely to demonstrate a gender bias toward research pertaining to the mere topic of gender bias in STEM” (3). One reason they conclude this is that the biases they detected were only relative to each other. There was no condition controlling for the effect of gender on participants’ evaluations.
However, it seems right to conclude that both men and women are biased in these particular findings only if there is no means of independently assessing the quality of the evidence in the abstracts.** If it is true, though, that gender bias is pervasive in the domains described in the study materials, then women who give positive evaluations to studies finding gender bias, and negative evaluations to studies not finding gender bias, are accurate, not biased.***
Similarly, if we presume that women (especially female STEM faculty) are more informed about research on gender bias than men, then we might give their abstract evaluations more credence.**** I’m grateful to Alex Madva for this point; he suggests an analogy: if a group of climate scientists negatively evaluated abstracts denying the existence of climate change, and a group of people who are not climate scientists rated the same abstracts positively, would we conclude that “everyone has their biases”?
Thanks to Alex Madva, Daniel Kelly, and Jennifer Saul for helpful suggestions on this post.
*Jennifer Saul has discussed similar concerns about the effects of implicit biases here.
**How might researchers at least approximate an assessment of the abstracts independent of rater-gender? Perhaps with a team of independent mixed-gender reviewers? Or with an average of all reviews, against which the ratings of men and women could be compared separately? Or simply by comparing the evaluations of abstracts by gender against the results of a meta-analysis of similar studies?
***Of course, gender bias could be truly pervasive in these domains, and it could still be the case that any one study purporting to demonstrate gender bias is of low quality. Note, though, that study participants were only asked to evaluate their agreement with the authors’ interpretation of the results in the abstract, the importance of the research, how well-written the abstract was, and what its overall quality was. If one believes that gender bias is pervasive, and reads an abstract reporting gender bias, one is likely to give positive answers to these questions. (Moreover, participants’ answers to these four questions were highly correlated, suggesting that they were answering based on an overall sense of the accuracy of the study’s findings.) Perhaps this is a limitation of the Handley et al. study. It would be interesting to find out whether asking other questions would affect the results, such as “how rigorous do you think the study’s methodology is?” or “how much do these data contribute to the overall case for finding gender bias (or its absence) in STEM fields?”
****The authors did examine whether the amount of experience a person has had with gender discrimination correlated with their evaluations of the abstracts. (These data are found in the supplementary materials.) For women, they found no correlation. Interestingly, for men, they did find a correlation. The more (“reverse”) gender bias men reported having personally experienced, the more harshly they rated the abstracts.