Yesterday the journal Science published the results of the Open Science Collaboration’s effort to replicate 100 studies published in three top psychology journals (here). The results are arresting: overall, replication effects were half the magnitude of the original effects, and only 36% of replications had statistically significant results. The results were particularly bad for social psychology, for which only 14 of 55 studies were replicated (on the basis of significance testing).
The title of today’s coverage on Slate captured what seems to be a widespread reaction: “That Amazeballs Scientific Study You just Shared on Facebook is Probably Wrong, Study Says.” But is this really what the study says?
It’s worth reading the actual article in Science, rather than just the headline. For example:
- Almost none of the replications contradicted the original studies. Instead, the effects in many of the replications were significantly weaker than the original effects. The replication efforts therefore don’t tell us that the findings of any particular study that failed to replicate were false. Rather, they tell us that the evidence for those findings being true is considerably weaker than we might have thought.
- It appears that the best predictor of replication success for any particular study was the strength of the original findings, rather than the perceived importance of the effect or the expertise/reputation of the original research team. In addition, surprising effects were less reproducible (surprise!), as were effects that resulted from more difficult/complicated experimental scenarios.
- This is not a problem in psychology alone. It has been reported that in cell biology, only 11% and 25% of landmark studies could be replicated in two recent efforts. Moreover, there may be good reasons why social psychology studies are harder to replicate than other studies in psychology. As Simine Vazire points out (here), the phenomena social psychologists study are extremely noisy. She writes, “if we still don’t know for sure, after years of nutrition research, whether coffee is good for you or not, how could we know for sure after one study with 45 college students whether reading about X, thinking about Y, or watching Z is going to improve your social relationships, motivation, or happiness?” That said, the Science study points out other reasons why social psychology studies were particularly unlikely to replicate: social psychology journals have been particularly willing to publish under-powered studies with small participant samples and one-shot measurement designs.
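For readers who want to see how under-powered studies plus a preference for significant results can produce this pattern, here is a rough simulation sketch. It is not the Open Science Collaboration's method. Everything in it is an assumption made for illustration: a real but modest true effect of 0.3 standard deviations, 20 participants per group, a normal approximation to the test statistic, and the hypothetical function name `simulate`. Only originals that reach significance in the expected direction count as "published," and each one is then rerun once with the same design.

```python
import random
import statistics


def simulate(true_effect=0.3, n_per_group=20, n_studies=20000, seed=1):
    """Simulate small two-group studies where only significant originals are published.

    All parameter values are illustrative assumptions, not figures from the Science article.
    Returns (mean published original effect, mean replication effect, replication rate).
    """
    rng = random.Random(seed)
    # Approximate standard error of a standardized mean difference with equal groups.
    se = (2 / n_per_group) ** 0.5
    z_crit = 1.96  # two-sided 5% significance threshold
    originals, replications = [], []
    for _ in range(n_studies):
        original = rng.gauss(true_effect, se)
        # Publication filter: keep only significant results in the expected direction.
        if original / se > z_crit:
            originals.append(original)
            # Independent rerun of the same design on a fresh sample.
            replications.append(rng.gauss(true_effect, se))
    # Share of replications that are themselves significant.
    rate = sum(r / se > z_crit for r in replications) / len(replications)
    return statistics.mean(originals), statistics.mean(replications), rate


if __name__ == "__main__":
    orig_mean, rep_mean, rate = simulate()
    print(f"mean published original effect: {orig_mean:.2f}")
    print(f"mean replication effect:        {rep_mean:.2f}")
    print(f"share of significant reruns:    {rate:.0%}")
```

Under these assumptions the effect is real in every simulated study, yet the published originals overstate it and most reruns miss significance. That is the sense in which a failed replication weakens the evidence for a finding without showing the finding to be false.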
There is, of course, something very unsettling about these findings. But in the big picture it seems to me that this article is a testament to science working well. (Or, maybe, like Churchill said of democracy, it is a testament to science being the worst form of inquiry . . . except for all the others.) The fact that one of the most important scientific journals has published this article is itself confidence-inspiring. Vazire quotes Asimov saying that “the point of science is all about becoming less and less wrong.” Or as the Science article puts it:
“After this intensive effort to reproduce a sample of published psychological findings, how many of the effects have we established are true? Zero. And how many of the effects have we established are false? Zero. Is this a limitation of the project design? No. It is the reality of doing science, even if it is not appreciated in daily practice. Humans desire certainty, and science infrequently provides it. As much as we might wish it to be otherwise, a single study almost never provides definitive resolution for or against an effect and its explanation. The original studies examined here offered tentative evidence; the replications we conducted offered additional, confirmatory evidence. In some cases, the replications increase confidence in the reliability of the original results; in other cases, the replications suggest that more investigation is needed to establish the validity of the original findings. Scientific progress is a cumulative process of uncertainty reduction that can only succeed if science itself remains the greatest skeptic of its explanatory claims.”
PS – good coverage from The Atlantic