Over the past several decades, the lack of reproducibility in biomedical research has become increasingly concerning, particularly in the drug development process.
The evolution of scientific research
The study of science has undergone a dramatic transformation, from work originally conducted by Aristotle, who is often considered the first scientist, to research conducted today by hundreds of thousands of scientists worldwide. The number of scientific and engineering fields and subfields has also expanded significantly, to more than 230 distinct disciplines.
As discoveries have accumulated across these disciplines, the volume of published literature has grown enormous. In 2016, the most recent year for which data are available, almost 2.3 million scientific and engineering research articles were published worldwide.
Defining reproducibility and replicability
Despite these dynamic changes, the need to critically assess the accuracy of scientific claims remains essential. However, different scientific disciplines and institutions use the terms ‘reproducibility’ and ‘replicability’ inconsistently.
In general, any experiment performed by scientists, along with the subsequent data analysis, should be described in precise detail. This description should allow other scientists with sufficient skill to follow the same steps described in the published work and reach the same results within the margins of experimental error.
The replication crisis in biomedicine
Over the past several decades, concern has been growing over the low success rate of replication studies in the social, biological, and medical sciences. In social psychology, for example, rigorous replication attempts have failed to confirm many previously published findings. A large-scale replication project in 2015 repeated 100 experiments from three high-ranking psychology journals, only to find that just 30% to 50% of the original findings were observed in the repeated studies.
Several factors are believed to contribute to the reproducibility issues encountered in biomedical research. For example, the interpretation of experiments often relies on probabilities (P-values) below 0.05, the conventional threshold for statistical significance. While some scientists have called for this threshold to be lowered from 0.05 to 0.005, others argue that lowering it could worsen, rather than improve, the replication crisis by increasing the rate of false-positive findings in the published literature.
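As a purely hypothetical illustration of how the choice of threshold changes what gets declared "significant", the sketch below runs a simple two-sided permutation test on two simulated groups (the data, group sizes, and effect size are invented for illustration and do not come from any real study) and checks the resulting P-value against both the 0.05 and 0.005 cutoffs:

```python
import numpy as np

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([group_a, group_b])
    observed = abs(np.mean(group_a) - np.mean(group_b))
    n_a = len(group_a)
    count = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels and recompute the difference
        rng.shuffle(pooled)
        diff = abs(np.mean(pooled[:n_a]) - np.mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # +1 correction avoids reporting an impossible p-value of exactly 0
    return (count + 1) / (n_permutations + 1)

# Simulated control and treated groups (illustrative values only)
rng = np.random.default_rng(42)
control = rng.normal(loc=0.0, scale=1.0, size=30)
treated = rng.normal(loc=0.6, scale=1.0, size=30)

p = permutation_p_value(control, treated)
print(f"p = {p:.4f}; significant at 0.05: {p < 0.05}; at 0.005: {p < 0.005}")
```

A result can clear the 0.05 bar while failing the stricter 0.005 bar, which is exactly the gap the proposed threshold change targets; the trade-off debated in the literature is whether tightening the bar improves reliability or merely shifts incentives.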
Beyond statistics, incomplete descriptions of experimental methods and environmental conditions have also been suggested to prevent studies from being reproduced. Many scientists believe that methods sections are not detailed enough to allow others to recreate the original setup accurately. This lack of information forces replication teams to spend many hours identifying the protocols and reagents used in the original study.
The biology itself may also contribute to replication failures. For example, studies investigating the effect of a drug in animal models may yield different results depending on the metabolic or immunological state of the animals at the time of the experiment.
Redefining what is reproducible/replicable
To address some of the challenges surrounding the replication crisis in biomedicine, Brian Nosek and Timothy Errington of the Center for Open Science in Charlottesville, Virginia, recently proposed an alternative definition of replication intended to apply across all scientific disciplines.
Nosek and Errington define replication as a study for which any outcome would be considered diagnostic evidence about a claim from prior research. By reducing the emphasis on the operational details of the original study, this definition allows scientists to focus instead on how to interpret all possible experimental outcomes.
Nosek and Errington take this definition further with two conditions that must both hold for a study to qualify as a replication: outcomes consistent with the prior claim would increase confidence in that claim, and outcomes inconsistent with it would decrease that confidence.
Under this approach, any outcome consistent with a prior claim would be deemed a successful replication and would increase confidence in the underlying model. Conversely, unsuccessful replications would reduce confidence in the model and help scientists decide whether it should be revised or discarded altogether.
Conclusion
While this definition aims to make replication more relevant for advancing scientific knowledge, Nosek and Errington also recognize that reasoning biases could lead scientists to too quickly declare replications definitive successes or failures.
Any remaining uncertainty about the status of a claim should be met with what they consider science’s most effective solution: another round of repeated experiments. Overall, replication should not be treated as an exact, mechanical exercise; rather, it should be approached as a way of identifying the conditions sufficient to confront prior claims with new evidence.
References
- Understanding Reproducibility and Replicability [Online]. Available from: https://www.ncbi.nlm.nih.gov/books/NBK547546/.
- Plesser H. E. (2018). Reproducibility vs. Replicability: A Brief History of a Confused Terminology. Frontiers in Neuroinformatics, 11, 76. doi:10.3389/fninf.2017.00076.
- Nosek, B. A., & Errington, T. M. (2020). What is replication? PLOS Biology, 18(3), e3000691. doi:10.1371/journal.pbio.3000691.
- Wood, S. M. W., & Wood, J. N. (2019). Using automation to combat the replication crisis: A case study from controlled-rearing studies of newborn chicks. Infant Behavior and Development, 57, 101329. doi:10.1016/j.infbeh.2019.101329.
- Williams, C. R. (2019). How redefining statistical significance can worsen the replication crisis. Economics Letters, 181, 65-69. doi:10.1016/j.econlet.2019.05.007.
- Hunter, P. (2017). The reproducibility “crisis”. EMBO Reports, 18(9), 1493-1496. doi:10.15252/embr.201744876.