It is estimated that more than $200 billion is spent on biomedical research every year. The return on that investment is too low, agreed a group of experts who gathered at the Yale School of Public Health last week to discuss reproducibility and transparency in research.
Ninety-six percent of the biomedical research literature that uses P-values reports statistically significant, and almost ubiquitously novel, results. That is not an accurate reflection of research findings, said John Ioannidis, professor and co-director of the Meta-Research Innovation Center at Stanford. The pattern is driven by what journals and universities ask of researchers, rewarding statistically significant results in a “publish or perish” mentality. As a result, young scientists are often siloed, doing their research without interdisciplinary teams that include statisticians and data management professionals. The pressure to publish leads some to cherry-pick hypotheses from existing datasets to ensure positive results. Half of all research results are never published, and what is published often does not make clear whether the analysis was exploratory or prespecified.
“Science is the best thing that ever happened to human beings,” said Ioannidis, “but we can get more out of it.” Schools of public health are uniquely positioned to train both scientists and the broader workforce in new standards for reproducibility across scientific disciplines, he added.
“Is this a problem or an opportunity for schools of public health?” asked Sten Vermund, dean of the Yale School of Public Health, in his remarks.
David Allison, dean of the Indiana University School of Public Health-Bloomington, is particularly keen on transparency and the acknowledgement of data errors. “When you’re wrong, you should say so; I learned that from a three-year-old.” Much of Allison’s work aims to bring one of his highest personal values, truth, to science. It is not just a matter of bringing one’s best effort and integrity to the work, he said, but also a matter of training and infrastructure. Clinicians and biomedical researchers often have only one or two analytical classes in their training; professional statistical expertise should therefore be built into grant allocations to prevent data errors and poorly designed studies. Furthermore, editors and authors need to be forthcoming about correcting, or retracting, articles that have been shown to be incorrect.
“People expect scientific findings to be black and white. This is the era of probabilistic medicine and that requires a different construct,” said Harlan Krumholz, Harold H. Hines, Jr. Professor of Medicine at Yale and director of the Yale-New Haven Hospital Center for Outcomes Research and Evaluation.
Krumholz believes it should be obligatory that all clinical trials be registered and their results published. Currently, only 50 percent of studies’ results have been published four years after their grants close. He advocates making funding and Institutional Review Board approvals contingent on this level of transparency and called for a declaration of principles from schools of public health.
Donna Spiegelman, professor of epidemiological methods at the Harvard School of Public Health and chief statistician for the Nurses’ Health Study, offered a set of practices that she employs to maintain the quality and reproducibility of findings from the dataset tracking health, lifestyle and nutritional factors for three generations of nurses.
Other speakers included Victoria Stodden, associate professor at the School of Information Sciences at the University of Illinois, who spoke about computational reproducibility in the era of Big Data; Joseph Ross, associate professor of medicine and health policy at Yale, who described the Yale University Open Data Access (YODA) Project, a repository for clinical research datasets; Limor Peer, associate director for research at Yale’s Institution for Social and Policy Studies, which runs a data repository for work in political science; David Jett, program director for the Division of Translational Research and director of the Countermeasures Against Chemical Threats Program at the National Institutes of Health (NIH), who spoke on the increasing weight that funders and regulatory agencies place on transparency and reproducibility; Kevin Boyack, president of SciTech Strategies, who discussed the potential to develop new research and reproducibility indicators; Kate Nyhan, Yale’s public health librarian; and Melinda Pettigrew, senior associate dean for academic affairs at the Yale School of Public Health, who opened a discussion on how these principles should be addressed in education.
Among the ideas for schools of public health to implement were: embedding principles of transparency and reproducibility into existing curricula; making data documentation part of theses and dissertations; teaching students to annotate their code and document their datasets; giving students research experience with real data in an environment that fosters high methodological standards; developing a shared curriculum on research transparency and reproducibility through a consortium of schools; expanding early-career researchers’ exposure to peer research groups and role models; improving infrastructure for data management; and, lastly, restructuring faculty promotion and incentives to foster cultures of teamwork that advance science more profoundly.
The conference was organized by Joshua Wallach, postdoctoral research fellow with the Collaboration for Research Integrity and Transparency (CRIT) and the Yale-New Haven Hospital Center for Outcomes Research and Evaluation (CORE); Vasilis Vasiliou, the Susan Dwight Bliss Professor of Epidemiology and chair of the Department of Environmental Health Sciences; and John Ioannidis.