With millions of people’s biomedical data at hand, solutions to long-standing medical puzzles are tantalizingly close. Such large amounts of data are enabling researchers to learn more about specific diseases—even very rare ones—as well as how diseases present differently across individuals, which could inform precision medicine efforts.
But for all their promise, these data come with the responsibility to protect people’s privacy.
Biomedical data are stored in different repositories and managed by different groups. Combining data may allow for stronger analyses and could uncover new insights, but pooling separate datasets would both undermine the agreements made with those who shared their data and put their privacy at risk. Researchers must therefore find ways to unlock the potential of this information while upholding data security.
Hyunghoon (Hoon) Cho, PhD, assistant professor of biomedical informatics and data science at Yale School of Medicine, is on the case. Cho’s team harnesses cryptographic, computational, and biomedical knowledge to create faster and more accurate research tools without compromising privacy.
We spoke with him about biomedical data security risks, working across data repositories to gain insight into health conditions, and his recent study on the topic published in Nature Genetics.