As big data becomes integral to many academic disciplines, research universities have found a need to upgrade both the technologies they use and the skill sets of research professionals who must organize and analyze the data. It was this need that motivated the creation of the Yale Center for Research Computing (YCRC) in 2015, says Kiran Keshav, E.M.S., the center’s executive director, and senior director of research technologies at Yale University.
The YCRC offers Yale researchers a resource for complex computing support. Located on Yale’s Science Hill, the center provides the cyber-infrastructure researchers need to do their work and guidance on how to maintain it. It also offers education and training in skills such as programming.
Before the center’s creation, Keshav says, support for computational research was decentralized. “One of the first things I wanted to do was to collocate all the staff. All the people who were effectively doing research computing support for faculty needed to be together,” he says. “Now it’s the start of a community. We’re building a one-stop shop for technology-related support for research.”
The YCRC has supported researchers in numerous ways. Alan Anticevic, Ph.D., assistant professor of psychiatry and psychology, uses computational methods combined with imaging techniques to better understand the mechanisms underlying such psychiatric illnesses as schizophrenia and addiction. Where today clinicians diagnose these illnesses using qualitative measures such as behavior, Anticevic predicts that one day they will be able to diagnose with far more precision by measuring associated brain mechanisms. He has used the high-performance computing resources at the YCRC to investigate these dysfunctional brain circuits.
The YCRC has also helped to acquire new supercomputing technology for Yale researchers. Robert Bjornson, Ph.D., senior research scientist in Yale’s Department of Computer Science and a member of the YCRC staff, recently assisted the Yale Center for Genome Analysis in securing a grant from the National Institutes of Health to replace an old high-performance computing cluster. The newly purchased cluster went online this spring, bringing an additional two petabytes of storage and a great deal more computing power for genome analysis. The new cluster was named Ruddle, after the late Francis H. (“Frank”) Ruddle, a School of Medicine scientist who famously pioneered genetic engineering.
These clusters are used by people such as Mark B. Gerstein, Ph.D., the Albert L. Williams Professor of Biomedical Informatics, who is working to identify the function of particular regions of the human genome. As sequencing the human genome becomes increasingly accessible, researchers are using the technology to better understand disease. Structural changes along the genome are prevalent in genomic diseases such as cancer.
“People in genomics were using big data before it was cool,” says Gerstein, who is also professor of molecular biophysics and biochemistry and of computer science. As a genomics researcher, Gerstein needs to handle very large datasets and organize the data in a way that will provide meaningful insights in medicine. He says that research computing support should be separate from a general information technology department, and he is glad to work with the YCRC.
“Configuring the hardware, knowing what to get, doing everything correctly in relation to federal grants and contracts—that takes quite a bit of effort on the part of everyone. The YCRC are the point people to help get those things working,” says Gerstein.
Genomics may be an obvious beneficiary of the resources at the YCRC, but research in fields such as biomedical engineering also depends on the center’s computation resources. Jay D. Humphrey, Ph.D., the John C. Malone Professor of Biomedical Engineering, is interested in understanding how blood flows through the complex vasculatures of patients with abdominal aneurysms. Using patient-specific images of aneurysms, Paolo Di Achille, a doctoral candidate in Humphrey’s lab, creates computer models that can predict where a blood clot will form within an aneurysm. This work could help clinicians decide whether or not to intervene when a patient has an aneurysm and there is risk that a blood clot will form.
Humphrey’s team uses the supercomputers available through the YCRC, such as the clusters “Omega” and “Grace.” They also use clusters in such places as Texas and San Diego through a National Science Foundation-funded consortium called the Extreme Science and Engineering Discovery Environment (XSEDE). Di Achille first learned how to use supercomputers through a YCRC-run workshop. “Having [the supercomputers] here and having some practice on them allows me to quickly adapt the workflow to clusters somewhere else,” he says.
Humphrey’s research on blood clots and abdominal aneurysms is “computationally expensive”—meaning it requires large amounts of computing power. He notes the value of having a center on campus dedicated to research computing support. “It’s more than just maintaining hardware or having the right software available,” he says. “It’s really about understanding what’s needed to do state-of-the-art computation and enabling the people who use the facility to be able to do it in an efficient way.”
Medical students should also be prepared to deal with the era of big data, according to the co-chair of the center’s faculty advisory committee, Harlan M. Krumholz, M.D., S.M., the Harold H. Hines Jr. Professor of Medicine and professor of investigative medicine and of public health. “With medicine, this [data] is the next big thing. We think discoveries are going to be accelerated by our better use of digital data.”
Krumholz says the old approach of memorizing risk factors to categorize patients may be on its way out. “I think we’re going to move toward taking all the information about you and be able to see how it affects your risk and response to disease and treatment—being able to personalize our approach in ways doctors could never memorize.”
The center isn’t just for the people who want to be more computer savvy in research, Krumholz says. “This center should be an organizing force. I think we will have been successful if this kind of training becomes an integral part to every different part of the university.”
Keshav and his colleagues hope to expand the YCRC’s training opportunities and continue hosting events. For now, the center continues to meet the growing need for technological support in research.