About eight years ago, a doctor in Turkey examined a 5-month-old boy for “failure to thrive and dehydration.” Paradoxically, his diapers were wet, so the medical team was inclined to suspect Bartter syndrome, a congenital kidney defect that is manageable if caught early. But the standard treatments weren’t working. Baffled, the doctors sent the infant’s blood sample to Yale for a sophisticated analysis then under development called exome sequencing.
While the Human Genome Project and related work had looked at the entire genetic code and generated an unruly amount of data, exome sequencing detailed only the region of the genome—about 1 percent—that codes for proteins. The technique was pioneered by School of Medicine researchers Shrikant M. Mane, Ph.D., director of the Yale Center for Genome Analysis (YCGA), and Richard P. Lifton, M.D., Ph.D., former chair of genetics who’s now president of Rockefeller University, and their colleagues. Exome sequencing, says Mane, “gives you just about everything you need for diagnostic purposes, and quickly we knew this wasn’t a kidney problem. The doctors had been barking up the wrong tree.”
Exome sequencing revealed a mutation in the gene SLC26A3, which leads to a condition called congenital chloride diarrhea. While the condition can’t be cured, prompt salt replacement therapy helped bring the baby back from the brink. “This was the first use of the exome sequencing technique as a diagnostic tool,” says Mane. He delights in explaining that his uncle in India underwent whole-exome analysis, and that when he was trying to figure out the source of his own recent mysterious malaise, he also undertook the sequencing equivalent of a selfie. “Now, the whole world is using it.”
Since a seminal paper on this case appeared in 2009 in the Proceedings of the National Academy of Sciences, Mane and his colleagues have employed exome sequencing to zero in on the exact mutations responsible for a disparate array of ailments from severe brain malformations to unusual kinds of melanomas. The technique is rapidly becoming a key component in the toolkit that health care workers and researchers are using to achieve a long-standing dream: precision medicine.
In a presentation last February to the Connecticut Commission on Economic Competitiveness and the state legislature’s Commerce Committee, Dean Robert J. Alpern, M.D., Ensign Professor of Medicine, explained that precision medicine uses “a patient’s genomic information, environment, and lifestyle to assess a person’s risks for disease and to develop more effective and targeted treatment plans and therapies.” Doctors will be able to probe a patient’s DNA to determine in advance what will and won’t help an individual, rather than the general population. “Ideally, we’ll also be able to develop an alternative therapy that worked for the non-responders,” says Alpern. “This is true precision medicine, and while today it has had some applications, for many conditions it remains a dream. But it is a dream that will soon be realized.”
Perhaps paradoxically, becoming more precise has required School of Medicine researchers and physicians alike to adopt an ever broader, more interdisciplinary approach to their work. Investigators in the basic medical sciences, from pathologists and cell biologists to immunologists, are collaborating with a range of front-line clinicians. They’re also sharing new tools and working closely with investigators throughout the university, from physicists and chemists to mathematicians and computer scientists, to lay the groundwork for the move to a new kind of medicine.
One way this is being achieved is through the YCGA, a joint endeavor among the university, the School of Medicine, and Yale New Haven Hospital. Established in 2009, the YCGA opened its new headquarters on West Campus in May and is using the highest of high-tech genome sequencers and computers to investigate the genetics of rare inherited diseases, uncover mutations that can help doctors diagnose ailments, and—this is the ultimate hope—discover individualized ways to deal with often baffling, even intractable, situations.
A multimillion-dollar array of state-of-the-art machinery and analysis equipment has been critical in bringing costs within reason. Mane notes that the Human Genome Project, the federal effort to sequence the complete genetic code, took some 10 years and $3 billion to complete; it was wrapped up in 2003. Yale geneticist Jonathan M. Rothberg, Ph.D. ’91, FW ’93, invented high-throughput sequencers that reduced the time considerably and lowered the cost of sequencing a human genome to about $1 million. Subsequent technologies have cut the job to a few days of analysis time and a “mere” $1,000. Sequence only the exome, says Mane, and the work could be done in close to real time for approximately $275, about the cost of a routine office visit. “That’s why this revolution is starting to take place,” he says. “Sequencing has become so cheap that we’re poised to take a quantum leap in our ability to use it routinely.”
There are, however, roadblocks—some technical or institutional, others structural or philosophical—that will have to be addressed. Among them is the challenge of locating a mutational needle in the genetic haystack. “It has become relatively easy to sequence a genome, but it’s still very hard to determine the genetic cause of a disease,” says Mane. Part of the reason for the difficulty comes from the fundamental but surprising insight provided by the Human Genome Project: we simply don’t have that many genes.
“We used to believe that it was one gene, one protein, so if there were 100,000 proteins, there had to be 100,000 genes,” Mane says. “But we now know that we have only about a quarter of that number. In fact, we have fewer genes than a rice plant.”
The smaller number, however, makes life harder, not easier, for researchers. Those roughly 20,000 genes are expert multitaskers, which makes identifying all their responsibilities especially challenging. Uncovering which mutation leads to which ailment requires sequencing numerous individuals and keeping the YCGA’s pair of NovaSeq 6000s and related machinery working overtime. Building such genetic profiles and sorting the good from the bad also requires the analysis of an almost unfathomable quantity of data and the development of new techniques to mine and protect those data. That is the mandate of a newly formed entity called the Yale Center for Biomedical Data Science.
Center co-director Mark B. Gerstein, Ph.D., the Albert L. Williams Professor of Biomedical Informatics, explains that succeeding with what researchers term “Big Data” requires “real thought about standards, the uniform collection of data, the distribution of samples, and the presentation and packaging of results.” After three years of planning, Gerstein and co-director Hongyu Zhao, Ph.D., a geneticist and the Ira V. Hiscock Professor of Biostatistics, have assembled a kind of central clearinghouse for research and development on these issues, particularly cloud computing and privacy, as well as for education and bridge-building collaboration at the university, national, and international levels. “Our mission is really about connecting and coordinating the people and resources already here, and becoming a way to recruit the scientists we want to attract in the future for the Big Data initiatives we want to participate in,” says Gerstein. “We expect the center to have a very broad impact.”
Tamar S. Gendler, Ph.D., dean of Yale’s Faculty of Arts and Sciences, the Vincent J. Scully Professor of Philosophy, and professor of psychology and cognitive sciences, concurs. At the university, she explains, data science encompasses three interlocking circles that range from the most abstract—pure mathematics—to the most applied, the clinical. “What’s exciting is the often unexpected ways that the math informs the physics, which informs the chemistry and the biology and the clinical work,” says Gendler, pointing to the work of Ronald R. Coifman, Ph.D., the Phillips Professor of Mathematics. His fundamental insights enabled precise information organization methods, which, due to a collaboration with Frederick J. Sigworth, Ph.D. ’79, professor of cellular and molecular physiology, of biomedical engineering, and of molecular biophysics and biochemistry, led to remarkable enhancements in the images produced by the cryo-electron microscope. Those images can reveal the basic structure of molecules that may be important in understanding diseases and developing targeted therapies. “This device allows atomic structures to be determined from a smaller number of molecules—a millionfold smaller—compared to the more traditional method of X-ray crystallography. And the ability to obtain atomic structures with cryo-EM happened quickly—about five years from the initial theoretical math to an actual insertion into a scientific tool,” says Gendler.
Steven H. Kleinstein, Ph.D., associate professor of pathology, is using cutting-edge sequencing tools to better understand how the immune system responds to pathogenic challenge, as well as to uncover the roots of autoimmune disorders like myasthenia gravis. Kleinstein targets the body’s 100 billion B cells, a key component of the immune system, to discover the characteristics that enable each cell’s antibody receptors to recognize and fight off pathogens. “This is a powerful technique that lets us understand the dynamics of the process,” says Kleinstein. “We can use these data to reconstruct a person’s unique immunological history. This helps us understand the processes that led to a disease, or may eventually help us design better vaccines that can leverage an individual’s current immune state to get the exact response we want. Receptor sequencing is already being used as a personalized biomarker for certain kinds of tumors and can detect, with much greater sensitivity than established methods like flow cytometry, if the disease is coming back. This kind of precision medicine is no longer a pie-in-the-sky idea. We’re going to get there.”