Building a biobank … a million veterans at a time

Yale Medicine Magazine, 2015 - Spring


The exhortation to “enlist” wasn’t as dramatic as that of the famous “I Want You” poster featuring a red-, white-, and blue-clad Uncle Sam, but William Koob, 71, a seaman’s apprentice in the early 1960s, found the call almost as compelling. Koob, a bank auditor, was semi-retired and working at the VA Connecticut Healthcare System campus in West Haven when he heard about an initiative called the Million Veteran Program (MVP).

Launched by the Department of Veterans Affairs in 2011 and co-directed by a Yale faculty member and a School of Medicine alumnus, the MVP hopes to recruit a million veterans over the next five to seven years. Genetic information and electronic health histories of that “mega-cohort” of vets will reside in an enormous biobank for scientists and clinicians exploring the connections among genes, the environment, military service, and disease.

To date, said John Concato, M.D., M.S., M.P.H. ’91, professor of medicine and one of the project’s two principal investigators, nearly 350,000 veterans have signed up. Koob was one of them.

“All MVP required was giving a blood sample, allowing my VA health record to be accessed, and filling out health questionnaires—and I quickly realized how valuable this information could be,” said Koob. “It was another opportunity to serve, and I was happy to contribute.”

The project, it turns out, is not the only endeavor looking for a million good men and women. As part of his 2016 budget, President Barack Obama announced in January that he wanted to invest $215 million to create the Precision Medicine Initiative (PMI): a massive biobank—compiled from both emerging and existing databases (possibly including data from the MVP)—of genetic information and electronic medical records of a representative sample of U.S. citizens, a million strong. The PMI, said the president, could “lay a foundation for a new era of lifesaving discoveries.” (Richard P. Lifton, M.D., Ph.D., chair and Sterling Professor of Genetics, has been named co-director of the PMI.)

The president’s initiative, if funded, would allow researchers to cull genetic secrets about the origin and development of physical and mental disorders, insights that might lead to more custom-tailored and cost-effective treatments, particularly for cancer. The MVP, while its goals are similar, is targeted toward a more specific population: a group whose “can do” attitude and eagerness to enlist are typical, said Concato, who directs the VA’s Clinical Epidemiology Research Center in West Haven and has conducted extensive research on prostate cancer epidemiology and screening. “When you consider that this project is not going to benefit any of the participants directly,” he said, “their altruism is very impressive.”

None of the information an individual provides, including the fruits of the DNA samples, goes into the veteran’s personal medical record. Instead, this nationwide “big data, big science” effort will allow researchers and clinicians “to better identify those at risk of getting a disease, those who might be prone to do worse with that particular ailment, and what medications might work best as treatment,” said Concato. “This is the promise of personalized medicine, and MVP stands to greatly improve our ability to provide it.”

Exploring the genomic universe

It’s too early to know what discoveries might be made, say Concato and the other principal investigator, J. Michael Gaziano, M.D. ’87, M.P.H., a professor of medicine at Harvard. Possible long-range benefits of the program include advances in understanding the interplay of genes and such illnesses as cancer, diabetes, and heart and circulatory problems, as well as such mental illnesses as service-related post-traumatic stress disorder (PTSD).

“Just like the Hubble Space Telescope opened up ways of seeing the universe that Earth-bound telescopes couldn’t provide, we think that MVP will allow us to explore the genomic universe,” said Gaziano, scientific director of the Massachusetts Veterans Epidemiology Research and Information Center, chief of the division of aging at Brigham and Women’s Hospital, and director of the VA Boston Healthcare System’s Geriatric Research and Education Center.

The MVP is one of several similar efforts around the world, among them biobanks in the United Kingdom and China, each of which has about 500,000 participants, and at Vanderbilt University and Kaiser Permanente, which include about 175,000 and 200,000 people, respectively. The MVP, if it meets its target, will be the largest among them.

All these efforts link at least two streams of genetic and medical data; the MVP links three. The first is genomic information gleaned from different DNA sequences: general genotyping; the sequencing of just the coding part, or exome, of the genome; or, in a small number of cases, whole-genome sequencing. The second stream includes health and treatment information from each vet’s electronic medical record. The third stream, unique to the MVP, is a health and lifestyle survey that includes questions about military service, which will give researchers a better understanding of possible genetic underpinnings of conditions like PTSD that might affect people in the armed forces.

“MVP couldn’t have been built without the existing VA research infrastructure,” said Concato, who, with Gaziano, directs a staff of about 20 in West Haven and Boston, with key support from the Office of Research and Development in Washington, D.C.

One key to the program’s eventual utility is the VA’s reach: data are being collected at about 50 VA health care facilities around the country, each of which has two staff members dedicated to the MVP. Another is the availability of its pioneering electronic medical record, which can look backward and forward in time and follow vets wherever they live. A third aspect is data security—the VA made security the highest priority from the moment it started its EMR implementation. Last, recent developments in information technology, from automated biostorage data retrieval to computer analytic capabilities, allow researchers to manage massive amounts of data.

“Maybe the biggest enabling technology is having a chip that allows us to do cheap genetic testing,” said Gaziano. “We can now look at the genetic variation in about 750,000 places on a person’s genome for $75—that’s $52 for the chip and the rest for processing—and this ability to generate data, coupled with the information we’ll get from hundreds of thousands of vets, will let us start to answer questions we couldn’t have even asked before.”

One area of interest is an aging population. “We hope that MVP will be a pluripotent resource that can be used by researchers and providers to look at how to best deal with that group, from studies of frailty to understanding variations in medication-related enzyme metabolism,” said Gaziano.

At the beginning of a revolution

Like Concato, Gaziano is no stranger to big-data projects. “I’ve been working in the large cohort business since I arrived at Brigham and Women’s in 1988,” he said. He’s had leadership and research roles in the Brigham and Women’s-based Physicians’ Health Study, which has followed a cohort of more than 30,000 doctors in the United States to examine a variety of health questions, as well as the Nurses’ Health Study and the Women’s Health Study, both of which are affiliated with Brigham and Women’s and have enrolled cohorts of more than 100,000.

“While it builds on earlier work,” Gaziano said of the MVP, “it’s really something new.”

Early epidemiological studies were primarily descriptive and involved small numbers of subjects or issues, such as the relationship of smoking to lung cancer. The famous Framingham Heart Study (FHS) in Massachusetts has since 1948 tracked 5,000 adults to ask fundamental questions about the epidemiology of cardiovascular disease. The FHS used multivariate computer modeling to examine the impact of a half-dozen risk factors.

But with 20,000 to 25,000 protein-coding genes in the human genome, a cohort needs hundreds of thousands of people so that scientists can determine “how genes interact with other genes and environmental factors to contribute to health and disease,” said Gaziano. “There are enormous numbers of variables, so we need mega-cohort studies like MVP to solve the signal-to-noise problem that genetics gives us. We’re at the beginning of a very exciting revolution.”

Two “alpha” studies have begun testing the biobank’s ability to deliver useful information. Concato, Gaziano, and their colleagues have mined the database for mentally healthy vets who can serve as a control group for comparison with 9,000 vets with schizophrenia or bipolar disorder. “The pending genomic analyses can advance our understanding of the etiology of, and treatments for, two major psychiatric disorders,” the researchers wrote last year in the American Journal of Medical Genetics.

The second alpha test, co-directed by Joel Gelernter, M.D., the Foundation’s Fund Professor of Psychiatry at Yale, is using the biobank to find a cohort of vets with PTSD to compare with an MVP-assembled control group without the disorder. The hope is to identify genes that increase the risk of this often-devastating condition and to develop more effective methods of detecting and treating it.

Last year, MVP leaders issued a nationwide call for proposals, said Concato. About 20 have survived the first review round, and those approved will investigate an array of topics from heart disease and lung cancer to mental health disorders and problems with water metabolism. Funding decisions will be made later this year.

“It’s important to remember that MVP is an infrastructure project,” said Concato. “Our role is to help design the program so that researchers can use the biobank effectively, securely, and ethically, and have the right information available to ask the right questions.”

Bruce Fellman is a freelance writer in North Stonington, Conn.
