Ira Hall, PhD

Professor of Genetics, Director of the Yale Center for Genomic Health

About

Biography

Dr. Hall's research career spans the fields of genetics, genomics, bioinformatics and data science. He received a B.A. in Integrative Biology from the University of California at Berkeley (1998), and worked as a technician for 2 years in Sarah Hake's plant genetics group at the USDA/ARS Plant Gene Expression Center. He received his Ph.D. in genetics from Cold Spring Harbor Laboratory (2003), where his work in Shiv Grewal's laboratory established the first direct link between RNA interference and chromatin-based epigenetic inheritance. As a postdoc with Michael Wigler (2004) and independent Cold Spring Harbor Laboratory Fellow (2004-2007), Dr. Hall used microarray technologies and mouse strain genealogies to conduct the first systematic study of DNA copy number variation hotspots. As a faculty member at the University of Virginia (2007-2014), Washington University (2014-2020) and Yale (2020-present), his work has sought to understand the causes and consequences of genome variation in mammals, with an increasing focus on computational methods development and human genetics. His group has developed bioinformatics tools for variant detection, variant interpretation, sequence alignment, data processing, and data integration. He has led genome-wide studies of human genome variation, heritable gene expression variation, human genetic disorders, tumor evolution, mouse strain variation, genome stability in reprogrammed stem cells, and single-neuron somatic mosaicism in the human brain. Dr. Hall's work has been featured in Science Magazine's Breakthrough of the Year (2003 & 2007), the NIMH Director's "Ten Best of 2013" and The Scientist (2013), and he has received several prestigious awards including the AAAS Newcomb Cleveland Prize (2003), the Burroughs Wellcome Fund Career Award (2006), the NIH Director's New Innovator Award (2009), and the March of Dimes Basil O'Connor Research Award (2010). 
He has also served as an Associate Editor at Genome Research (2009-2014) and Genes, Genomes and Genetics (2011-2018).

Most recently, Dr. Hall has played a leadership role in several large collaborative projects funded by NIH/NHGRI including the Centers for Common Disease Genomics, the AnVIL cloud-based data repository and analysis platform, and the Human Pangenome Project. His current work is focused on two broad goals: (1) mapping variants and genes that confer risk to human disease, with ongoing projects focused on coronary artery disease and cardiometabolic traits in unique and underrepresented populations, and (2) developing methods for the detection and interpretation of human genome variation, with an emphasis on structural variation and other difficult-to-detect forms, and on comprehensive trait association in human disease studies.

Education & Training

Postdoctoral Fellow
Cold Spring Harbor Laboratory (2004)
PhD
Cold Spring Harbor Laboratory, Genetics (2003)
BA
University of California, Berkeley (1998)

Research

Overview

Human genome structural variation. We are interested in basic questions related to the mutational causes and proximal molecular consequences of genome variation in populations, individuals and cells. We are especially interested in structural variation (SV), which includes large (≥50 bp) copy number variants (CNVs), mobile element insertions (MEIs) and genomic rearrangements. Although SVs are few in number compared to smaller-scale variants – each human carries ~10,000 SVs compared to ~3 million SNPs – they have more severe consequences on average due to their ability to alter gene dosage, disrupt gene function, or rearrange genes and regulatory elements. However, SVs have not been assessed in most disease studies due to technical challenges. We are pursuing three lines of research in this area. First, we are developing open-source bioinformatics tools for SV detection, genotyping, annotation, and impact prediction, to enable comprehensive genome analysis at the scale of human populations. Second, we are analyzing large-scale whole-genome sequencing (WGS) datasets to characterize the landscape of SV in human populations. Knowledge of rare SV is limited relative to other variant classes, and this confounds SV interpretation in genetic studies and clinical efforts. We recently characterized SV in tens of thousands of human genomes, which produced a valuable resource for the community and revealed the contribution of deleterious SV to the rare variant burden, and we are now pursuing related work in larger and more diverse datasets. Finally, we are measuring the contribution of SV to human phenotypic variation, a hotly debated question with practical importance for the design of genetic association studies. We are directly assessing SVs in large-scale studies of cardiovascular disease (see below) and other common diseases, and we are studying the impact of genome variation on gene expression and other molecular traits across tissues and cells, as in our prior work from the GTEx project.

Genetics of coronary artery disease (CAD) and cardiometabolic traits. A major goal of our current work is to identify new variants and genes that contribute to cardiovascular disease. This is a relatively new area of research in my lab that we launched in 2016 as part of our NHGRI Center for Common Disease Genomics (CCDG) program, and is a close collaboration with Dr. Nathan Stitziel, a cardiologist at WashU. The main project is a case/control association study of early-onset CAD in ~40,000 individuals, where deep cardiometabolic trait measurements are available for many samples. Although there has been much prior work in this area using traditional GWAS, our approach has several advantages. First, we aim to comprehensively assess all forms of genome variation. Whereas standard GWAS interrogates a subset of variation using SNP arrays, our use of deep WGS allows us to study all variant classes, genome-wide, across the full allele frequency spectrum. Second, our multi-ethnic study focuses on unique and understudied populations that promise to carry novel risk alleles, including African Americans, Latinos and Finnish Europeans, each of which is informative for different reasons. African Americans exhibit high levels of genetic diversity and carry many variants that are absent in Europeans, and have not been included in most prior GWAS efforts. Admixed Latino populations carry a variable mixture of European, Native American and African haplotypes, and are also understudied. Finnish Europeans are the product of a unique population history that includes multiple ancient bottlenecks and a recent expansion, leading to an excess of deleterious low-frequency variants in Finns that provide advantages for trait mapping. Finally, given that association studies alone are often not sufficient to identify causal variants, genes and mechanisms, we are leveraging single-cell and multiomics data from relevant tissues and populations to help interpret the above studies.

Human Pangenome Project. The human reference genome is inarguably the most important and widely used resource in the human genetics and genomics field. Yet, there is a growing realization that the current reference is inadequate to support the next generation of studies because it is a linear, haploid representation of haplotypes derived from multiple individuals, primarily of European descent, and thus does not adequately represent genetic diversity in the human population. This causes ancestry bias in key genomic applications, which can propagate to clinical assays and contribute to health disparities. We are co-leading a multi-site NHGRI-funded collaboration launched in 2019 that is building a human reference “pangenome” to replace the current reference (GRCh38). Our work in this project is focused on (1) building high-quality genome assemblies from several hundred ancestrally diverse individuals using long-read data, (2) characterizing the full extent of genome variation in these assemblies, (3) representing these assemblies and variants in a pangenome graph that can be used for downstream applications, and (4) building next-generation computational tools and pipelines that leverage these data structures to enable comprehensive and unbiased genomic analyses.

Genomic data science. Beyond the work described above, we are pursuing several projects at the intersection of human genetics and data science. Two key challenges for human genetics research are sample size and data sharing. We are developing and applying methods for data aggregation, sharing and cloud-based analysis. We previously developed the “functional equivalence” data processing standard to enable harmonized analysis across genome sequencing studies, which alleviates the strong batch effects that would otherwise confound joint analyses and is now in use at most large genome centers worldwide. We are core members of the AnVIL project, a multi-site collaboration that is building a cloud-based data sharing and analysis platform that will store and provide access to vast amounts of genomic data generated by NHGRI and other NIH institutes. We are also mining data from various national and international research projects and biobanks, in order to increase sample size and provide replication for the human genetics studies described above, and to increase power for local biobanking and precision medicine efforts such as the Yale Generations Project.

Medical Research Interests

Data Science; Genetics; Genomics; Human Genetics

Academic Achievements & Community Involvement

  • Basil O'Connor Starter Scholar Research Award

  • New Innovator Award

  • Burroughs Wellcome Fund Career Award in the Biomedical Sciences

  • Newcomb Cleveland Prize

Get In Touch

Locations

  • TAC S-355B

    Academic Office

    The Anlyan Center

    300 Cedar Street

    New Haven, CT 06519

  • TAC S-360

    Lab

    The Anlyan Center

    300 Cedar Street

    New Haven, CT 06519
