Two distinguished Yale professors, Hongyu Zhao, PhD, and Mark Gerstein, PhD, have been awarded a $1.9 million grant from the National Human Genome Research Institute (NHGRI) to advance the Developmental Genotype-Tissue Expression (dGTEx) project. This landmark initiative aims to unravel the complexities of gene expression patterns across developmental stages, providing critical insights into genetic influences on health and disease.
Zhao, the Ira V. Hiscock Professor of Biostatistics and a professor of genetics and of statistics and data science at the Yale School of Public Health, and Gerstein, the Albert L. Williams Professor of Biomedical Informatics and a professor of molecular biophysics and biochemistry, of computer science, and of statistics and data science at Yale, will lead the multidisciplinary effort. The grant is a U01 award, a mechanism intended to support cooperative research initiatives that address specific scientific areas of interest.
Unraveling the Secrets of Human Development
The dGTEx project, co-funded by several NIH institutes including the NHGRI and the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), studies gene expression patterns in tissues from recently deceased pediatric donors, ranging from neonates to adolescents. The research is expected to create a comprehensive molecular and data analysis resource and a tissue bank that will serve as a reference for understanding gene regulation in relatively healthy pediatric tissues.
By examining developmental gene expression across a broad spectrum of tissues, dGTEx aims to reveal the intricate genetic and molecular mechanisms that underpin human development.
"Our study will develop novel statistical approaches and tools and provide valuable data that will enhance our understanding of how gene expression regulation evolves from infancy through adolescence in different tissues and cell types, ultimately offering insights into a myriad of disease conditions that originate during these critical years,” Zhao said.
The project also includes a comparative analysis of gene expression patterns in non-human primates, examining developmental trajectories in species such as the rhesus macaque and the common marmoset. These comparisons are instrumental in interpreting human data and understanding the broader evolutionary context of gene expression.
Challenges and Innovations
Zhao and Gerstein intend to develop novel and robust statistical methods to tackle the inherent challenges of expression quantitative trait locus (eQTL) analysis in the dGTEx data. A major deliverable of the previous GTEx project was the identification of eQTLs, which link genetic variants to changes in gene expression. The dGTEx project's focus on pediatric tissues introduces new complexities because of the limited number of available samples, but the advent of single-cell sequencing technology also presents opportunities.
"Our proposed methods will leverage Bayesian statistical techniques to enhance the power of eQTL inference, even with fewer samples," Gerstein said. "By incorporating cell-type-specific data from single-cell sequencing and integrating this with allele-specific expression analysis, we aim to achieve a more refined and comprehensive understanding of genetic regulation at various developmental stages."
Pioneering Data Integration
One of the key objectives of the research is the integration of dGTEx data with external datasets, including those from the original GTEx project and other genomic consortia. This integration effort is crucial to overcoming sample size limitations and ensuring that the findings are robust and widely applicable. The exploration of deep learning techniques for genetic variant predictions and gene expression imputation models will further extend the utility of the dGTEx data.
"Our goal is to create a resource that can be widely used by the scientific community to investigate complex traits and diseases,” Zhao said. “By sharing QTL call sets and an allele-specific catalog via the dGTEx portal and the Analysis Visualization and Informatics Lab-space (ANVIL), we aim to facilitate broader research and collaborations."
Impacts on Health Care
The dGTEx project stands to significantly advance our understanding of how genetic variants influence gene expression differently across developmental stages. This knowledge can lead to early identification of disease risks and the development of age-specific interventions and treatments. The resources generated by Zhao and Gerstein's work will enable researchers worldwide to explore the genetic underpinnings of pediatric diseases, with the ultimate goal of more precise and effective health care solutions.
"By dissecting the genetic regulation of gene expression in developing tissues, we are opening new doors to personalized medicine and innovative therapeutic strategies for the next generation," Gerstein said.