Theme 1: Machine Learning and Algorithms for Biology and Medicine

Yale researchers have developed many widely used computational methods that analyze high-throughput data, such as single-cell sequencing data, to provide start-to-finish analytical ecosystems for large-scale biomedical datasets. These data are generated from experiments involving a wide range of systems, including infection, vaccination, allergy, autoimmune disease, aging, cancer, mental disorders, immunotherapy, hematopoiesis, cell cycle genomics, protein-protein interactions, human microbiota, and population genetics. Models using methods such as manifold learning and deep learning have been developed for supervised and unsupervised algorithmic approaches to process and visualize data, understand disease progression, characterize phenotypic diversity, deconstruct evolutionary relationships, and infer causal mechanisms. This theme will leverage the strength of Yale researchers in developing machine learning tools to address significant problems in biology and medicine.

Theme 2: Multi-omics Analytics

Advances in -omics technologies, such as genomics, transcriptomics, proteomics, metabolomics, and immune repertoire, have begun to enable personalized medicine at an extraordinarily detailed molecular level. Multi-omics data are increasingly useful, and methods for analyzing them are under active development. Yale is a world leader in developing and enabling -omics technologies for biomedical research and has been engaged in many national and international programs, such as BrainSpan, ENCODE, modENCODE, the 1000 Genomes Project, PCAWG, the exRNA Consortium, IMPACC, the Human Immunology Project Consortium (HIPC), and the Center for Mendelian Diseases. Yale researchers have developed computational approaches to process large-scale multi-omics data, as well as data integration methods. This theme will focus on further developing methods and tools for the analysis of multi-omics data for both basic and translational research.

Theme 3: Moving from Genetic Association Signals to Causality and Function

A central problem in genetics research is deciphering how genomic variation affects the function of genes and results in disease or altered response to treatment. While results from genome-wide association studies (GWAS) offer some insights into the genetic basis of common diseases, innovative methods will allow us to effectively integrate diverse data types and sources of information to identify functional genes and variants and understand how they shape clinically relevant phenotypes. Yale researchers have developed and applied methods to address the major challenges in post-GWAS analysis, such as how to move from a genetic association signal in a chromosomal region to disease-associated genes and causal variants, as a step towards understanding the underlying disease process. Fine mapping, sequencing, functional studies, and other approaches have been used to find the causal variants involved in complex diseases. We will explore different computational and statistical approaches to integrating diverse data sources and biological knowledge in order to interrogate causal relationships between genetic variation and the functional phenotypes that underlie the pathophysiology of disease.

Theme 4: Genomic Health and Biomarkers

Individuals exhibit substantial heterogeneity due to genetics, environmental factors, and life histories. These differences can influence not only the onset and trajectory of diseases, but also treatment efficacy (for example, how drugs are absorbed and metabolized in the body) and the response to preventative measures like vaccination. Yale researchers have been developing resources and tools to identify biomarkers that account for differences between individual patients. One example is the Generations Project, which aims to recruit 100,000 patients and, with the help of the Yale Center for Genome Analysis, use genomic information to find the right drug for the right patient at the right time. Another example is the development of epigenetic clocks that can be used for risk stratification. Many biomarkers, such as proteins and metabolites, offer valuable information for patient selection, monitoring disease onset, prognosis, pharmacodynamics, treatment effect, safety, and other clinical outcomes. There is great interest in biomarker-driven approaches that assess the pharmacologic response to a therapeutic intervention and predict treatment efficacy more quickly than conventional clinical endpoints, thus accelerating product development. Yale researchers have been active in developing methods for biomarker discovery from -omics and other data. This theme focuses on the development of computational and statistical methods to model and mine rich data sources (e.g., -omics data) to discover, validate, and classify promising biomarkers.

Theme 5: Electronic Health Records and Digital Health

Yale researchers have extensive experience building informatics infrastructure for clinical research and conducting research on issues such as data integration and the management of clinical vocabularies used in clinical research databases. VA research projects span a broad range of informatics domains. Real-world data (RWD), such as electronic health records (EHRs), can be used to understand the effectiveness of therapies in patient sub-populations. Digital health promises to revolutionize healthcare delivery, optimize personalized and precision medicine, and offer new tools for drug and diagnostic development. Yale researchers have been developing technologies and applications of biosensors, mobile devices and wearables, the Internet of Things, mobile health platforms, artificial intelligence, and digital biomarkers to quickly expand our research into patient monitoring and disease management, point-of-care diagnostics, and digital endpoints in clinical trials. This theme will focus on designing and developing solutions to facilitate health informatics and digital health through annotation and analysis of rich RWD, EHRs, and digital health data in the context of -omics and other data.

Theme 6: Biomedical Imaging

Interdisciplinary Yale research teams are leaders in developing novel imaging technologies and analytical tools for diverse imaging modalities, including segmentation of deformable objects, image registration, measurement of neuroanatomical and cardiac function, strategies to track motion over time, image-guided neurosurgery, and database search tools using images. Developing these capabilities will allow more accurate characterization of disease status, shed further light on genotype-phenotype associations, and enable novel phenotype analysis. Prospective data modalities and disease areas may include fMRI and other brain imaging for neurological and psychiatric disorders, cardiac imaging for cardiovascular conditions, whole-body MRI, whole slide images, liver images, and retinal images for diseases such as obesity, nonalcoholic steatohepatitis, and retinopathy, among others. This theme focuses on the development of computational methods for the analysis of multi-modality imaging data.