Yale-BI Joint Research Committee (JRC) and Fellows

Yale-Boehringer Ingelheim Biomedical Data Science Fellows

  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '21

    Topic: Explainable machine learning models for indication expansion
    Project Summary: Bringing a novel therapeutic to market is a time-consuming and expensive endeavor. Many promising candidates identified through virtual screening and preclinical studies fail in clinical trials due to poor efficacy or lack of improvement in the standard of care. Instead, a target-centric approach, based on the repurposing of safe compounds with known mode of action (MoA), offers many advantages. First, alternative indications (diseases) with high medical need and market potential are identified for a given target, by linking information about the MoA, disease state, and patient populations obtained from large public and proprietary datasets. Subsequently, suitable disease models are chosen (e.g. by literature mining), and therapeutics are fast-tracked for approval, thus limiting the risk of failure. The aim of this project is to develop and integrate novel machine learning methods, with emphasis on explainability of the predicted outcome, to improve the overall performance of Boehringer Ingelheim’s drug repurposing pipeline.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '21

    Topic: A bioinformatics journey: from EHR to genetic data
    Project Summary: As a postdoc at the Yale Center for Biomedical Data Science, Xiayuan will work on high-throughput biomedical data, including electronic health records (EHRs) and genetic data. His research will focus on extending state-of-the-art machine learning approaches in health using EHRs, developing machine learning algorithms for drug discovery and adverse drug effects, and applying statistical methods to investigate challenging problems in genetic data. Based on his PhD research, he believes family history linked EHRs succinctly encompass shared genetic, epigenetic, and environmental features that enhance the analysis of human disease. He plans to apply machine learning algorithms in the healthcare domain, including disease risk prediction, precision medicine, and clinical applications using family history linked EHRs. On the genetic side, his research is devoted to addressing challenging problems in single-cell RNA sequencing data and developing innovative statistical models for analyzing the impact of genetic variants in human disease.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '21

    Topic: Identify brain functional subnetworks and associated genetic vulnerabilities with comorbid mental disorders
    Project Summary: Emerging evidence indicates that 1) boundaries of psychiatric illnesses are not sharp with behavioral traits and brain function; 2) individual behavioral differences are linked to variability in functional brain networks. These findings suggest that the spectra of symptom profiles observed in patients may arise through discernible patterns of the functional connectome, with the disturbance of individual systems preferentially contributing to domain-specific, but disorder-general, impairments. Meanwhile, gene transcription could strongly correlate with network topography, potentially driving comorbidity between symptomatically related disorders. Hence, our overarching goal is to identify brain functional network fingerprints, link them to dimensional symptom profiles, and characterize the associated genetic underpinnings through a suite of powerful, biologically plausible and computationally efficient statistical models. Successful completion of this research will discover network-level biomarkers and associated genetic vulnerabilities, and facilitate the development of novel treatments and future classification schemes.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '22

    Topic: Multi-omics Analytics and Emerging Technologies
    Project Summary: Graph genome-based models can characterize genetic variation across both microbial organisms and diverse regions of the human genome. We aim to investigate whether these models can also be used to characterize the extensive genetic diversity observed within immunogenetic sequencing datasets (e.g., B cell receptor (BCR) repertoire sequencing). We will develop graph-based approaches to 1) analyze high-throughput immunogenetic sequencing (e.g., BCR repertoire profiling) and 2) perform genetic association tests focused on phenotypes related to the host immune response to vaccines, infection, therapeutics developed by Boehringer Ingelheim, and autoimmune diseases. We will also assess whether graph structure/topology is clinically informative and, by annotating regions across the graph using external multi-modal data, assess whether annotated genome graphs can facilitate immunogenetic-focused genome-wide association studies.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '22

    Topic: Multimodal network-based cancer heterogeneity analysis
    Project Summary: In the past decade, the maturity of profiling techniques has led to the discovery that previously defined cancer types/subtypes, which are based on pathological images, can be further classified into sub-subtypes. These sub-subtypes have different omics landscapes and clinical paths and demand different treatment strategies. Accordingly, the first guiding principle of this study is that effectively integrating multimodal data, in particular pathological imaging and multi-omics data, can lead to more refined cancer heterogeneity structures. In heterogeneity analysis, incorporating the interconnections among variables can further reveal more subtle cancer heterogeneity structures. As such, the second guiding principle is that utilizing cutting-edge methods to incorporate interconnections can further improve cancer heterogeneity analysis. Our overarching goal is to develop more effective statistical learning methods for cancer heterogeneity analysis, which can deepen our understanding of cancer biology and facilitate more personalized treatment.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '22

    Topic: Multi-omics Analytics and Emerging Technologies
    Project Summary: Recent efforts have used CRISPR knockout or activation screens to identify targets that boost T cell effector function and further leverage the immune killing function. However, manipulating a single gene may still fail to overcome resistance, owing to genes that can compromise its function in immune cell signaling. Paralogs derived from the same ancestral gene have been reported to show synthetic lethal interactions and might function jointly in augmenting cancer immunity. In this project, we will establish a computational model for predicting paralog pairs that act together in cancer immunotherapy, by integrating genome-wide CRISPR screens and perturbation molecular profiles from the Cancer Dependency Map (DepMap) and Connectivity Map (CMap) with cancer datasets from patients receiving immunotherapy. The outcome of this research will deliver in silico tools for screening paralog pairs that can boost immune response, which could inspire effective combination therapeutic strategies toward precision treatment.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '22

    Topic: Mechanism-based identification of biomarkers and intervention targets from multi-omics datasets
    Project Summary: Recent technological advances allowing for the global characterization of genomic variants, transcription profiles, epigenomic profiles, and protein markers, often down to the single-cell level, have provided unprecedented insights into the homeostatic and perturbed states of biological systems. Analyzing these vast multi-omics datasets to obtain clinically actionable biomarkers and promising intervention targets remains a formidable challenge. Prediction of the individual immune response quality and quantity in health and disease is one quintessential case. Our proposed research will combine statistical analyses with causal inference and multi-scale mathematical modeling to develop a multi-omics data analysis pipeline that (a) provides mechanistic insights into the underlying biological process, (b) captures the diversity seen across individuals, and (c) identifies complex features and rules that are predictive of the response to perturbations. We will apply our approach to datasets characterizing the vaccination response to identify predictive biomarkers and intervention targets to improve vaccine efficacy.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '23

    Topic: Moving from GWAS to Causal Genes and Variants
    Project Summary: The size and ethnic diversity of emerging sequencing datasets are growing rapidly. Combining these data with emerging single-cell omic datasets and AI models for predicting gene activity (e.g., expression) offers an unprecedented opportunity to uncover the causal genes and cell types that drive human traits and disease. However, in emerging sequencing datasets, the strong, often perfect, linkage among associated ultra-rare variants can yield an unwieldy list of candidate causal variants. This problem is exacerbated by the presence of multiple causal variants (allelic heterogeneity) and migration events, both of which are more common in ethnically diverse datasets. This fine-mapping enigma motivates our current research. Using novel statistical methods, we aim to develop an automated yet interpretable approach that does not seek to isolate causal variants, but rather to directly identify target genes and pathways from phenotypic and single-cell xQTL data across different cohorts.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '23

    Topic: Moving from GWAS to Causal Genes and Variants
    Project Summary: Genome-wide association studies have revealed many disease-associated variants, but identifying causal variants is challenging. This proposal aims to address this by developing deep learning models to predict regulatory effects of variants based on massively parallel reporter assay data. By exploring how these functional predictions relate to genomic signatures of natural selection across evolutionary timescales and disease categories, it will assess when evolutionary constraint can serve as an informative proxy for regulatory function in interpreting GWAS. The predictive models and constraint maps will then be applied to nominate likely causal variants from GWAS of autoimmune and metabolic diseases. This research will nominate candidate causal variants across diverse human ancestry groups, advancing understanding of the precise regulatory mechanisms disrupted in common human diseases. Overall, this proposal develops and applies integrative genomic methods to elucidate complex disease genetics.
  • Yale-Boehringer Ingelheim Biomedical Data Science Fellow '23

    Topic: Multi-omics Analytics for Personalized Medicine
    Project Summary: First, we will develop a framework that integrates spatial transcriptomics, single-cell RNA-seq, single-cell ATAC-seq, high-resolution imaging, and single-cell targeted protein data to identify tissue microenvironments. By utilizing network-based variable selection and regression of cell morphology, we will aggregate selected features using cell adjacency matrices to cluster tissue areas into microenvironments. This multi-modal integration promises to uncover new microenvironment characteristics for targeted therapeutics. Second, we will focus on identifying disease progression-associated changes in tissue microenvironments. Using known biomarker genes, we will differentiate microenvironments and assess disease severity and progression. We will analyze changes in cell compositions, expression profiles, gene regulatory networks, and cell-cell communication networks. Deconvolved spatial transcriptomics and causal network approaches will aid in constructing gene regulatory networks, while Connectome and graph attention network methods will establish cell-cell communication networks. Correlations with disease progression will be examined independently and combined using neural networks to gain a comprehensive understanding for precise therapeutic development.

Yale Mentors

  • Associate Professor of Genetics and of Computer Science

    Smita Krishnaswamy is an Associate Professor in Genetics and Computer Science. She is affiliated with the applied math program, computational biology program, Yale Center for Biomedical Data Science and Yale Cancer Center. Her lab works on the development of machine learning techniques to analyze high-dimensional, high-throughput biomedical data. Her focus is on unsupervised machine learning methods, specifically manifold learning and deep learning techniques for detecting structure and patterns in data. She has developed algorithms for non-linear dimensionality reduction and visualization, learning data geometry, denoising, imputation, inference of multi-granular structure, and inference of feature networks from big data. Her group has applied these techniques to many data types such as single-cell RNA-sequencing, mass cytometry, electronic health record, and connectomic data from a variety of systems. Specific application areas include immunology, immunotherapy, cancer, neuroscience, developmental biology and health outcomes. Smita has a Ph.D. in Computer Science and Engineering from the University of Michigan.
  • Associate Professor of Biostatistics

    Dr. Wang is an Associate Professor of Biostatistics at Yale School of Public Health. Her research focuses on combining genetics, genomics, immunology, and statistical modeling to answer biologically important questions in genetic epidemiological studies. Dr. Wang's statistical expertise lies in longitudinal data analysis, varying coefficient models, mixed effects models, kernel machine methods, mediation analysis, machine learning methods, and network analysis. She develops statistically innovative methods and computationally efficient tools in large-scale genetic and genomic studies to identify genetic susceptibility variants and advance the understanding of the etiology of complex diseases including breast cancer, alcohol and drug abuse, asthma, autism, obesity, and lung and cardiovascular diseases. Current studies include using next-generation sequencing data to detect rare genetic variants in longitudinal genetic studies, combining knowledge in genomics and immunology to understand the risk of breast cancer survival, addressing statistical challenges in single-cell RNA sequencing data and spatial transcriptomics, and machine learning for risk prediction in electronic health records data.
  • Associate Professor of Biostatistics

    Dr. Zhao is an Associate Professor in the Department of Biostatistics at Yale School of Public Health. She is also affiliated with Yale Center for Analytical Sciences, Yale Alzheimer's Disease Research Center and Yale Computational Biology and Bioinformatics. Her main research focuses on the development of statistical and machine learning methods to analyze large-scale complex data (imaging, -omics, EHRs), Bayesian methods, feature selection, predictive modeling, data integration, missing data and network analysis. She has strong interests in biomedical research areas including mental health, mental disorders, and aging. Her most recent research agenda includes analytical method development and applications on brain network analyses, multimodal imaging modeling, imaging genetics, and the integration of biomedical data with EHR data. Her research is supported by multiple NIH grants. Dr. Zhao received her Ph.D. in Biostatistics from Emory University and completed postdoctoral training at the Statistical and Applied Mathematical Sciences Institute (SAMSI) and the University of North Carolina at Chapel Hill. Prior to coming to Yale, she was an Assistant Professor in Biostatistics at Cornell University, Weill Cornell Medicine.
  • Anthony N Brady Professor of Pathology; Co-Director of Graduate Studies, Computational Biology and Bioinformatics

    Dr. Steven Kleinstein is a computational immunologist with a combination of big data analysis and immunology domain expertise. His research interests include both developing new computational methods and applying these methods to study human immune responses. Dr. Kleinstein received a B.A.S. in Computer Science from the University of Pennsylvania and a Ph.D. in Computer Science from Princeton University. He is currently Professor of Pathology (with a secondary appointment in Immunobiology) at the Yale School of Medicine, and a member of the Interdepartmental Program in Computational Biology and Bioinformatics (CBB), and the Human and Translational Immunology Program. Specific areas of research focus include high-throughput single-cell B cell receptor (BCR) repertoire profiling (AIRR-seq, Rep-seq, scRNA-seq+VDJ) and multi-omic immune signatures of human infection and vaccination responses.
  • Department Chair and Professor of Biostatistics; Affiliated Faculty, Yale Institute for Global Health; Director, Biostatistics and Bioinformatics Shared Resource

    Dr. Ma received his Ph.D. degree in statistics from the University of Wisconsin in 2004. Prior to arriving at Yale, Dr. Ma was a Senior Fellow in the Collaborative Health Studies Coordinating Center (CHSCC) and the Department of Biostatistics at the University of Washington. He has been involved in developing novel statistical and bioinformatics methodologies for analysis of cancer (NHL, breast cancer, melanoma, lung cancer), mental disorders, and cardiovascular diseases. He has also been involved in health economics research, with special interest in health insurance in developing countries.
  • Associate Professor of Genetics and of Neurosurgery

    Sidi Chen joined the Yale Faculty in 2015 in the Department of Genetics, Systems Biology Institute, and Yale Cancer Center. His research focuses on providing a global understanding of biological systems and development of novel breakthrough therapeutics. Chen developed and applied genome editing and high-throughput screening technologies, precision CRISPR-based in vivo models of cancer, and global mapping of functional drivers of cancer oncogenesis and metastasis. He is leading a research group to seek global understandings of the molecular and cellular factors controlling disease progression and immunity. His group continuously invents versatile systems that enable rapid identification of novel targets and development of new modalities of cancer immunotherapy, cell therapy and gene therapy. His goal is to uncover novel insights in cancer and various other immunological diseases and develop next generation therapeutics. Dr. Chen has received a number of national and international awards including the Pershing Square Sohn Prize, DoD Era of Hope Scholar, NIH Director’s New Innovator Award, Blavatnik Innovator Award, Yale Cancer Center Basic Science Research Prize, AACR NextGen Award for Transformative Cancer Research, Ludwig Foundation Award, Damon Runyon Cancer Research Fellow, Dale Frey Award for Breakthrough Scientists, TMKF Innovative/Translation Cancer Research Award, BCA Exceptional Research Grant Award, MRA Young Investigator Award, V Scholar, Bohmfalk Scholar, Ludwig Family Foundation Award, St. Baldrick’s Foundation Award, CRI Clinic & Laboratory Integration Program (CLIP), MIT Technology Review Top 35 Innovators (Regional), and Sontag Foundation Distinguished Scientist Award.
  • Professor of Immunobiology and Biomedical Engineering; Director, Yale Center for Systems and Engineering Immunology (CSEI)

    John Tsang is a systems immunologist, computational biologist, and engineer. He is currently Professor of Immunobiology and Biomedical Engineering at Yale University and the Founding Director of the Yale Center for Systems and Engineering Immunology (CSEI). The CSEI serves as a home and cross-departmental center of research for systems, quantitative, and synthetic immunology at Yale. Dr. Tsang earned his PhD in biophysics and systems biology from Harvard University (2008) as an NSERC Postgraduate Scholar and trained in computer engineering (BASc) and computer science (MMath) at the University of Waterloo, Canada. Dr. Tsang's group investigates why immune system statuses and responses to perturbations (e.g., to SARS-CoV-2 infections) are highly variable across individuals in the human population, i.e., the molecular and cellular underpinnings of human immune variations in health and disease. Their approach involves the development and application of computational, quantitative modeling, and experimental methods, including high-dimensional, longitudinal immune monitoring of human cohorts throughout the lifespan and around the globe, machine learning, dynamical modeling, and ex vivo experiments and animal models. As a scientific conceiver and Yale lead of CZ Biohub NY, Dr. Tsang is interested in developing a predictive immune cell engineering toolkit to program immune cells as sensors of tissue statuses (e.g., early detection of pre-clinical disease and inflammation). Towards achieving this vision, he and his colleagues are working on quantitatively dissecting the mechanisms and design principles of tissue-blood communications and immune cell trafficking, including cell-cell interaction and signal integration by immune cells in tissues. He has won multiple awards for his research, including NIH/NIAID Merit Awards recognizing his scientific leadership in systems immunology, COVID-19, and human immunology research. 
    His work on mapping human immune variations and predicting vaccination responses was selected as a Top NIAID Research Advance of 2014. Dr. Tsang has served as an advisor on systems immunology and computational biology for numerous programs and organizations, including the Allen Institute, World Allergy Organization, National Cancer Institute, National Institute of Allergy and Infectious Diseases, National Institute of Diabetes and Digestive and Kidney Diseases, and the Fred Hutchinson Cancer Center. He currently serves on the Editorial Board of PLOS Biology and the Scientific Advisory Board of NIAID ImmPort, the NIAID Influenza IMPRINT Program, the NIH Common Fund Cellular Senescence Network (SenNet), Vaccine and Immunology Statistical Center of the Gates Foundation, the Human Immunome Project, ImmunoScape Inc., and CytoReason Ltd. He has lectured at many meetings and academic institutions and was lead organizer of major scientific conferences, including Keystone and Cold Spring Harbor Laboratory meetings on systems and engineering immunology. Prior to joining Yale, Dr. Tsang was a tenured Senior Investigator in the National Institutes of Health's Intramural Research Program and led a laboratory focusing on systems and quantitative immunology at the National Institute of Allergy and Infectious Diseases (NIAID). He was the Co-Director of the Trans-NIH Center for Human Immunology (CHI) and led its research program in systems human immunology. He remains an Adjunct Investigator at NIAID.
  • Professor of Genetics, Director of the Yale Center for Genomic Health

    Dr. Hall's research career spans the fields of genetics, genomics, bioinformatics and data science. He received a B.A. in Integrative Biology from the University of California at Berkeley (1998), and worked as a technician for 2 years in Sarah Hake's plant genetics group at the USDA/ARS Plant Gene Expression Center. He received his Ph.D. in genetics from Cold Spring Harbor Laboratory (2003), where his work in Shiv Grewal's laboratory established the first direct link between RNA interference and chromatin-based epigenetic inheritance. As a postdoc with Michael Wigler (2004) and independent Cold Spring Harbor Laboratory Fellow (2004-2007), Dr. Hall used microarray technologies and mouse strain genealogies to conduct the first systematic study of DNA copy number variation hotspots. As a faculty member at the University of Virginia (2007-2014), Washington University (2014-2020) and Yale (2020-present), his work has sought to understand the causes and consequences of genome variation in mammals, with an increasing focus on computational methods development and human genetics. His group has developed bioinformatics tools for variant detection, variant interpretation, sequence alignment, data processing, and data integration. He has led genome-wide studies of human genome variation, heritable gene expression variation, human genetic disorders, tumor evolution, mouse strain variation, genome stability in reprogrammed stem cells, and single-neuron somatic mosaicism in the human brain. Dr. Hall's work has been featured in Science Magazine's Breakthrough of the Year (2003 & 2007), the NIMH Director's "Ten Best of 2013" and The Scientist (2013), and he has received several prestigious awards including the AAAS Newcomb Cleveland Prize (2003), the Burroughs Wellcome Fund Career Award (2006), the NIH Director's New Innovator Award (2009), and the March of Dimes Basil O'Connor Research Award (2010). 
    He has also served as an Associate Editor at Genome Research (2009-2014) and Genes, Genomes and Genetics (2011-2018). Most recently, Dr. Hall has played a leadership role in several large collaborative projects funded by NIH/NHGRI including the Centers for Common Disease Genomics, the AnVIL cloud-based data repository and analysis platform, and the Human Pangenome Project. His current work is focused on two broad goals: (1) mapping variants and genes that confer risk to human disease, with ongoing projects focused on coronary artery disease and cardiometabolic traits in unique and underrepresented populations, and (2) developing methods for the detection and interpretation of human genome variation, with an emphasis on structural variation and other difficult-to-detect forms, and on comprehensive trait association in human disease studies.
  • Assistant Professor of Genetics

    Steven Reilly received his B.S. in Biology from Carnegie Mellon University in 2009. Motivated by the rapid emergence of new technologies to map full epigenomes, he joined Jim Noonan's Lab in the Genetics Department of Yale School of Medicine. There he built gene regulatory maps of the developing human, rhesus, and mouse cortex to identify changes underlying unique aspects of human brain morphology and cognitive abilities. Steve received his Ph.D. in 2015 and then joined the laboratory of Pardis Sabeti at the Broad Institute of Harvard and MIT to interrogate the function of genetic variants at the intersection of natural selection and human disease. As evolutionary adaptive genetic variants have been shown to underlie diversity in disease risk and morphology across human populations, the lens of evolution remains a powerful, yet underutilized method for understanding human biology. He is specifically interested in furthering our understanding of non-coding variation, the main cache of human genetic diversity. He has created novel machine-learning methods to predict the subset of human variants under selection that are functional, and experimental methods to characterize variants in a massively parallel fashion. Steve has developed endogenous CRISPR perturbation methods and synthetic DNA technologies coupled with genomic readouts to directly assess the cellular phenotypes of non-coding alleles. Steve joined the Yale Department of Genetics as an Assistant Professor in September 2021. The Reilly lab develops and applies new high-throughput experimental approaches to interrogate the genome, such as non-coding CRISPR screens and the Massively Parallel Reporter Assay. Computationally, we also develop machine-learning approaches to predict the functions of these cis-regulatory element (CRE) perturbations. Together with these new tools, we use evolution as a powerful lens for characterizing genomic signals of positive selection that impact modern human phenotypes and diseases.
    The lab has three main foci: developing new, large-scale experimental screens to perturb CREs and new computational tools to model their function; identifying evolutionary adaptive alleles likely impacting modern human phenotypes; and applying these functional genomic tools to phenotypically interesting loci important for human disease and evolution.
  • Associate Professor of Medicine (Pulmonary, Critical Care and Sleep Medicine); Director of Data Analysis and Bioinformatics Hub, The Center for Precision Pulmonary Medicine (P2MED); Assistant Professor, Biostatistics

    Dr. Yan received doctoral degrees in both applied statistics and computational biology and bioinformatics. She is interested in genetics, genomics, computational biology, biostatistics, systems biology and bioinformatics. Her current research topics include (1) understanding disease heterogeneity and pathogenesis using large-scale omics data at both bulk and single-cell resolution and (2) developing novel statistical and computational methods for analyzing different types of omics data and integrating them with drug perturbation data for potential personalized treatment design.

Boehringer Ingelheim Mentors

  • Global Computational Biology and Digital Sciences

    Gregorio Alanis-Lobato obtained his MSc in Computer Science and his PhD in Computational Biology at KAUST, developing methods to predict protein-protein interactions based on the topological structure of experimentally derived networks. Then, he moved to the Johannes Gutenberg University in Mainz, Germany to continue his research on this topic as a postdoctoral fellow and to enhance the functionality of HIPPIE, a webtool to construct reliable and context-specific human protein networks. This was followed by a second postdoc at the Francis Crick Institute in London, where he worked on the integration of different single-cell omics data modalities for the construction of gene regulatory networks in early human embryos. In addition, he developed computational pipelines to assess whether CRISPR-Cas9-targeted human preimplantation embryos had unintended on-target mutations based on single-cell genomics and transcriptomics datasets. Gregorio joined BI in late 2020 to support pre-clinical research for the CNS Diseases and Research Beyond Borders therapeutic areas with his expertise in omics data analysis and integration.
  • Senior Principal Scientist, Global Computational Biology and Digital Sciences

    Dr. Frank Li is a Senior Principal Scientist in the Department of Global Computational Biology and Digital Sciences, Boehringer Ingelheim, Ridgefield, CT. In this role, he oversees bioinformatics analyses using systems biology approaches to better understand and/or characterize the pathological mechanisms contributing to the onset and progression of autoimmune diseases, including Inflammatory Bowel Disease, Systemic Sclerosis, and Idiopathic Pulmonary Fibrosis. His group is using holistic “omics” approaches to facilitate discoveries of novel therapeutic concepts, determination of novel biomarkers, and patient stratification and enrichment supporting clinical drug development. He obtained his Ph.D. from the University of North Carolina at Chapel Hill, with doctoral research focused on the breakdown of immunological tolerance of autoreactive CD4+ T cells in autoimmune Type I diabetes. He then continued his postdoctoral studies at Harvard Medical School, where he worked in the field of T cell tolerance, attempting to decipher the molecular and cellular mechanisms that control T cell differentiation, and the molecular mechanisms controlling the plasticity of FOXP3+ Treg cells using both canonical immunological and systems biology approaches.
  • Head of Human Genetics

    Dr. Zhihao Ding completed his PhD training at the Wellcome Trust Sanger Institute, the University of Cambridge, studying the genetics of cellular traits and genomic algorithms. He then held a postdoctoral position at the Wellcome Trust Centre for Human Genetics, University of Oxford, studying the genetics of rare diseases and cancers. In 2015, he transitioned to industry, working at Genomics PLC, Oxford, UK, where he developed algorithmic solutions and products for rare disease diagnostics. He led work packages for a panel of pharmaceutical companies in evaluating specific targets in disease areas of therapeutic interest. Before joining Boehringer Ingelheim (BI), Dr. Zhihao was leading target projects on NASH at Novo Nordisk Research Centre Oxford (NNRCO). Dr. Zhihao was one of the first genetic scientists to join the NNRCO, where he helped build the genetic capacity and initiated several academic research collaborations with the Big Data Institute, University of Oxford. Dr. Zhihao joined BI as the Head of Human Genetics in July 2020, where he’s leading genetic initiatives in gCBDS.
  • Computational Biology Expert Lead

    Dr. Di Feng is a senior member of GCBDS (Global Computational Biology and Digital Science), working at Boehringer Ingelheim's US headquarters in Ridgefield, CT on computational drug discovery research. He is a Computational Biology professional with substantial multidisciplinary expertise in Computational Immunology, Pathology, and Machine Intelligence applications for drug discovery. Dr. Feng managed the Artificial Intelligence and Machine Learning partnership with The Center of Computational Imaging and Personalized Diagnostics (CCIPD) at Case Western's University Hospital, the Cleveland Medical Center. Dr. Feng has also contributed open source software tools such as Single Cell Explorer, a platform to facilitate collaboration between computational biologists and experimental scientists. Dr. Feng led computational projects for a small molecule and biological drug program from early research to clinical trials. He has worked with and led teams to solve complex research challenges using computational approaches across multiple therapeutic areas such as Cancer Immunology, Immunomodulation, Immunology and Respiratory, and Cardiometabolic diseases. He received his Ph.D. from Rutgers - Graduate School of Biomedical Sciences, studying the basic and clinical biology of plasmacytoid dendritic cells, followed by postdoctoral research on autoimmune and cancer susceptibility genes, integrating bioinformatics with wet lab science. He also earned his medical degree from Shanghai Jiao Tong University School of Medicine. Prior to joining Boehringer Ingelheim, he developed therapeutics supported by Lupus Research Alliance. LinkedIn
  • Team Lead Statistical Modeling

    Johann de Jong obtained his PhD in Computational Cancer Biology from Delft University of Technology. Since then, he has gained experience in a wide variety of domains, ranging from gene regulation and chromatin biology to cancer research and neurological disorders, in both academia (the Netherlands Cancer Institute) and industry (BASF, UCB Pharma). He currently leads the Statistical Modeling team within Human Genetics at Boehringer Ingelheim, which focuses on developing and applying novel machine learning and statistical models for biomarker/target identification and patient stratification by integrating prior knowledge with multi-modal and longitudinal data sources including human biobanks. LinkedIn Google Scholar
  • Principal Scientist, Global Computational Biology and Digital Sciences

    Zuojian Tang is a principal scientist in Global Computational Biology and Data Sciences at Boehringer Ingelheim (BI) with extensive experience in both computational biology and bioinformatics engineering. Prior to joining BI, she worked as a bioinformatics engineer at Memorial Sloan Kettering Cancer Center. She also spent about 10 years at New York University Langone Health as a senior research scientist. Zuojian has designed and developed widely recognized and adopted analysis methods and systems for various computational biology applications. She has published more than 35 peer-reviewed full-length papers with more than 3000 citations. Zuojian received her Ph.D. in Systems and Computational Biomedicine from New York University, her Master of Computer Science from McGill University, Canada, and her Bachelor of Engineering in China. LinkedIn Google Scholar
  • Principal Scientist in Human Genetics

    Ingrid joined BI in February 2023 as a Principal Scientist in Human Genetics. She leads target discovery projects together with partners of key disease areas and contributes to driving collaborations with both internal and external partners. Her work also focuses on building translational capabilities in using multi-omics for biomarkers and patient stratification. Ingrid received her PhD in Bioinformatics in Germany from the University of Lübeck before becoming a group leader supervising master students and PhD candidates at the same university. During her time in Lübeck, she mainly focused on the genetics underlying complex diseases. She then moved to the US to work as a senior scientist at the University of Virginia’s Center for Public Health Genomics focusing on Type 1 Diabetes. Before joining BI, Ingrid worked at AZ in the field of immuno-oncology where she was leading the Computational Pathology strategy for several drug and cell therapy programs. LinkedIn Google Scholar
  • Associate Scientific Director – Team Lead; Global Computational Biology and Digital Sciences

    Markus Bauer studied computer science at the Technical University Vienna, graduating with an MSc in Technical Computer Science in 2004. Markus then joined the International Max Planck Research School for Computational Biology and Scientific Computing (IMPRS-CBSC, now IMPRS for Biology and Computation) as a PhD student, where he worked on algorithms for performing sequence-structure alignments. In 2008, he started as a research scientist at Illumina Cambridge, contributing to projects ranging from algorithms (sequencing mapping, BWT construction) and whole-genome studies (sequencing of the Tasmanian devil) to assay development (TruSeq Amplicon Cancer Panel). He joined Boehringer Ingelheim in 2012 as a research scientist in the Therapeutic Area of Cancer Research, working on various pre-clinical programs as well as different data assets. Since 2022, Markus has led a team within the Global Computational Biology and Digital Sciences department (gCBDS) focusing on data engineering and advanced technology stacks. LinkedIn
  • Principal Scientist, Global Computational Biology and Digital Sciences

    Daniel joined Boehringer Ingelheim at the end of 2021. He is a Principal Scientist and Partner for Obesity research. His focus is on integration and interpretation of human multi-omic data with the aim of better understanding obesity biology and identifying new therapeutic targets. Daniel obtained his PhD in obesity neurobiology from the University of Cambridge. He then pursued postdoctoral studies at the University of Michigan in mouse genetics, and at Stanford University in functional genomics. He then moved to the Helmholtz Centre in Munich, where he did functional genomic research on neurological disorders. Directly before joining BI, he worked at the Nucleic Acid Therapy Accelerator in Harwell, England. LinkedIn Google Scholar
  • Principal Scientist, Global Computational Biology and Digital Sciences

    Alexandra Popa joined Boehringer Ingelheim as a Principal Scientist in the Computational Department in 2020. Her research is focused on the identification of new targets as well as the establishment of strong biomarkers in the field of oncology. Prior to her current position, she worked at the CeMM Institute in Austria (studying the evolution and impact of the SARS-CoV-2 pandemic, the Tasmanian Devil transmissible cancer, and virus-induced liver immune-metabolism), the IPMC Institute in France (investigating the translational mechanism of proteins in cells and profiling immune cell populations in cutaneous squamous cell carcinoma), and the LBBE laboratory in France (examining the mechanisms of transcription processing during alternative splicing). Dr. Popa obtained her PhD from the Universite Claude Bernard Lyon 1 in France on the topic of recombination-induced genome evolution changes. LinkedIn Google Scholar
  • Senior Scientist in the Department of Global Computational Biology and Digital Sciences

    Dr. Katja Koeppen is a Senior Scientist in the Department of Global Computational Biology and Digital Sciences at Boehringer Ingelheim in Ridgefield, Connecticut. Her focus is on research and drug target discovery in immunology and respiratory diseases. She is currently developing a new gene prioritization algorithm to accelerate the identification of novel drug targets. Dr. Koeppen was originally trained as a biochemist and molecular biologist and obtained her PhD from the University of Tuebingen in Germany. Over 10 years ago, she started transitioning from the wet lab to computational biology during her research on Cystic Fibrosis at Dartmouth College. Dr. Koeppen has extensive experience analyzing complex data sets and teaching these skills to others. She has developed and published several web applications that enable data analysis by scientists without computational skills. She is passionate about harnessing high throughput data to gain a better understanding of disease mechanisms and to identify novel drug targets and treatment options. LinkedIn Google Scholar