Computational Genomics
A central problem in bioinformatics is the analysis of genomic information, culminating in the study of an individual person's genome (personal genomics). Research in genomic analysis includes annotating genomes (for example, coding and functional non-coding regions) through both computational and experimental approaches, dissecting gene regulatory networks and signaling pathways, identifying disease-causing genes and variants (e.g., in neuroscience), and analyzing cancer genomes.
Macromolecular Structure and High Resolution Imaging
Fundamentally, the genome encodes the structures of macromolecules, the machines that carry out the work of the cell. Determining structures involves analysis of high-resolution imaging data from techniques such as cryo-electron microscopy. Analyzing structures involves dealing with complex 3D shapes and simulating them based on physical principles. One of the grand challenges in computational biology is the ab initio prediction of protein structures, along with the elucidation of structure-to-function relationships.
Computational and Systems Immunology
Computational and systems immunology involves the development and application of bioinformatics methods, mathematical models, and statistical and machine learning techniques for the study of immune system biology. Systems approaches can be used to predict how the immune system will respond to a particular perturbation such as infection or vaccination, to infer and model the molecular and cellular networks that underlie immune responses and their dynamics, and to understand and rationally design effective immunotherapies. Computational approaches are also increasingly vital for transforming the wealth of multi-omics data gathered from immune cells into biological insights. More generally, the integration of machine learning with dynamical and mechanistic modeling is critical for achieving a predictive and quantitative understanding of the immune system. In addition, computational approaches are vital to the emerging field of synthetic immunology, including the rational engineering of immune cells to serve as biosensors and therapeutic effectors.
AI Models, Distributed Analytics, and AI Model Evaluation
Repositories holding large amounts of data from patients or study participants (in particular, electronic health records) need to be reconceptualized. For example, they may require researchers to compute within their enclaves. To build and/or evaluate models across several repositories, new approaches for distributed data analysis are necessary. From model construction and testing to internal and external validation, these methods produce results that are often indistinguishable from those that would have been obtained by centralizing all data in one place. Distributed model evaluation methods address classification performance, model calibration, and the bias that certain models can introduce. Analyzing clinical and population health data, in addition to molecular data, is important for extracting patterns, optimizing workflows, and building reliable predictive models using AI.
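As a minimal sketch of why distributed analysis can match a centralized result, consider ordinary least squares with a single predictor. Each repository computes a handful of sums inside its own enclave and shares only those aggregates; a coordinator adds them up and solves for the slope and intercept. All names here (`SiteSummary`, `summarize_site`, `fit_from_summaries`) and the toy data are hypothetical, chosen for illustration rather than taken from any particular system.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SiteSummary:
    """Aggregate statistics a site shares instead of patient-level rows."""
    n: int
    sum_x: float
    sum_y: float
    sum_xx: float
    sum_xy: float


def summarize_site(xs: Sequence[float], ys: Sequence[float]) -> SiteSummary:
    """Runs inside a site's enclave; only these sums leave the site."""
    return SiteSummary(
        n=len(xs),
        sum_x=sum(xs),
        sum_y=sum(ys),
        sum_xx=sum(x * x for x in xs),
        sum_xy=sum(x * y for x, y in zip(xs, ys)),
    )


def fit_from_summaries(summaries: List[SiteSummary]) -> Tuple[float, float]:
    """Coordinator: combine site summaries into a least-squares (slope, intercept)."""
    n = sum(s.n for s in summaries)
    sx = sum(s.sum_x for s in summaries)
    sy = sum(s.sum_y for s in summaries)
    sxx = sum(s.sum_xx for s in summaries)
    sxy = sum(s.sum_xy for s in summaries)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("predictor has no variance; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical toy data held by three separate repositories.
SITES = [
    ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2]),
    ([4.0, 5.0], [8.1, 9.8]),
    ([6.0, 7.0, 8.0, 9.0], [12.2, 13.9, 16.1, 18.0]),
]

# Distributed fit: each site contributes only its summary.
distributed = fit_from_summaries([summarize_site(xs, ys) for xs, ys in SITES])

# Reference fit: pool every row in one place (what the distributed approach avoids).
pooled_x = [x for xs, _ in SITES for x in xs]
pooled_y = [y for _, ys in SITES for y in ys]
centralized = fit_from_summaries([summarize_site(pooled_x, pooled_y)])

print("distributed:", distributed)
print("centralized:", centralized)
```

The two fits agree up to floating-point rounding because the required sums are additive across sites. Models without such closed-form summaries, such as logistic regression, typically need several rounds of exchanging aggregates (for example, gradients) between the sites and the coordinator.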
Machine Learning Techniques and Efficient Algorithms
Many theoretical and practical problems in the biological and biomedical sciences require unique algorithmic and computational solutions, involving machine learning, deep learning, combinatorial optimization, signal processing, and high-performance computing. For example, even simple processing of the extremely large-scale data generated by state-of-the-art genomics facilities requires considerable software and hardware development.