Research & Publications
My current research focuses on two areas: (1) developing novel statistical and computational models to analyze large-scale omics and drug perturbation data to better understand disease pathogenesis, and (2) understanding the heterogeneity and pathogenesis of pulmonary diseases, such as asthma, idiopathic pulmonary fibrosis (IPF), sarcoidosis, and pediatric cystic fibrosis, by tailoring statistical and computational methods to existing biological knowledge of these diseases.
My team has been involved in multiple transcriptomic studies of asthma, IPF, sarcoidosis, cystic fibrosis, and lung injury in pediatric patients undergoing cardiopulmonary bypass procedures. We have analyzed the various types of large-scale transcriptomic data generated by these studies, including microarray gene expression data, bulk RNA sequencing data, single-cell RNA sequencing data, T cell receptor repertoire data, and 16S rRNA sequencing data. For each study, we tailored our computational and statistical analysis to existing biological knowledge of the corresponding disease or condition. These analyses have led to discoveries about heterogeneity in asthma pathogenesis, cell type-specific changes in asthma patients, heterogeneity and molecular biomarker of sarcoidosis, rare cell populations specific to IPF and COPD, and potential antigen-specific T cell clones for SARS-CoV-2 infection (COVID-19) in adults, among others. My team is currently working with physicians and basic scientists to make further, more translational discoveries in these and other pulmonary diseases.
During our extensive analyses of the various types of omics data generated by our collaborators, my team also identifies unmet computational and statistical needs and develops novel methods to address them. For example, in analyzing single-cell RNA sequencing (scRNA-seq) data, we found prevalent dropout events and dominant subject variation, neither of which was well addressed by existing differential expression (DE) methods designed for scRNA-seq data. We developed iDESC to identify cell type-specific differentially expressed genes from scRNA-seq data with multiple subjects. The development of such tools has further strengthened our capacity to analyze different types of omics data to better understand disease pathogenesis.
Extensive Research Description
Analysis of single time-point microarray and longitudinal bulk RNA sequencing data from asthma patients;
Analysis of longitudinal microbiome sequencing data from children with cystic fibrosis;
Bulk RNA sequencing of IPF, A1AT and SARC patients using Ion Torrent technology;
Single-cell RNA sequencing data analysis;
Spatial single-cell RNA sequencing data analysis.
Genetics; Lung Diseases; Respiratory Hypersensitivity; Computational Biology; Genomics; Biostatistics; Molecular Medicine
Public Health Interests
Bioinformatics; Biomarkers; Genetics, Genomics, Epigenetics; Microbial Ecology; Modeling