Lung Diseases; Respiratory Hypersensitivity
Public Health Interests
Bioinformatics; Biomarkers; Biostatistics; Data analysis; Genetic epidemiology; Genetics; Genomics; Microarray; Microbial Ecology; Risk assessment; Statistical genetics; Statistical models
Pulmonary, Critical Care & Sleep Medicine: Kaminski Lab
My current research focuses on developing novel statistical and computational models to analyze large-scale genetic and genomic data from patients with chronic lung diseases, including asthma, idiopathic pulmonary fibrosis (IPF), sarcoidosis and pediatric cystic fibrosis.
In the asthma study, conducted in collaboration with Dr. Geoffrey Chupp, we used gene expression data from induced sputum and blood to identify three subtypes of asthma, or TEA clusters: patients at high risk of near-fatal asthma attacks, patients with severe asthma symptoms, and patients with milder asthma. In addition, by analyzing gene expression in the blood, we could design a blood test that identifies a patient's asthma subtype and helps optimize the choice of treatment or drugs. Ultimately, this could lead to personalized treatment for asthma patients. To achieve these results, we developed a novel pathway-based clustering method; compared with traditional pathway-based clustering methods on both simulated and real datasets, it showed better robustness and accuracy. Currently, longitudinal induced sputum and whole blood samples are being collected from patients and prepared for RNA sequencing. To analyze these data, we are developing novel statistical and computational approaches that extract genetic information from the longitudinal RNA sequencing data and integrate it with the transcriptional profiles from the same data set to identify time-invariant molecular endotypes of asthma.
In the study on IPF and sarcoidosis, conducted in collaboration with Dr. Naftali Kaminski, we are trying to understand the genomics and genetics of these patients. Second-generation sequencing technology was used to measure both gene expression levels and sequence mutations in the patients. My computational team is currently preprocessing and analyzing these sequencing data to better understand disease heterogeneity and pathogenesis using network analysis, data integration analysis and longitudinal data analysis.
In the study on pediatric cystic fibrosis, patients complete weekly surveys and attend clinical visits at which they provide sputum and stool samples. These samples were sequenced to understand which bacteria are present, how they change over time and whether they behave differently in children with and without cystic fibrosis. My computational team is currently developing statistical and computational approaches to analyze the longitudinal 16S rRNA sequencing data.
Extensive Research Description
Analysis of longitudinal RNA sequencing data from asthma patients;
Analysis of longitudinal gene expression data from asthma patients under bronchial thermoplasty procedures;
Analysis of longitudinal microbiome sequencing data from children with cystic fibrosis;
RNA sequencing of IPF, A1AT and SARC patients using Ion Torrent technology;
Single-cell RNA sequencing data analysis.
A novel pathway-based distance score enhances assessment of disease heterogeneity in gene expression.
X. Yan, A. Liang, J. Gomez, L. Cohn, H. Zhao, G.L. Chupp (2017) A novel pathway-based distance score enhances assessment of disease heterogeneity in gene expression. BMC Bioinformatics, 18: 309.
Non-invasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma.
Yan, X., Chu, J., Gomez, J., Koenigs, M., Holm, C., He, X., Perez, M.F., Zhao, H., Mane, S., Martinez, F.D., Ober, C., Nicolae, D.L., Barnes, K.C., London, S.J., Gilliland, F., Weiss, S.T., Raby, B.A., Cohn, L., Chupp, G.L. (2015) Non-invasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma. American Journal of Respiratory and Critical Care Medicine, 191(10):1116-25.
Modeling RNA degradation for RNA-Seq with applications.
L. Wan, X. Yan, T. Chen, F. Sun (2012) Modeling RNA degradation for RNA-Seq with applications. Biostatistics, 13(4):734-47.
Detecting functional rare variants by collapsing and incorporating functional annotation in Genetic Analysis Workshop 17 mini-exome data.
X. Yan, L. Li, JS. Lee, W. Zheng, J. Ferguson, H. Zhao (2011) Detecting functional rare variants by collapsing and incorporating functional annotation in Genetic Analysis Workshop 17 mini-exome data. BMC Proceedings, 5(Suppl 9):S27.
Dealing with high dimensionality for the identification of common and rare variants as main effects and for gene-environment interaction.
H. Bickeböller, J.J. Houwing-Duistermaat, X. Wang, X. Yan (2011) Dealing with high dimensionality for the identification of common and rare variants as main effects and for gene-environment interaction. Genetic Epidemiology, 35(Suppl 1):S35-40.
Testing gene set enrichment for subset of genes: Sub-GSE.
X. Yan, F. Sun (2008) Testing gene set enrichment for subset of genes: Sub-GSE. BMC Bioinformatics, 9:362.
Detecting differentially expressed genes by relative entropy.
X. Yan, M. Deng, W.K. Fung, M. Qian (2005) Detecting differentially expressed genes by relative entropy. Journal of Theoretical Biology, 234(3):395-402.
Full List of PubMed Publications
- Gomez JL, Yan X, Holm CT, Grant N, Liu Q, Cohn L, Nezgovorova V, Meyers DA, Bleecker ER, Crisafi GM, Jarjour NN, Rogers L, Reibman J, Chupp GL, SARP Investigators.: Characterisation of asthma subgroups associated with circulating YKL-40 levels. Eur Respir J. 2017 Oct; 2017 Oct 12. PMID: 29025889
- Zinchuk AV, Jeon S, Koo BB, Yan X, Bravata DM, Qin L, Selim BJ, Strohl KP, Redeker NS, Concato J, Yaggi HK: Polysomnographic phenotypes and their cardiovascular implications in obstructive sleep apnoea. Thorax. 2017 Sep 21. PMID: 28935698
- Herazo-Maya JD, Sun J, Molyneaux PL, Li Q, Villalba JA, Tzouvelekis A, Lynn H, Juan-Guardela BM, Risquez C, Osorio JC, Yan X, Michel G, Aurelien N, Lindell KO, Klesen MJ, Moffatt MF, Cookson WO, Zhang Y, Garcia JGN, Noth I, Prasse A, Bar-Joseph Z, Gibson KF, Zhao H, Herzog EL, Rosas IO, Maher TM, Kaminski N: Validation of a 52-gene risk profile for outcome prediction in patients with idiopathic pulmonary fibrosis: an international, multicentre, cohort study. Lancet Respir Med. 2017 Sep 20. PMID: 28942086
- Yan X, Liang A, Gomez J, Cohn L, Zhao H, Chupp GL: A novel pathway-based distance score enhances assessment of disease heterogeneity in gene expression. BMC Bioinformatics. 2017 Jun 20. PMID: 28637421
- Vukmirovic M, Herazo-Maya JD, Blackmon J, Skodric-Trifunovic V, Jovanovic D, Pavlovic S, Stojsic J, Zeljkovic V, Yan X, Homer R, Stefanovic B, Kaminski N: Identification and validation of differentially expressed transcripts by RNA-sequencing of formalin-fixed, paraffin-embedded (FFPE) lung tissue from patients with Idiopathic Pulmonary Fibrosis. BMC Pulm Med. 2017 Jan 12. PMID: 28081703
- Nezgovorova V, Liu Q, Hu B, Villalobos JL, Yan X, Niu N, Holm C, Grant NP, Marone S, Ravage-Mass L, Lee CG, Elias JA, Cohn L, Chupp GL: Sputum Gene Expression of IL-13 Receptor α2 Chain Correlates with Airflow Obstruction and Helper T-Cell Type 2 Inflammation in Asthma. Ann Am Thorac Soc. 2016 Mar. PMID: 27027964
- Yan X, Chu JH, Gomez J, Koenigs M, Holm C, He X, Perez MF, Zhao H, Mane S, Martinez FD, Ober C, Nicolae DL, Barnes KC, London SJ, Gilliland F, Weiss ST, Raby BA, Cohn L, Chupp GL: Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma. Ann Am Thorac Soc. 2016 Mar. PMID: 27027945
- Yan X, Chu JH, Gomez J, Koenigs M, Holm C, He X, Perez MF, Zhao H, Mane S, Martinez FD, Ober C, Nicolae DL, Barnes KC, London SJ, Gilliland F, Weiss ST, Raby BA, Cohn L, Chupp GL: Noninvasive analysis of the sputum transcriptome discriminates clinical phenotypes of asthma. Am J Respir Crit Care Med. 2015 May 15. PMID: 25763605
- Nish SA, Schenten D, Wunderlich FT, Pope SD, Gao Y, Hoshi N, Yu S, Yan X, Lee HK, Pasman L, Brodsky I, Yordy B, Zhao H, Brüning J, Medzhitov R: T cell-intrinsic role of IL-6 signaling in primary and memory responses. Elife. 2014 May 19. PMID: 24842874
- Brownstein CA, Beggs AH, Homer N, Merriman B, Yu TW, Flannery KC, DeChene ET, Towne MC, Savage SK, Price EN, Holm IA, Luquette LJ, Lyon E, Majzoub J, Neupert P, McCallie D Jr, Szolovits P, Willard HF, Mendelsohn NJ, Temme R, Finkel RS, Yum SW, Medne L, Sunyaev SR, Adzhubey I, Cassa CA, de Bakker PI, Duzkale H, Dworzyński P, Fairbrother W, Francioli L, Funke BH, Giovanni MA, Handsaker RE, Lage K, Lebo MS, Lek M, Leshchiner I, MacArthur DG, McLaughlin HM, Murray MF, Pers TH, Polak PP, Raychaudhuri S, Rehm HL, Soemedi R, Stitziel NO, Vestecka S, Supper J, Gugenmus C, Klocke B, Hahn A, Schubach M, Menzel M, Biskup S, Freisinger P, Deng M, Braun M, Perner S, Smith RJ, Andorf JL, Huang J, Ryckman K, Sheffield VC, Stone EM, Bair T, Black-Ziegelbein EA, Braun TA, Darbro B, DeLuca AP, Kolbe DL, Scheetz TE, Shearer AE, Sompallae R, Wang K, Bassuk AG, Edens E, Mathews K, Moore SA, Shchelochkov OA, Trapane P, Bossler A, Campbell CA, Heusel JW, Kwitek A, Maga T, Panzer K, Wassink T, Van Daele D, Azaiez H, Booth K, Meyer N, Segal MM, Williams MS, Tromp G, White P, Corsmeier D, Fitzgerald-Butt S, Herman G, Lamb-Thrush D, McBride KL, Newsom D, Pierson CR, Rakowsky AT, Maver A, Lovrečić L, Palandačić A, Peterlin B, Torkamani A, Wedell A, Huss M, Alexeyenko A, Lindvall JM, Magnusson M, Nilsson D, Stranneheim H, Taylan F, Gilissen C, Hoischen A, van Bon B, Yntema H, Nelen M, Zhang W, Sager J, Zhang L, Blair K, Kural D, Cariaso M, Lennon GG, Javed A, Agrawal S, Ng PC, Sandhu KS, Krishna S, Veeramachaneni V, Isakov O, Halperin E, Friedman E, Shomron N, Glusman G, Roach JC, Caballero J, Cox HC, Mauldin D, Ament SA, Rowen L, Richards DR, San Lucas FA, Gonzalez-Garay ML, Caskey CT, Bai Y, Huang Y, Fang F, Zhang Y, Wang Z, Barrera J, Garcia-Lobo JM, González-Lamuño D, Llorca J, Rodriguez MC, Varela I, Reese MG, De La Vega FM, Kiruluta E, Cargill M, Hart RK, Sorenson JM, Lyon GJ, Stevenson DA, Bray BE, Moore BM, Eilbeck K, Yandell M, Zhao H, Hou L, Chen X, Yan X, Chen M, Li C, Yang C, 
Gunel M, Li P, Kong Y, Alexander AC, Albertyn ZI, Boycott KM, Bulman DE, Gordon PM, Innes AM, Knoppers BM, Majewski J, Marshall CR, Parboosingh JS, Sawyer SL, Samuels ME, Schwartzentruber J, Kohane IS, Margulies DM: An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge. Genome Biol. 2014 Mar 25. PMID: 24667040
- Schenten D, Nish SA, Yu S, Yan X, Lee HK, Brodsky I, Pasman L, Yordy B, Wunderlich FT, Brüning JC, Zhao H, Medzhitov R: Signaling through the adaptor molecule MyD88 in CD4+ T cells is required to overcome suppression by regulatory T cells. Immunity. 2014 Jan 16. PMID: 24439266
- Sehgal K, Guo X, Koduru S, Shah A, Lin A, Yan X, Dhodapkar KM: Plasmacytoid dendritic cells, interferon signaling, and FcγR contribute to pathogenesis and therapeutic response in childhood immune thrombocytopenia. Sci Transl Med. 2013 Jul 10. PMID: 23843450
- Wan L, Yan X, Chen T, Sun F: Modeling RNA degradation for RNA-Seq with applications. Biostatistics. 2012 Sep; 2012 Feb 21. PMID: 22353193
- Li G, Ferguson J, Zheng W, Lee JS, Zhang X, Li L, Kang J, Yan X, Zhao H: Large-scale risk prediction applied to Genetic Analysis Workshop 17 mini-exome sequence data. BMC Proc. 2011 Nov 29. PMID: 22373389
- Yan X, Li L, Lee JS, Zheng W, Ferguson J, Zhao H: Detecting functional rare variants by collapsing and incorporating functional annotation in Genetic Analysis Workshop 17 mini-exome data. BMC Proc. 2011 Nov 29. PMID: 22373324
- Li L, Zheng W, Lee JS, Zhang X, Ferguson J, Yan X, Zhao H: Collapsing-based and kernel-based single-gene analyses applied to Genetic Analysis Workshop 17 mini-exome data. BMC Proc. 2011; 2011 Nov 29. PMID: 22373309
- Kang J, Zheng W, Li L, Lee JS, Yan X, Zhao H: Use of Bayesian networks to dissect the complexity of genetic disease: application to the Genetic Analysis Workshop 17 simulated data. BMC Proc. 2011 Nov 29. PMID: 22373110
- Lee JS, Choi M, Yan X, Lifton RP, Zhao H: On optimal pooling designs to identify rare variants through massive resequencing. Genet Epidemiol. 2011 Apr; 2011 Jan 19. PMID: 21254222
- Bickeböller H, Houwing-Duistermaat JJ, Wang X, Yan X: Dealing with high dimensionality for the identification of common and rare variants as main effects and for gene-environment interaction. Genet Epidemiol. 2011. PMID: 22128056
- Duan S, Wan L, Fu WJ, Pan H, Ding Q, Chen C, Han P, Zhu X, Du L, Liu H, Chen Y, Liu X, Yan X, Deng M, Qian M: Nonlinear cooperation of p53-ING1-induced bax expression and protein S-nitrosylation in GSNO-induced thymocyte apoptosis: a quantitative approach with cross-platform validation. Apoptosis. 2009 Feb. PMID: 19082896
- Pregizer S, Baniwal SK, Yan X, Borok Z, Frenkel B: Progressive recruitment of Runx2 to genomic targets despite decreasing expression during osteoblast differentiation. J Cell Biochem. 2008 Nov 1. PMID: 18821584
- Yan X, Sun F: Testing gene set enrichment for subset of genes: Sub-GSE. BMC Bioinformatics. 2008 Sep 2. PMID: 18764941
- Jia L, Berman BP, Jariwala U, Yan X, Cogan JP, Walters A, Chen T, Buchanan G, Frenkel B, Coetzee GA: Genomic androgen receptor-occupied regions with different functions, defined by histone acetylation, coregulators and transcriptional capacity. PLoS One. 2008; 2008 Nov 10. PMID: 18997859
- Cheng C, Yan X, Sun F, Li LM: Inferring activity changes of transcription factors by binding association with sorted expression profiles. BMC Bioinformatics. 2007 Nov 16. PMID: 18021409
- Cheng C, Ma X, Yan X, Sun F, Li LM: MARD: a new method to detect differential gene expression in treatment-control time courses. Bioinformatics. 2006 Nov 1; 2006 Aug 23. PMID: 16928738
- Yan X, Deng M, Fung WK, Qian M: Detecting differentially expressed genes by relative entropy. J Theor Biol. 2005 Jun 7; 2005 Jan 24. PMID: 15784273