Bioinformatic Tools for Cancer Genetics and Epidemiology

Whole-exome sequencing has created tremendous potential for revealing the genetic basis and underlying molecular mechanisms of many forms of cancer. However, somatic mutations occur at substantial frequency within tumors of most cancer types, and identifying which mutations lie on the causative trajectory from normal tissue to cancerous tissue is challenging. We are making algorithmic advances in maximum likelihood inference of model-averaged clustering in discrete linear sequences, and using them to detect clusters of somatic amino acid replacement mutations within mutated genes. We are also applying evolutionary theory to the repeated evolution of cancer in whole-exome sequence data sets to reveal the level of clonal natural selection for cancer drivers.
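
To make the idea of model-averaged clustering along a linear sequence concrete, the following is a minimal sketch and not the algorithm under development. It assumes a deliberately simplified model family: a uniform null model plus every model with a single contiguous cluster, each fitted by maximum likelihood and combined with Akaike weights. The function name `model_averaged_site_probs` and the example counts are hypothetical.

```python
import math


def _loglik_term(k, total, length):
    """Log-likelihood of k mutations spread uniformly over `length` sites."""
    return k * math.log(k / (total * length)) if k > 0 else 0.0


def model_averaged_site_probs(counts):
    """Model-averaged per-site mutation probability (illustrative assumption).

    counts[i] is the number of observed replacement mutations at site i.
    Candidate models: a uniform null model, plus every single contiguous
    cluster [start, end) with its own maximum likelihood rate.
    """
    n, total = len(counts), sum(counts)
    if total == 0:
        raise ValueError("need at least one observed mutation")
    prefix = [0]
    for c in counts:
        prefix.append(prefix[-1] + c)

    # Each model: (log-likelihood, free parameters, per-site probabilities).
    models = [(_loglik_term(total, total, n), 0, [1.0 / n] * n)]
    for start in range(n):
        for end in range(start + 1, n + 1):
            len_in = end - start
            if len_in == n:
                continue  # identical to the null model
            len_out = n - len_in
            k_in = prefix[end] - prefix[start]
            k_out = total - k_in
            ll = _loglik_term(k_in, total, len_in) + _loglik_term(k_out, total, len_out)
            p_in = k_in / (total * len_in)
            p_out = k_out / (total * len_out)
            probs = [p_in if start <= i < end else p_out for i in range(n)]
            models.append((ll, 3, probs))  # two boundaries plus one rate

    # Akaike weights, shifted by the best AIC for numerical stability.
    aics = [2 * k - 2 * ll for ll, k, _ in models]
    best = min(aics)
    raw = [math.exp(-0.5 * (a - best)) for a in aics]
    norm = sum(raw)

    averaged = [0.0] * n
    for w, (_, _, probs) in zip(raw, models):
        for i, p in enumerate(probs):
            averaged[i] += (w / norm) * p
    return averaged


# Hypothetical mutation counts along a 12-residue stretch of a gene.
example_counts = [0, 0, 1, 0, 6, 7, 5, 0, 0, 1, 0, 0]
site_probs = model_averaged_site_probs(example_counts)
```

Averaging over all candidate clusters, rather than reporting only the single best-fitting one, lets uncertainty about cluster boundaries carry through to the per-site estimates. The exhaustive enumeration here is cubic in sequence length and is meant only for short illustrative inputs.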

Biostatistical Analysis for Nonlinear Mathematical Models of the Epidemiology of Disease

I am developing probabilistic statistical methodologies for the mathematical modeling of disease emergence and spread. For diverse reasons, data for estimating epidemiological parameters are often sparse. Evaluating a model at the “best point estimate” derived from sparse data may convey a misleading certitude to policy makers who base decisions on deterministic models of disease outbreak, spread, and persistence. Conversely, policy makers who are aware that models are parameterized with limited data may be dismissive of deterministic predictions that nonetheless have significant validity. I address these issues through probabilistic sensitivity analysis of parameters and full uncertainty analysis of outcomes of interest.
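
As an illustration of the contrast between a point estimate and a full uncertainty analysis, the sketch below propagates parameter uncertainty through a deterministic SIR model by Monte Carlo sampling. Everything specific in it is an assumption made for illustration: the SIR structure, the gamma distributions and their means (0.4 and 0.2 per day), the population size, and the function names `sir_attack_rate` and `uncertainty_analysis`. Spearman rank correlation is used here as a simple stand-in for a sensitivity measure.

```python
import random
import statistics


def sir_attack_rate(beta, gamma, n=10_000, i0=10, dt=0.25, days=300):
    """Forward-Euler SIR model; returns the cumulative fraction ever infected."""
    s, i = float(n - i0), float(i0)
    for _ in range(int(days / dt)):
        new_inf = beta * s * i / n * dt
        new_rec = gamma * i * dt
        s -= new_inf
        i += new_inf - new_rec
    return (n - s) / n


def _ranks(xs):
    """Rank of each value (0 = smallest); ties are not expected for floats."""
    order = sorted(range(len(xs)), key=xs.__getitem__)
    ranks = [0] * len(xs)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = _ranks(xs), _ranks(ys)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def uncertainty_analysis(n_draws=500, seed=1):
    """Sample parameters, run the model per draw, and summarize the outcome."""
    rng = random.Random(seed)
    # Assumed distributions: gamma with shape 16, so the coefficient of
    # variation is 0.25 around the illustrative means.
    betas = [rng.gammavariate(16, 0.4 / 16) for _ in range(n_draws)]
    gammas = [rng.gammavariate(16, 0.2 / 16) for _ in range(n_draws)]
    outcomes = [sir_attack_rate(b, g) for b, g in zip(betas, gammas)]
    ordered = sorted(outcomes)
    return {
        "point_estimate": sir_attack_rate(0.4, 0.2),
        "median": statistics.median(outcomes),
        "interval_95": (ordered[int(0.025 * n_draws)], ordered[int(0.975 * n_draws) - 1]),
        "sensitivity": {
            "beta": spearman(betas, outcomes),
            "gamma": spearman(gammas, outcomes),
        },
    }


summary = uncertainty_analysis()
```

The single value in `point_estimate` is what a deterministic run at the mean parameters would report, whereas `interval_95` shows the spread of outcomes consistent with the assumed parameter uncertainty, and the `sensitivity` entries indicate which parameter drives that spread.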