Jeffrey Townsend, PhD

he/him/his

Elihu Professor of Biostatistics and Professor of Ecology and Evolutionary Biology

Additional Titles

Co-Leader, Genomics, Genetics, & Epigenetics Research Program

Contact Info

jeffrey.townsend@yale.edu

203.737.7042

Biostatistics

135 College St, Room 222

New Haven, CT 06510-2483

United States

About

Biography

Professor Townsend received his Ph.D. in organismic and evolutionary biology from Harvard University in 2002, advised by Daniel Hartl. His dissertation, entitled "Population genetic variation in genome-wide gene expression: modeling, measurement, and analysis", constituted the first population genetic analysis of genome-wide gene expression variation. After making use of the model budding yeast S. cerevisiae for his Ph.D. research, Dr. Townsend accepted an appointment as a Miller Fellow at the University of California-Berkeley in the Department of Plant and Microbial Biology, where he worked to develop molecular tools, techniques, and analysis methodologies for functional genomics studies with the filamentous fungal model species Neurospora crassa, co-advised by Berkeley fungal evolutionary biologist John Taylor and molecular mycologist Louise Glass.

In 2004, he accepted his first appointment as an Assistant Professor in the Department of Molecular and Cell Biology at the University of Connecticut. In 2006 he was appointed as an Assistant Professor in the Department of Ecology and Evolutionary Biology at Yale University. In 2013 he began to work on statistical approaches to fit mathematical models of disease spread and emergence, and to work on the somatic evolution of cancer, and was appointed as an Associate Professor of Biostatistics and Ecology & Evolutionary Biology. In 2017 he was named Elihu Associate Professor of Biostatistics and Ecology & Evolutionary Biology, and in 2018 he was appointed Elihu Professor of Biostatistics and Ecology & Evolutionary Biology.

In 2019 he was appointed a member of the Connecticut Academy of Science and Engineering, in recognition of the development of innovative approaches to population biology, including the evolution of antimicrobial resistance, disease evolution and transmission, and evolution of tumorigenesis; and research that has enabled curtailment of pathogen evolution, outbreak mitigation, and informed therapeutic approaches to cancer metastasis and evolution of therapeutic resistance in cancer. In 2021 he was selected as the Co-Chair-Elect of the Cancer Evolution Working Group of the American Association for Cancer Research. In 2022 he was appointed Co-Director of the Genetics, Genomics, and Epigenetics Program of the Yale Cancer Center. In 2023 he was elevated to Co-Chair of the Cancer Evolution Working Group of the American Association for Cancer Research.

Appointments

Biostatistics: Professor (Primary)
Department of Ecology & Evolutionary Biology: Professor (Secondary)

Other Departments & Organizations

Biostatistics
Center for Biomedical Data Science
Climate Change and Health
Computational Biology and Biomedical Informatics
Computational Biology and Bioinformatics
Department of Ecology & Evolutionary Biology
Genomics, Genetics, and Epigenetics
Microbiology
Public Health Modeling
Townsend Lab
Yale Cancer Center
Yale Center for Immuno-Oncology
Yale Combined Program in the Biological and Biomedical Sciences (BBS)
Yale School of Public Health

Education & Training

Miller Postdoctoral Fellow: University of California, Berkeley (2004)

PhD: Harvard University, Organismic & Evolutionary Biology (2002)

ScB: Brown University, Biology (1994)

Research

Overview

1. TOOLS FOR CANCER GENETICS AND EPIDEMIOLOGY

Whole-exome sequencing has created tremendous potential for revealing the genetic basis and underlying molecular mechanisms of many forms of cancer. However, somatic mutations occur at a significant frequency within tumors of most cancer types, and identification of the mutations that are on the causative trajectory from normal tissue to cancerous tissue is challenging. We are making algorithmic advances in clustering across discrete linear sequences to enact two powerful approaches to this identification. First, we are applying maximum likelihood approaches that we have developed for model-averaged clustering in discrete linear sequences to somatic amino acid replacement mutations appearing within mutated genes. Because amino acids of proteins that are functionally important are locally clustered in domains, mutations in multiple tumors that are functionally important to the development of cancer cluster in the linear sequence of relevant genes, allowing inference of relevance and function even in cases without three-dimensional protein structure. These clustering analyses have the power to demonstrate, for instance, cross-cancer consistency in the functional importance of the DNA binding domain of tumor suppressor p53, whether in a cancer with extensive exome data (ovarian serous adenocarcinoma) or in a cancer with much less extensive exome data (e.g. rectal adenocarcinoma).

Second, we are applying evolutionary theory to the problem of identifying the genetic architecture underlying cancer development. The path from normal to cancerous tissue is navigated by an evolutionary process. Tools from evolutionary theory have the potential to distinguish those mutations that are selected within cells on the path to cancer from those mutations that arise incidentally during the somatic evolution of cancer. The theory we are applying makes use of differences in expectation for synonymous and replacement mutations. Synonymous mutations are expected to have no functional impact; thus they yield a proxy expectation for the “incidental” mutations, whereas carcinogenic replacement mutations will spread within tumors more frequently and are clustered within gene sequence. Our theory also employs human population polymorphism data, which are widely regarded among evolutionary biologists as largely neutral. These data facilitate calibration of the probable impact of replacement changes against sequence conservation by eliminating the confounding variable of the degree of purifying selection, which decreases the number of mutations observed in some genes and allows others to accumulate many mutations with little impact.
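As a rough illustration of how synonymous mutations can serve as a proxy for incidental mutation, the sketch below computes a simple ratio of replacement to synonymous mutation densities for one gene. It is a generic dN/dS-style calculation, not the lab's published method; the function name, the mutation counts, and the site totals are hypothetical values chosen only for illustration.

```python
def replacement_to_synonymous_ratio(
    n_replacement: int,
    n_synonymous: int,
    replacement_sites: float,
    synonymous_sites: float,
) -> float:
    """Ratio of per-site replacement density to per-site synonymous density.

    Values near 1 are what incidental (neutral) mutation would predict;
    values well above 1 suggest replacement mutations were favored in tumors.
    """
    if replacement_sites <= 0 or synonymous_sites <= 0:
        raise ValueError("site counts must be positive")
    if n_synonymous == 0:
        raise ValueError("need at least one synonymous mutation as a baseline")
    replacement_density = n_replacement / replacement_sites
    synonymous_density = n_synonymous / synonymous_sites
    return replacement_density / synonymous_density


# Hypothetical gene with 900 replacement sites and 300 synonymous sites.
ratio = replacement_to_synonymous_ratio(60, 5, 900.0, 300.0)
print(f"replacement/synonymous density ratio: {ratio:.1f}")
```

A ratio computed this way does not correct for gene-specific purifying selection, which is the role the polymorphism data play in the approach described above.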

We are extending this approach to estimating selection intensity on mutations along the trajectory toward cancer, revealing the level of selection within tumors for replacement mutations compared to synonymous mutations. This evolutionary analysis is ideal for detecting the history of selection on sites within genes during the evolution of cancer from exome sequencing data. These sites, particularly when representing gain-of-function mutations, will help identify candidate loci for pharmacological intervention and inform the design of “personal genomics” drugs appropriate for the genetics of individual cancers in individual patients. As a component of that project, we are constructing an “active-experiment” cancer exome database to facilitate further bioinformatics investigation of cancer exome data.

2. BIOSTATISTICAL ANALYSIS FOR NONLINEAR MATHEMATICAL MODELS OF THE EPIDEMIOLOGY OF DISEASE

I am developing probabilistic statistical methodologies for the mathematical modeling of disease emergence and spread. Robustness of models has usually been assessed by techniques that explore the relative impact and importance of parameters upon the mathematical behavior of the function and the mathematical predictions of the model. For diverse reasons including the difficulty or cost of acquisition, restrictions due to privacy, and urgency of analysis in the case of outbreaks, data for estimation of epidemiological parameters is often sparse. Evaluating a model with the “best point estimate” of sparse data may convey a misleading certitude to policy makers basing decisions on deterministic models of disease outbreak, spread, and persistence. Conversely, policy makers who are aware that models are parameterized with limited data may be dismissive of deterministic predictions that yet have significant validity. These issues may be most straightforwardly addressed by probabilistic sensitivity analysis of parameters and full uncertainty analysis of outcomes of interest. These analyses amount to accommodating the uncertainty of parameters directly into an analysis by probabilistically resampling data or likely distributions of parameters to calculate a probabilistic distribution of outcomes.
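A minimal sketch of the resampling idea, assuming a standard SIR relation in which R₀ equals the transmission rate times the mean infectious period: sparse observations of the infectious period are bootstrapped to give a distribution of R₀ instead of a single point estimate. The data values, the transmission rate `beta`, and the function name are hypothetical.

```python
import random
from typing import List, Tuple


def bootstrap_r0(
    infectious_days: List[float],
    beta: float,
    n_resamples: int = 2000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, 2.5th percentile, 97.5th percentile) of R0.

    R0 is taken as beta times the mean infectious period, i.e. the SIR
    relation R0 = beta / gamma with gamma = 1 / (mean infectious period).
    """
    rng = random.Random(seed)
    n = len(infectious_days)
    point = beta * sum(infectious_days) / n
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(infectious_days, k=n)  # sample with replacement
        estimates.append(beta * sum(resample) / n)
    estimates.sort()
    low = estimates[int(0.025 * n_resamples)]
    high = estimates[min(n_resamples - 1, int(0.975 * n_resamples))]
    return point, low, high


# Hypothetical sparse data: six observed infectious periods, in days.
observed = [3.0, 4.0, 4.0, 5.0, 6.0, 8.0]
point, low, high = bootstrap_r0(observed, beta=0.4)
print(f"R0 point estimate {point:.2f}, 95% bootstrap interval {low:.2f} to {high:.2f}")
```

Reporting the interval alongside the point estimate is what distinguishes this from evaluating the model at the "best point estimate" alone.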

For instance, one of the most common modeling approaches for evaluating interventions is based on differential equation models of disease such as the standard Susceptible-Infected-Recovered (SIR) model. In the SIR model and other more complex constructions, a closed-form solution can often be calculated for the basic reproductive number, R₀, the average number of secondary infections that would follow upon a primary infection in a naïve host population. In a population where there is preexisting immunity due to either vaccination or previous infection, the effective reproductive number, R_e, is defined as the average number of secondary infections following a primary infection in a population that is not completely naïve.
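For the standard SIR model these quantities have simple closed forms, sketched below under the usual textbook assumptions: R₀ is the transmission rate divided by the recovery rate, and R_e is R₀ scaled by the fraction of the population still susceptible. The parameter values are hypothetical, and the vaccination threshold assumes a perfectly effective vaccine.

```python
def basic_reproduction_number(beta: float, gamma: float) -> float:
    """R0 for the standard SIR model: transmission rate over recovery rate."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return beta / gamma


def effective_reproduction_number(r0: float, susceptible_fraction: float) -> float:
    """R_e when only a fraction of the population remains susceptible."""
    if not 0.0 <= susceptible_fraction <= 1.0:
        raise ValueError("susceptible_fraction must lie in [0, 1]")
    return r0 * susceptible_fraction


def critical_vaccination_coverage(r0: float) -> float:
    """Coverage at which R_e reaches 1, assuming a perfectly effective vaccine."""
    return max(0.0, 1.0 - 1.0 / r0)


# Hypothetical rates per day.
r0 = basic_reproduction_number(beta=0.5, gamma=0.25)
r_e = effective_reproduction_number(r0, susceptible_fraction=0.4)
coverage = critical_vaccination_coverage(r0)
print(f"R0 = {r0:.2f}, R_e = {r_e:.2f}, critical coverage = {coverage:.2f}")
```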

R_e is of particular interest in public health because interventions that bring its value below 1 are predicted to eradicate the disease. This deterministic threshold of R_e is proposed as the basis for policy decisions regarding the level of interventions that should be implemented. However, the best estimates for the parameters that are needed for the closed-form solution of R_e are inevitably inexact. To address this point, sensitivity analyses are frequently performed to evaluate models and explore the relationship between model parameters and outcomes. In such deterministic sensitivity analyses, one or more parameters are perturbed and the corresponding effects on outcomes are examined. The perturbation can be done either by evaluating the effect of arbitrarily small changes in parameter values (e.g. ± 1%) or by evaluating the effects across a range of values defined by plausible probability density functions. Because the values of other parameters are held fixed at best point estimates, these strategies do not account for interaction effects in non-linear dynamic models, and do not assess global uncertainty in outcome. Uncertainty analysis has been recommended for many fields of mathematical modeling, including medical decision making, as an optimal approach to presenting models. In the case of dynamic transmission modeling, however, authoritative best practices have not included uncertainty analyses. Modeling guidelines recommend probabilistic sensitivity analysis, in which both global parameter uncertainty and output uncertainty are addressed, as the best practice method for uncertainty analysis. Yet that ideal has not been extended to dynamic transmission models, for which its implementation has been challenging.
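The deterministic, one-at-a-time sensitivity analysis described above can be sketched as follows: each parameter is perturbed by ± 1% while the others stay at their point estimates. The toy outcome function (an SIR effective reproductive number reduced by vaccination coverage) and all parameter values are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

Params = Dict[str, float]


def effective_r(params: Params) -> float:
    """Toy outcome: SIR R0 reduced by the vaccinated fraction of the population."""
    return params["beta"] / params["gamma"] * (1.0 - params["coverage"])


def one_at_a_time(
    model: Callable[[Params], float],
    baseline: Params,
    rel_change: float = 0.01,
) -> Dict[str, Tuple[float, float]]:
    """Change in the outcome when each parameter alone moves down or up by rel_change."""
    base_out = model(baseline)
    effects = {}
    for name, value in baseline.items():
        low = dict(baseline, **{name: value * (1.0 - rel_change)})
        high = dict(baseline, **{name: value * (1.0 + rel_change)})
        effects[name] = (model(low) - base_out, model(high) - base_out)
    return effects


# Hypothetical best point estimates.
baseline = {"beta": 0.5, "gamma": 0.25, "coverage": 0.4}
effects = one_at_a_time(effective_r, baseline)
for name, (down, up) in effects.items():
    print(f"{name}: -1% changes R_e by {down:+.4f}, +1% changes R_e by {up:+.4f}")
```

Because every other parameter is pinned at its baseline, nothing in this output reflects interactions between parameters or the overall uncertainty in the outcome.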

We are developing methods for global probabilistic sensitivity analysis that allow the contribution of each parameter to model outcomes to be investigated while also taking into account the uncertainty of other model parameters. Uncertainty in parameter values can be accounted for by sampling randomly from empirical data or from probability density functions fit to empirical data. Depending on the instance, such sampling techniques include bootstrapping, Monte Carlo sampling, and Latin hypercube sampling. The model output generated from parameter samples can then be analyzed using linear (e.g. partial correlation coefficients), monotonic (e.g. partial rank correlation coefficients), and non-monotonic (e.g. sensitivity index) statistical tests to determine the contribution of each parameter to the variation in output values. Indeed, for a global sensitivity analysis to yield probabilities associated with outcomes that are of greatest utility to policy makers, probabilistic analyses of parameter uncertainty must be carried through to the model outcomes. For example, the probability of eradication of an epidemic is sensitive to both levels of vaccination and treatment. Moreover, a policy based on the analysis of data should take into consideration not only the best estimate of necessary action, but also the uncertainty around that outcome estimate. Deterministic policy advice, which indicates an exact boundary of treatment and vaccination levels that should halt an influenza epidemic, is very different from, and can be misleading compared to, a probabilistic statement, which gives a policymaker a predictive probability that a particular policy of treatment and vaccination will halt an influenza epidemic. Similar approaches applied with a next-generation matrix to rabies vaccination in Tanzania demonstrated that the WHO goal of 70% vaccination coverage of dogs in two districts would very probably suffice to control rabies, if only the process to achieve that practicable goal could be mustered.
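The sketch below illustrates carrying parameter uncertainty through to a policy-relevant probability. It draws a Latin hypercube sample of two uncertain SIR parameters and reports, for each vaccination coverage, the fraction of samples in which R_e falls below 1. The uniform parameter ranges, the toy model, and the function names are hypothetical; this is not the model used in the influenza or rabies analyses.

```python
import random
from typing import List


def latin_hypercube(n_samples: int, n_dims: int, rng: random.Random) -> List[List[float]]:
    """Points in [0, 1)^n_dims with exactly one point per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of n_samples equal-width strata.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(strata)  # decouple the strata across dimensions
        columns.append(strata)
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]


def probability_of_control(coverage: float, n_samples: int = 2000, seed: int = 1) -> float:
    """Fraction of sampled parameter sets for which R_e < 1 at this coverage."""
    rng = random.Random(seed)
    controlled = 0
    for u_beta, u_gamma in latin_hypercube(n_samples, 2, rng):
        beta = 0.3 + u_beta * (0.6 - 0.3)    # hypothetical range for transmission rate
        gamma = 0.2 + u_gamma * (0.3 - 0.2)  # hypothetical range for recovery rate
        r_e = beta / gamma * (1.0 - coverage)
        if r_e < 1.0:
            controlled += 1
    return controlled / n_samples


for coverage in (0.0, 0.3, 0.5, 0.7):
    p = probability_of_control(coverage)
    print(f"coverage {coverage:.0%}: probability that R_e < 1 is {p:.2f}")
```

Each row of output is the kind of statement a decision maker can weigh against cost: a probability of control attached to a specific level of intervention.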

A public health decision maker would find most useful the assignment of the probability of eradication to each level of treatment, so that they may precisely weigh the cost of intervention against the potential for failure. These probabilistic outcome distributions also feed forward extremely fluidly with cost-effectiveness estimation, a field which has embraced uncertainty analysis but which has until our recent work not incorporated uncertainty from nonlinear infectious disease models into calculations.

We have many projects ongoing in the lab, covering topics summarized below, including many we have already published on and many that we have not. In particular, we have a lot of projects on the somatic evolution of cancer that are not yet in publications.

Medical Research Interests

Algorithms; Bacteria; Bacterial Infections and Mycoses; Beer; Biological Evolution; Bread; Cell Transformation, Neoplastic; Coccidioidomycosis; Computing Methodologies; Crops, Agricultural; Evolution, Molecular; Fungi; Gene Expression Profiling; Gene Transfer Techniques; Genetic Engineering; Genetic Phenomena; Genetic Speciation; Host-Pathogen Interactions; Likelihood Functions; Logistic Models; Mathematical Concepts; Microarray Analysis; Microbiological Phenomena; Models, Genetic; Models, Statistical; Models, Theoretical; Molecular Epidemiology; Mycoses; Nature; Neoplasm Metastasis; Neoplasms; Nonlinear Dynamics; Organisms; Phenomena and Processes; Phylogeny; Polymerase Chain Reaction; Public Health Informatics; Sequence Analysis, DNA; Sequence Analysis, Protein; Viruses; Wine

Public Health Interests

Antimicrobial Resistance; Bioinformatics; Cancer; COVID-19; Evolution; Genetics, Genomics, Epigenetics; Hepatitis; HIV/AIDS; Infectious Diseases; Health Policy; Influenza; Metabolism; Microbial Ecology; Modeling; Pollution; Tick-borne Diseases; Vaccines; Zoonotic Diseases

ORCID
0000-0002-9890-3907

Academic Achievements & Community Involvement

Journal Service, Associate Editor: Molecular Biology and Evolution
Journal Service, Associate Editor: Human Genomics
Journal Service, Associate Editor: Frontiers in Ecology & Evolution
Journal Service, Associate Editor: BMC Evolutionary Biology
Journal Service, Associate Editor: Cancer Prevention Research

News & Links

Media

Spotted DNA microarray. Red spots represent genes abundantly expressed in wine yeast growing in a high concentration of copper sulfate. Green spots represent genes expressed abundantly in wine yeast growing at a low concentration of copper sulfate. Copper sulfate is often applied in vineyards to control growth of fungi.
Information on the Tree of Life. Recent efforts to reveal the evolutionary history of life on earth have increasingly relied on the sequencing of DNA from multiple species for multiple genes. This figure demonstrates a principle that should guide these efforts: to understand deep divergences, sample taxa that diverge deeply first. a) and b) Curves depict the cumulative support for the bold deep internode of four species (the fungi Yarrowia lipolytica, Saccharomyces cerevisiae, Coccidioides immitis, and Neurospora crassa), ranging from zero to complete sampling for several sampling schemes: the outcome based on perfect and worst-possible performance (dashed); outcome based on prioritizing sampling based on a novel theoretical prediction using rate of evolution of the sequences (solid); outcome based on prioritizing sampling of all genes for the deepest ingroup (dash-dotted); expectation for haphazard sampling (dotted). c) The established chronogram, or time tree, of the evolution of these species. Vertical bars in the plots correspond to switches from sampling characters from deeper-branching to sampling characters from shallower-branching taxa; note that the slope of the increase in cumulative information (red and green curves) declines as sequences are sampled from more recently diverged lineages in the tree, and that this pattern of high utility to sampling the deepest lineages is revealed for both the clade in panel a and the clade in panel b.
Population genetic modeling of horizontal gene transfer (HGT) suggests that several key quantities are important to designing any sampling-based assay of HGT in large populations. The HGT rate r and the exposed fraction X play significant but ultimately minor roles in the population dynamics, most likely impacting only the number of original opportunities for horizontal spread of genetic material. The malthusian selection coefficient m of the transferred genetic material and the time in recipient generations t from exposure play key, non-linear roles in determining the potential for detection of HGT. Sample size n is important, but frequently the practical sample sizes to be obtained are many orders of magnitude below the extant population size. It is therefore essential to wait until natural selection has had time to operate to have any chance of effectively detecting horizontal gene transfer events.
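The non-linear roles of m and t can be illustrated with a deliberately simplified sketch: it assumes deterministic logistic growth of the transferred material under a malthusian selection coefficient m, starting from a hypothetical initial frequency `p0` (which in practice would depend on r and X), and computes the chance that a sample of size n contains at least one carrier. This is an assumption-laden illustration, not necessarily the model behind the figure.

```python
import math


def transferred_frequency(p0: float, m: float, t: float) -> float:
    """Frequency after t generations of logistic growth at malthusian rate m."""
    if not 0.0 < p0 <= 1.0:
        raise ValueError("p0 must lie in (0, 1]")
    # Algebraically equal to p0*e^(mt) / (1 - p0 + p0*e^(mt)), but for m >= 0
    # this form cannot overflow at large t.
    return 1.0 / (1.0 + (1.0 - p0) / p0 * math.exp(-m * t))


def detection_probability(frequency: float, n: int) -> float:
    """Probability that at least one of n sampled individuals carries the material."""
    return 1.0 - (1.0 - frequency) ** n


# Hypothetical values: rare initial transfer, weak positive selection, 1000 samples.
p0, m, n = 1e-9, 0.01, 1000
for t in (0, 1000, 2000):
    freq = transferred_frequency(p0, m, t)
    print(f"t = {t}: frequency {freq:.3g}, detection probability {detection_probability(freq, n):.3g}")
```

With these illustrative numbers, detection is essentially impossible at t = 0 and becomes likely only after selection has had many generations to act, which is the point the caption makes.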

Get In Touch

Contacts

jeffrey.townsend@yale.edu

Academic Office Number

203.737.7042

Lab Number

203.785.6800

Office Fax Number

203.432.5176

Mailing Address

Biostatistics

135 College St, Room 222

New Haven, CT 06510-2483

United States

Locations

135 College Street (Academic Office)
Floor 2, Suite 200, Room 222
New Haven, CT 06510
Appointments: 203.891.6348
General Information: 203.891.6348

60 College Street (Lab)
7th Floor
New Haven, CT 06510

Yale Co-Authors

Francesc Lopez-Giraldez, PhD

Zheng Wang, PhD

Abhishek Pandey, PhD

David Rimm, MD, PhD

Frederick Lewis Altice, MD, MA

Alison Galvani, PhD

News

Yale, UNC study looks at timing for COVID-19 booster. Here’s what they found.

One size doesn’t fit all: Best time for COVID-19 booster depends on where you live, infection history

Dr. Townsend on the Future Analysis of Prostate Cancer Driver Mutations Using Neoplasm Tissue Samples

Chemotherapy Before Surgery Benefits Some Patients With Pancreatic Cancer
