Yale Only

YSPH Biostatistics Seminar: “Flusion: Integrating multiple data sources for accurate influenza predictions”

NOTE: BIS 525 students are required to attend in person. Others are invited to attend in person, but may also attend via Zoom.

SPEAKER: Evan L. Ray, PhD, Research Assistant Professor in Biostatistics, University of Massachusetts, Amherst

TITLE: “Flusion: Integrating multiple data sources for accurate influenza predictions”

ABSTRACT: Over the last ten years, the US Centers for Disease Control and Prevention (CDC) has organized an annual influenza forecasting challenge with the motivation that accurate probabilistic forecasts could improve situational awareness and yield more effective public health actions. Starting with the 2021/22 influenza season, the forecasting targets for this challenge have been based on hospital admissions reported in the CDC’s National Healthcare Safety Network (NHSN) surveillance system. Reporting of hospital admissions through NHSN began during the COVID pandemic, and as such NHSN has only a limited amount of historical data about influenza hospitalizations. To produce forecasts in the presence of limited data for the target surveillance system, we augmented these data with two signals that have a longer historical record: 1) ILI+, which estimates the proportion of outpatient doctor visits where the patient has influenza; and 2) rates of laboratory-confirmed influenza hospitalizations at a set of sentinel healthcare facilities. Our model, Flusion, is an ensemble model that combines a Bayesian autoregressive model with two machine learning models using gradient boosting for quantile regression based on different feature sets. In each week of the influenza season, these models produced quantiles of a predictive distribution for influenza hospital admissions in each state in the current week and the following three weeks; the ensemble prediction was computed by averaging these predictions. Flusion emerged as the top-performing model in the CDC's influenza prediction challenge for the 2023/24 season. In this talk, we investigate the factors contributing to its success and show that joint training on data from multiple locations and data sources was critical. These results indicate a path forward for modeling new data streams that may replace long-standing surveillance data systems as a part of public health data modernization initiatives.
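The ensemble step described in the abstract, averaging the component models' predictive quantiles, can be illustrated with a minimal sketch. This is not the speaker's implementation: it assumes an equal-weight mean taken at each quantile level, and the quantile levels, the function name `ensemble_quantiles`, and the numbers are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical quantile levels; the levels used in the actual challenge are not given here.
QUANTILE_LEVELS = np.array([0.025, 0.25, 0.5, 0.75, 0.975])


def ensemble_quantiles(component_quantiles):
    """Average predicted quantiles across models, level by level.

    component_quantiles: array-like of shape (n_models, n_quantile_levels),
    one row of predicted hospital-admission quantiles per component model.
    Assumes equal weights for all components.
    """
    stacked = np.asarray(component_quantiles, dtype=float)
    if stacked.ndim != 2:
        raise ValueError("expected a 2-D array: (n_models, n_quantile_levels)")
    return stacked.mean(axis=0)


# Made-up predictions from three component models for one state and one week.
component_predictions = [
    [10.0, 20.0, 30.0, 40.0, 60.0],  # e.g. an autoregressive model
    [12.0, 22.0, 28.0, 44.0, 66.0],  # e.g. gradient boosting, feature set A
    [8.0, 18.0, 32.0, 36.0, 54.0],   # e.g. gradient boosting, feature set B
]

ensemble = ensemble_quantiles(component_predictions)
for level, value in zip(QUANTILE_LEVELS, ensemble):
    print(f"q{level:.3f}: {value:.1f}")
```

Because each component's quantiles are non-decreasing across levels, their level-wise mean is non-decreasing as well, so the averaged values still form a valid set of quantiles.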

YSPH values inclusion and access for all participants. If you have questions about accessibility or would like to request an accommodation, please contact Charmila Fernandes at Charmila.fernandes@yale.edu. We will try to provide accommodations requested by November 7, 2024.



Speaker

  • University of Massachusetts, Amherst

    Evan L. Ray, PhD
    Research Assistant Professor in Biostatistics


Admission

Free

Tag

Lectures and Seminars
Tuesday, Nov 12, 2024