Data Sciences
Data science is crucial to biomedical imaging, enabling efficient handling, analysis, and interpretation of advanced imaging data. Sophisticated algorithms and machine learning enhance image acquisition, reconstruction, motion correction, diagnostic accuracy, and the detection of subtle abnormalities that may be missed by the human eye. Data science also integrates multimodal imaging data for comprehensive views of complex diseases.
YRT-PET
The Yale Reconstruction Toolkit for Positron Emission Tomography (YRT-PET) is an open-source, fast, and modular package for PET image reconstruction, available at https://github.com/YaleBIS/yrt-pet under the MIT license. YRT-PET implements the MLEM and OSEM reconstruction algorithms, supports GPU acceleration with CUDA, and includes corrections for scatter, attenuation, randoms, normalization, and motion. Used in various research projects, YRT-PET also provides a full Python interface for easy integration and prototyping of novel algorithms.
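To make the reconstruction step concrete, here is a minimal NumPy sketch of the multiplicative MLEM update that such toolkits implement (OSEM applies the same update over subsets of the data). The dense toy system matrix and function below are illustrative and do not reflect YRT-PET's actual API.

```python
import numpy as np

def mlem(A, y, n_iters=50, eps=1e-12):
    """Basic MLEM reconstruction for a linear forward model y ~ Poisson(A @ x).

    A : (n_bins, n_voxels) system matrix (dense toy matrix for clarity)
    y : (n_bins,) measured sinogram counts
    """
    sens = A.T @ np.ones(A.shape[0])        # sensitivity image, A^T 1
    x = np.ones(A.shape[1])                 # uniform initial image
    for _ in range(n_iters):
        proj = A @ x                        # forward projection
        ratio = y / np.maximum(proj, eps)   # measured / estimated counts
        x = x / np.maximum(sens, eps) * (A.T @ ratio)  # multiplicative update
    return x

# Tiny synthetic example
rng = np.random.default_rng(0)
A = rng.random((64, 16))
x_true = rng.random(16)
y = rng.poisson(A @ x_true).astype(float)
x_hat = mlem(A, y)
```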
Investigator: Thibault Marin
PUDA for intubation prediction in lung disease
Four examples of test CXRs and corresponding attention maps from our PUDA model.
Data-driven approaches excel in medical image analysis, but fully supervised methods require large amounts of labeled data and struggle with new data because of domain shifts. Unsupervised domain adaptation (UDA) can address these issues. We introduce a prior knowledge-guided, transformer-based UDA (PUDA) pipeline that uses anatomical and spatial prior information to regularize vision transformer attention heads and align data distributions across domains, while assigning local weights through adversarial training. Evaluated on CT and chest X-ray data for predicting intubation status, with abnormal lesions serving as the anatomical and spatial priors, PUDA proves effective across extensive experiments.
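As a generic illustration of the adversarial-alignment idea (not PUDA's actual architecture or its prior-guided attention regularization), the sketch below shows a standard gradient-reversal domain classifier that encourages domain-invariant features; all module sizes, names, and tensors are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses and scales gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Toy modules standing in for the real feature extractor and heads
features = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128), nn.ReLU())
task_head = nn.Linear(128, 2)    # e.g., intubation vs. no intubation
domain_head = nn.Linear(128, 2)  # source vs. target domain

def adversarial_step(x_src, y_src, x_tgt, lam=0.1):
    """Supervised loss on labeled source data plus a domain-confusion loss on both domains."""
    f_src, f_tgt = features(x_src), features(x_tgt)
    task_loss = F.cross_entropy(task_head(f_src), y_src)
    feats = torch.cat([f_src, f_tgt])
    dom_labels = torch.cat([torch.zeros(len(f_src)), torch.ones(len(f_tgt))]).long()
    domain_loss = F.cross_entropy(domain_head(GradReverse.apply(feats, lam)), dom_labels)
    return task_loss + domain_loss

# Example call with random tensors in place of CXR/CT-derived inputs
loss = adversarial_step(torch.randn(4, 1, 64, 64), torch.randint(0, 2, (4,)),
                        torch.randn(4, 1, 64, 64))
loss.backward()
```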
Investigator: Jim Duncan
Automated Heart Geometry Modeling for TAVR Simulations Using Deep Learning
Automated volumetric meshing of patient-specific heart geometry accelerates biomechanics studies such as post-intervention stress estimation. Previous methods often neglect key modeling characteristics, especially for thin structures like valve leaflets. We introduce DeepCarve, a deep learning method that generates high-accuracy patient-specific volumetric meshes using minimal surface mesh labels while optimizing deformation energies. Mesh generation takes 0.13 seconds per scan, and the output is ready for finite element analysis without manual post-processing. Calcification meshes can also be included to improve simulation accuracy. Stent deployment simulations validate DeepCarve's effectiveness for large-scale analyses.
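A minimal sketch of the kind of objective such a method might balance, assuming a template volumetric mesh deformed toward sparse surface labels: the tensors, the simple Laplacian-based smoothness term, and the weighting below are illustrative stand-ins, not DeepCarve's actual deformation energies.

```python
import torch

def meshing_loss(pred_surface, label_surface, displacement, laplacian, w_energy=0.1):
    """Illustrative loss: fit sparse surface labels while penalizing a deformation energy.

    pred_surface, label_surface : (N, 3) predicted vs. labeled surface vertex positions
    displacement                : (M, 3) displacements applied to a template volumetric mesh
    laplacian                   : (M, M) dense mesh Laplacian used as a simple quadratic energy
    """
    surface_fit = torch.mean((pred_surface - label_surface) ** 2)
    # Quadratic smoothness energy u^T L u, a stand-in for physically motivated energies
    energy = torch.einsum('md,mn,nd->', displacement, laplacian, displacement) / displacement.shape[0]
    return surface_fit + w_energy * energy
```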
Investigator: Jim Duncan
Brain Registration and Evaluation for Zebrafish (BREEZE)-mapping
Voxel-wise Z-score pERK/tERK values representing brain activity differences in zebrafish mutants of the autism-associated gene, SCN2A (scn1labΔ44/Δ44) versus background-matched wild-type larvae.
Through a collaboration between the Papademetris and Hoffman labs (www.hoffmanlab.net), we developed BREEZE-mapping, a pipeline for registering and analyzing whole-brain images in larval zebrafish, accessible via BioImage Suite Web (https://pmc.ncbi.nlm.nih.gov/articles/PMC10641303/pdf/main.pdf). This protocol leverages zebrafish traits such as tractability, transparency, and genetic manipulability via CRISPR/Cas9, enabling high-throughput analysis of neuropsychiatric disorder-associated genes. BREEZE-mapping enables analysis of whole-brain phenotypes in zebrafish mutants, serving as a valuable resource for the zebrafish neuroscience community.
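The voxel-wise comparison shown in the figure above can be illustrated with a short sketch, assuming activity maps (e.g., pERK/tERK ratios) already registered to a common brain template; the array shapes and function name are hypothetical.

```python
import numpy as np

def voxelwise_zscore(mutant_maps, wt_maps, eps=1e-6):
    """Voxel-wise Z-scores of mutant activity maps relative to background-matched wild-types.

    mutant_maps : (n_mutant, X, Y, Z) registered pERK/tERK maps for mutant larvae
    wt_maps     : (n_wt, X, Y, Z) registered pERK/tERK maps for wild-type larvae
    """
    wt_mean = wt_maps.mean(axis=0)
    wt_std = wt_maps.std(axis=0)
    return (mutant_maps.mean(axis=0) - wt_mean) / (wt_std + eps)
```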
Investigators: Xenios Papademetris, Ellen Hoffman
Image Analysis Drives Genetic Analysis
Unsupervised longitudinal analysis of Parkinson’s Disease Progression with SPECT
Left: Typical DaTscan of a PD patient. Note the asymmetric onset of disease. Right: Disease progression trajectories of PD patients.
New machine learning methods are being developed to understand and predict progression in Parkinson's Disease (PD). Currently, the Tagare lab focuses on understanding PD progression via DaTscan images (see figure), clinical scores, and smartwatch data, and on relating these measures to gene expression data.
Investigator: Hemant Tagare
Image reconstruction at the Angstrom Scale
Continuous flexibility analysis of SARS-CoV-2 spike prefusion structures
Principal component analysis of the SARS-CoV-2 spike structure.
The Tagare lab is developing new algorithms for understanding 3D protein structure at atomic resolution from cryogenic electron microscopy (cryo-EM). During the COVID-19 pandemic, the lab (in collaboration with researchers in Madrid, Spain, and at UT Austin) developed methods to understand the conformational changes in the prefusion state of the spike protein of the SARS-CoV-2 virus (see figure).
Using a new consensus-based image-processing approach and principal component analysis, the flexibility and conformational dynamics of the SARS-CoV-2 spike in the prefusion state were analyzed. The study revealed continuous motions involving the receptor-binding domain (RBD) and other subdomains around the 1-RBD-up state, modeled as elastic deformations. The data set showed no stable spike conformations, only a continuum of states. An ensemble map was obtained with minimal bias, and the extremes of the variance were modeled by flexible fitting. The results highlight image-processing classification instability, affecting data interpretation.
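As a generic illustration of the principal component analysis step (not the consensus-based processing pipeline itself), the sketch below extracts principal modes of variation from a set of aligned 3D maps; the array shapes and names are hypothetical.

```python
import numpy as np

def conformational_pca(volumes, n_components=3):
    """PCA over aligned 3D maps to extract principal modes of flexibility.

    volumes : (n_maps, X, Y, Z) aligned reconstructions or deformation fields
    Returns the mean map, the top eigen-volumes, and per-map coordinates along each mode.
    """
    X = volumes.reshape(volumes.shape[0], -1).astype(np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD-based PCA; rows of Vt are principal directions in voxel space
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components].reshape((n_components,) + volumes.shape[1:])
    coords = U[:, :n_components] * S[:n_components]
    return mean.reshape(volumes.shape[1:]), components, coords
```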
Investigator: Hemant Tagare
Multi-modal Imaging, Neuroinformatics, & Data Science (MINDS) Laboratory
Our research group comprises academics and clinicians spanning physics, psychiatry, neuroscience, and biology, with a shared focus on new functional magnetic resonance imaging (fMRI) methods encompassing functional connectivity, multiscale analysis, machine learning, statistical benchmarking, and software development. Active topics of investigation include individualized parcellation, connectome-based predictive modeling (CPM), methodological reproducibility, and web-based software development.
Investigator: Dustin Scheinost
Connectomics
The fMRI data used to measure brain connectivity are massive and also spatially and temporally complex. The lab aims to develop novel statistical and machine learning methods to address challenges in "big" neuroscience data. Key research areas include connectome-based predictive modeling, high-dimensional connectome imputation, multi-modality manifold learning, and statistical inference to enhance power and reliability.
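Connectome-based predictive modeling, one of the approaches listed above, can be sketched in simplified form: correlate each edge with the behavioral measure across training subjects, select significant positive and negative edges, summarize each subject by the selected edge strengths, and fit a linear model. The code below is a minimal single-model variant (published protocols typically fit separate positive- and negative-network models); the threshold and array layouts are illustrative.

```python
import numpy as np
from scipy import stats

def cpm_train(conn, behavior, p_thresh=0.01):
    """Minimal connectome-based predictive modeling (CPM) training step.

    conn     : (n_subjects, n_edges) vectorized connectivity matrices
    behavior : (n_subjects,) behavioral scores
    """
    r = np.zeros(conn.shape[1])
    p = np.ones(conn.shape[1])
    for e in range(conn.shape[1]):
        r[e], p[e] = stats.pearsonr(conn[:, e], behavior)
    pos_mask = (p < p_thresh) & (r > 0)
    neg_mask = (p < p_thresh) & (r < 0)
    # Single summary feature: positive-edge sum minus negative-edge sum
    summary = conn[:, pos_mask].sum(axis=1) - conn[:, neg_mask].sum(axis=1)
    slope, intercept = np.polyfit(summary, behavior, 1)
    return pos_mask, neg_mask, slope, intercept

def cpm_predict(conn, pos_mask, neg_mask, slope, intercept):
    """Apply the trained edge masks and linear model to held-out subjects."""
    summary = conn[:, pos_mask].sum(axis=1) - conn[:, neg_mask].sum(axis=1)
    return slope * summary + intercept
```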
See also our Brain section.
Early Life Imaging
Brain age gaps (BAGs) – differences between chronological age and brain age – were associated with maternal effects and toddler behaviors.
Our studies focus on the development of the brain’s functional organization in fetuses, neonates, and infants. The lab’s research in this area involves developing state-of-the-art tools and analytic pipelines to meet the challenges of imaging early brain development with advances in image acquisition, image analysis, and statistical analysis. The lab is one of a few in the country that use functional magnetic resonance imaging to examine fetal brain development.
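The brain age gap highlighted in the figure caption can be illustrated with a short sketch, assuming a regression model trained to predict age from imaging features; the ridge regressor and the feature/age arrays are hypothetical stand-ins, and sign conventions for the gap vary across studies.

```python
import numpy as np
from sklearn.linear_model import Ridge

def brain_age_gap(train_features, train_ages, test_features, test_ages):
    """Brain age gap (BAG): model-predicted brain age minus chronological age.

    Features could be vectorized connectomes or morphometric measures; a ridge
    regressor stands in for whatever brain-age model is actually used.
    """
    model = Ridge(alpha=1.0).fit(train_features, train_ages)
    predicted_age = model.predict(test_features)
    return predicted_age - test_ages

# Toy example with random features and ages (in years)
rng = np.random.default_rng(0)
bags = brain_age_gap(rng.random((50, 10)), rng.uniform(0.5, 3.0, 50),
                     rng.random((10, 10)), rng.uniform(0.5, 3.0, 10))
```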
BioImage Suite
BioImage Suite Web is an open-source, web-based set of medical image analysis tools.
It was developed with support from the NIH BRAIN Initiative under grant R24 MH114805 (Papademetris X. and Scheinost D., PIs).
Investigators: Xenios Papademetris, Dustin Scheinost
CWave software package
The CWave software was developed by Dr. Mason, starting in the 1990s, to facilitate the design and analysis of isotopic labeling studies with magnetic resonance spectroscopy (MRS). CWave models isotopic flow, including options that allow users to predict isotopomers and isotopologues (products containing various combinations of isotopes) in order to analyze data from high-resolution NMR and mass spectrometry. For academic use, CWave is available free of charge upon request.
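To illustrate the kind of kinetics such modeling involves (a generic two-pool example, not CWave's actual model or interface), the sketch below integrates simple labeling equations in which the enrichment of each pool relaxes toward its precursor at a rate set by flux divided by pool size; all parameter values are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

def label_flow(t, enrichments, flux, pool_sizes, precursor_enrichment=0.99):
    """Toy two-pool isotopic labeling model (precursor -> A -> B).

    d(E_i)/dt = flux / pool_size_i * (E_upstream - E_i)
    """
    e_a, e_b = enrichments
    de_a = flux / pool_sizes[0] * (precursor_enrichment - e_a)
    de_b = flux / pool_sizes[1] * (e_a - e_b)
    return [de_a, de_b]

# Simulate labeling time courses over 60 minutes for an assumed flux and pool sizes
sol = solve_ivp(label_flow, (0.0, 60.0), [0.0, 0.0],
                args=(0.8, [1.0, 2.0]), dense_output=True)
time = np.linspace(0.0, 60.0, 121)
enrichment_a, enrichment_b = sol.sol(time)
```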
Investigator: Graeme Mason