Sai Zhang
Titles
Assistant Professor of Biomedical Informatics and Data Science
Biography
Sai Zhang received his PhD in Computer Science from Tsinghua University, where he trained in machine learning and computational biology. He then completed postdoctoral training in the Department of Genetics at Stanford University School of Medicine under the supervision of Dr. Michael Snyder, where he gained extensive experience in disease genetics and genomics and was subsequently promoted to instructor.
In 2023, Zhang launched his independent research group at the University of Florida as a tenure-track assistant professor in the Department of Epidemiology. At Yale, the Zhang Laboratory focuses on developing advanced AI and machine learning models that integrate large-scale human genetics, single-cell multiomics, and clinical data, aiming to uncover the high-resolution biology of complex diseases, guide the development of novel therapeutics, and advance precision medicine. His research is currently supported by NIGMS R35 (MIRA), the MND Association, and the Packard Center for ALS Research.
Appointments
Biomedical Informatics & Data Science
Assistant Professor (Primary)
Education & Training
- Postdoctoral Scholar
- Stanford University School of Medicine (2021)
- PhD
- Tsinghua University, Computer Science (2017)
- ME
- Tsinghua University, Computer Technology (2013)
- BE
- Nanjing University of Science & Technology, Computer Science (2010)
Research
Overview
Medical Research Interests
ORCID: 0000-0001-5996-6086
Lab website: Zhang Lab
Research Interests
Computational Biology
Machine Learning
Genomics
Precision Medicine
Publications
Featured Publications
Single-cell polygenic risk scores dissect cellular and molecular heterogeneity of complex human diseases
Zhang S, Shu H, Zhou J, Rubin-Sigler J, Yang X, Liu Y, Cooper-Knock J, Monte E, Zhu C, Tu S, Li H, Tong M, Ecker J, Ichida J, Shen Y, Zeng J, Tsao P, Snyder M. Single-cell polygenic risk scores dissect cellular and molecular heterogeneity of complex human diseases. Nature Biotechnology 2025, 1-17. PMID: 40715455, DOI: 10.1038/s41587-025-02725-6.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Genetic risk prediction; Individual's genetic risk; Single-cell chromatin accessibility profiles; Mechanistic dissection; Genetic risk; Polygenic risk score approach; Complex diseases; Chromatin accessibility profiles; Complex human diseases; Cell type-specific manner; Molecular heterogeneity; Single-cell genetics; Gene regulation; Multiomics analysis; Human diseases; Accessibility profiles; Alzheimer's disease; Cell types; Disease biology; Multiple diseases; Biology; Risk prediction; Risk score; Network-based framework
Modeling gene interactions in polygenic prediction via geometric deep learning
Li H, Zeng J, Snyder M, Zhang S. Modeling gene interactions in polygenic prediction via geometric deep learning. Genome Research 2024, 35: gr.279694.124. PMID: 39562137, PMCID: PMC11789630, DOI: 10.1101/gr.279694.124.
Peer-Reviewed Original Research
Concepts: Genetic risk prediction; Polygenic risk scores; Individual's genetic risk; Biological discovery; Identification of disease-relevant genes; Complex diseases; Single-gene resolution; Genome-wide polygenic risk score; Model gene interactions; Polygenic risk score methods; Gene-gene interactions; Disease-relevant genes; Complex traits; Gene interactions; Polygenic prediction; Gene program; PRS methods; Genetic risk; Systematic characterization; Genes; Risk prediction; Precision medicine; Biological systems; Risk score; Intricate relationship
2025
A single-cell, long-read, isoform-resolved case-control study of FTD reveals cell-type-specific and broad splicing dysregulation in human brain
Belchikov N, Hu W, Fan L, Joglekar A, He Y, Foord C, Jarroux J, Hsu J, Pollard S, Amin S, Prjibelski A, Gong S, Zhang S, Giannelli R, Seelaar H, Tomescu A, Ross M, Li A, Grinberg L, Spina S, Miller B, Cooper-Knock J, Snyder M, Seeley W, Rao-Ruiz P, Spijker S, Smit A, Clelland C, Gan L, Tilgner H. A single-cell, long-read, isoform-resolved case-control study of FTD reveals cell-type-specific and broad splicing dysregulation in human brain. Cell Reports 2025, 44: 116198. PMID: 40913764, DOI: 10.1016/j.celrep.2025.116198.
Peer-Reviewed Original Research
Concepts: Splicing patterns; Splicing dysregulation; Cell types; High-interest genes; Cell type switching; TAR DNA-binding protein 43; Single-cell; DNA-binding protein 43; Cell type-specific; Long reads; Dysregulation events; Splicing; TDP-43; Neuronal cells; Familial FTD; Sequence 2; Exon; Frontotemporal dementia; Protein 43; Cells; Case-control study; Dysregulation; Genes
Evaluation of a biomarker for amyotrophic lateral sclerosis derived from a hypomethylated DNA signature of human motor neurons
Harvey C, Nowak A, Zhang S, Moll T, Weimer A, Barcons A, Souza C, Ferraiuolo L, Kenna K, Zaitlen N, Caggiano C, Shaw P, Snyder M, Mill J, Hannon E, Cooper-Knock J. Evaluation of a biomarker for amyotrophic lateral sclerosis derived from a hypomethylated DNA signature of human motor neurons. BMC Medical Genomics 2025, 18: 10. PMID: 39810183, PMCID: PMC11734586, DOI: 10.1186/s12920-025-02084-w.
Peer-Reviewed Original Research
2024
PRS-Net: Interpretable Polygenic Risk Scores via Geometric Learning
Li H, Zeng J, Snyder M, Zhang S. PRS-Net: Interpretable Polygenic Risk Scores via Geometric Learning. Lecture Notes In Computer Science 2024, 14758: 377-380. DOI: 10.1007/978-1-0716-3989-4_35.
Peer-Reviewed Original Research
Concepts: Genetic risk prediction; Polygenic risk scores; Gene-gene interactions; Biological discovery; Complex diseases; Single-gene resolution; Genome-wide polygenic risk score; PRS methods; Complex human diseases; Identification of genes; Risk score; Risk of Alzheimer's disease; Risk prediction; Single-gene; Human diseases; Alzheimer's disease; Genetic risk; Precision medicine; Disease prediction; Biological systems; Discovery; Genes; Multiple sclerosis; Scores; Intricate relationship
Rare and common genetic determinants of mitochondrial function determine severity but not risk of amyotrophic lateral sclerosis
Harvey C, Weinreich M, Lee J, Shaw A, Ferraiuolo L, Mortiboys H, Zhang S, Hop P, Zwamborn R, van Eijk K, Julian T, Moll T, Iacoangeli A, Al Khleifat A, Quinn J, Pfaff A, Kõks S, Poulton J, Battle S, Arking D, Snyder M, Consortium P, Veldink J, Kenna K, Shaw P, Cooper-Knock J. Rare and common genetic determinants of mitochondrial function determine severity but not risk of amyotrophic lateral sclerosis. Heliyon 2024, 10: e24975. PMID: 38317984, PMCID: PMC10839612, DOI: 10.1016/j.heliyon.2024.e24975.
Peer-Reviewed Original Research
Concepts: Mitochondrial haplotypes; Mitochondrial function; Downstream modifier; Genetic determinants; Amyotrophic lateral sclerosis; Mitochondrial DNA copy number; Loss-of-function genetic variants; DNA copy number; Autosomal SNPs; Mitochondrial genome; Patient-derived neurons; Dna2 function; Measures of mitochondrial function; Genetic variation; Fatal neurodegenerative disease; Genetic variants; Copy number; Genetic measures; Mendelian randomization; Lateral sclerosis; Haplotypes; Cellular vulnerability; Neurodegenerative diseases; Amyotrophic lateral sclerosis risk; DNA2
2022
Longitudinally tracking personal physiomes for precision management of childhood epilepsy
Jiang P, Gao F, Liu S, Zhang S, Zhang X, Xia Z, Zhang W, Jiang T, Zhu J, Zhang Z, Shu Q, Snyder M, Li J. Longitudinally tracking personal physiomes for precision management of childhood epilepsy. PLOS Digital Health 2022, 1: e0000161. PMID: 36812648, PMCID: PMC9931296, DOI: 10.1371/journal.pdig.0000161.
Peer-Reviewed Original Research
Concepts: Dense tracking; Cloud computing infrastructure; Detection of seizure onset; Health management devices; Machine Learning Framework; Mobile computing; Computing infrastructure; Learning framework; Wearable biosensors; Wearable sensors; Machine learning; Wearable wristband; Mobile infrastructure; Digital signal processing; Signal processing; Wearable; Effective health management; Personal baseline; Onset moment; Physiological irregularities; Precision management; Machine; Data points; Infrastructure; Childhood developmental stages
Low expression of EXOSC2 protects against clinical COVID-19 and impedes SARS-CoV-2 replication
Moll T, Odon V, Harvey C, Collins M, Peden A, Franklin J, Graves E, Marshall J, dos Santos Souza C, Zhang S, Castelli L, Hautbergue G, Azzouz M, Gordon D, Krogan N, Ferraiuolo L, Snyder M, Shaw P, Rehwinkel J, Cooper-Knock J. Low expression of EXOSC2 protects against clinical COVID-19 and impedes SARS-CoV-2 replication. Life Science Alliance 2022, 6: e202201449. PMID: 36241425, PMCID: PMC9585911, DOI: 10.26508/lsa.202201449.
Peer-Reviewed Original Research
Concepts: SARS-CoV-2 replication; RNA polymerase; Genome-wide association study statistics; Genome-wide association studies; RNA exosome components; Viral RNA polymerase; SARS-CoV-2 RNA polymerase; SARS-CoV-2 proteins; Treatment of SARS-CoV-2 viral infection; Host-virus interactions; RNA exosome; Association studies; Protein pulldowns; Host proteins; EXOSC2; Nonsense mutation; SARS-CoV-2; Exosome components; Reduced SARS-CoV-2 replication; Clinical COVID-19; LC-MS/MS analysis; Targeted depletion; Reduced expression; Cellular viability; Protein expression
Systems analysis of de novo mutations in congenital heart diseases identified a protein network in the hypoplastic left heart syndrome
Wang Y, Zhang X, Lam C, Guo H, Wang C, Zhang S, Wu J, Snyder M, Li J. Systems analysis of de novo mutations in congenital heart diseases identified a protein network in the hypoplastic left heart syndrome. Cell Systems 2022, 13: 895-910.e4. PMID: 36167075, PMCID: PMC9671831, DOI: 10.1016/j.cels.2022.09.001.
Peer-Reviewed Original Research
Concepts: Hypoplastic left heart syndrome; Congenital heart disease; Biological networks; Analysis of de novo mutations; Paper's transparent peer review process; Transparent peer review process; Heart syndrome; Single-cell transcriptomics; Protein interactome; Patient exomes; Heart disease; Protein network; Genetic component; Complex diseases; Fetal brain development; Protein; Cardiac function; Developmental dynamics; Mutations; Clinical comorbidities; Hidden organ; Endothelial cells; Network analysis; Brain development; Interactome
Deep learning-based pseudo-mass spectrometry imaging analysis for precision medicine
Shen X, Shao W, Wang C, Liang L, Chen S, Zhang S, Rusu M, Snyder M. Deep learning-based pseudo-mass spectrometry imaging analysis for precision medicine. Briefings In Bioinformatics 2022, 23: bbac331. PMID: 35947990, DOI: 10.1093/bib/bbac331.
Peer-Reviewed Original Research
Concepts: Liquid chromatography-mass spectrometry; Metabolite identification; Chromatography-mass spectrometry; Precision medicine; LC-MS; Profiles of metabolism; Accurate individual diagnosis; Systematic profiling; Diagnosis; Individual diagnosis; Spectrometry; Disease diagnosis; Disease; Metabolomics; Medicine; Low reproducibility; Reproducibility
Academic Achievements & Community Involvement
Activities
PLOS Computational Biology: Guest Editor (Journal Service), 2025 - Present
Research in Computational Molecular Biology (RECOMB): Program Committee (Peer Review Groups and Grant Study Sections), 2024 - Present
Intelligent Systems for Molecular Biology (ISMB): Area Chair (Meeting Planning and Participation), 2026 - Present
Tsinghua University: External Examiner of PhD Thesis (Public Service), 2023 - 2025
Motor Neurone Disease Association: Grant Reviewer (Peer Review Groups and Grant Study Sections), 2023 - 2025
Honors
Maximizing Investigators' Research Award (MIRA) for Early Stage Investigators (R35)
National Award, National Institute of General Medical Sciences (NIGMS), 04/18/2025
Get In Touch
Contacts
Locations
101 College Street
Academic Office
Fl 10, Rm 1021N
New Haven, CT 06510