Featured Publications
SANTO: a coarse-to-fine alignment and stitching method for spatial omics
Li H, Lin Y, He W, Han W, Xu X, Xu C, Gao E, Zhao H, Gao X. SANTO: a coarse-to-fine alignment and stitching method for spatial omics. Nature Communications 2024, 15: 6048. PMID: 39025895, PMCID: PMC11258319, DOI: 10.1038/s41467-024-50308-x. Peer-Reviewed Original Research
SCADIE: simultaneous estimation of cell type proportions and cell type-specific gene expressions using SCAD-based iterative estimating procedure
Tang D, Park S, Zhao H. SCADIE: simultaneous estimation of cell type proportions and cell type-specific gene expressions using SCAD-based iterative estimating procedure. Genome Biology 2022, 23: 129. PMID: 35706040, PMCID: PMC9199219, DOI: 10.1186/s13059-022-02688-w. Peer-Reviewed Original Research
MeSH Keywords: Algorithms; Gene Expression; Gene Expression Profiling; Sequence Analysis, RNA; Single-Cell Analysis
Concepts: Cell type-specific gene expression; Type-specific gene expression; Cell type proportions; Differential expression analysis; Cell type-specific gene expression profiles; Expression analysis; Gene expression; Single-cell RNA-seq data; RNA-seq data; Gene differential expression analysis; Gene expression profiles; Type proportions; Expression profiles; Expression; Genes; Cells
M-DATA: A statistical approach to jointly analyzing de novo mutations for multiple traits
Xie Y, Li M, Dong W, Jiang W, Zhao H. M-DATA: A statistical approach to jointly analyzing de novo mutations for multiple traits. PLOS Genetics 2021, 17: e1009849. PMID: 34735430, PMCID: PMC8568192, DOI: 10.1371/journal.pgen.1009849. Peer-Reviewed Original Research
A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics
Zhou G, Zhao H. A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics. PLOS Genetics 2021, 17: e1009697. PMID: 34310601, PMCID: PMC8341714, DOI: 10.1371/journal.pgen.1009697. Peer-Reviewed Original Research
Concepts: Bayesian nonparametric method; Parameter tuning; Nonparametric methods; External reference panel; Summary statistics; Computational resources; Parallel algorithm; Block structure; Explicit assumptions; Existing methods; Statistics; Separate validation data; Accurate risk prediction models; Assumption; Prediction model; Prediction; Algorithm
2019
NITUMID: Nonnegative matrix factorization-based Immune-TUmor MIcroenvironment Deconvolution
Tang D, Park S, Zhao H. NITUMID: Nonnegative matrix factorization-based Immune-TUmor MIcroenvironment Deconvolution. Bioinformatics 2019, 36: 1344-1350. PMID: 31593244, PMCID: PMC8215918, DOI: 10.1093/bioinformatics/btz748. Peer-Reviewed Original Research
Harmonizing Genetic Ancestry and Self-identified Race/Ethnicity in Genome-wide Association Studies
Fang H, Hui Q, Lynch J, Honerlaw J, Assimes T, Huang J, Vujkovic M, Damrauer S, Pyarajan S, Gaziano J, DuVall S, O’Donnell C, Cho K, Chang K, Wilson P, Tsao P, Sun Y, Tang H, Gaziano J, Ramoni R, Breeling J, Chang K, Huang G, Muralidhar S, O’Donnell C, Tsao P, Muralidhar S, Moser J, Whitbourne S, Brewer J, Concato J, Warren S, Argyres D, Stephens B, Brophy M, Humphries D, Do N, Shayan S, Nguyen X, Pyarajan S, Cho K, Hauser E, Sun Y, Zhao H, Wilson P, McArdle R, Dellitalia L, Harley J, Whittle J, Beckham J, Wells J, Gutierrez S, Gibson G, Kaminsky L, Villareal G, Kinlay S, Xu J, Hamner M, Haddock K, Bhushan S, Iruvanti P, Godschalk M, Ballas Z, Buford M, Mastorides S, Klein J, Ratcliffe N, Florez H, Swann A, Murdoch M, Sriram P, Yeh S, Washburn R, Jhala D, Aguayo S, Cohen D, Sharma S, Callaghan J, Oursler K, Whooley M, Ahuja S, Gutierrez A, Schifman R, Greco J, Rauchman M, Servatius R, Oehlert M, Wallbom A, Fernando R, Morgan T, Stapley T, Sherman S, Anderson G, Sonel E, Boyko E, Meyer L, Gupta S, Fayad J, Hung A, Lichy J, Hurley R, Robey B, Striker R. Harmonizing Genetic Ancestry and Self-identified Race/Ethnicity in Genome-wide Association Studies. American Journal Of Human Genetics 2019, 105: 763-772. PMID: 31564439, PMCID: PMC6817526, DOI: 10.1016/j.ajhg.2019.08.012. Peer-Reviewed Original Research
Prediction Analysis for Microbiome Sequencing Data
Wang T, Yang C, Zhao H. Prediction Analysis for Microbiome Sequencing Data. Biometrics 2019, 75: 875-884. PMID: 30994187, DOI: 10.1111/biom.13061. Peer-Reviewed Original Research
Concepts: Monte Carlo expectation-maximization algorithm; Inverse regression model; Real data example; Types of covariates; New statistical framework; Maximum likelihood estimation; Expectation-maximization algorithm; Dimension reduction structure; Inverse regression; Statistical framework; Data examples; Statistical challenges; Likelihood estimation; Microbiome sequencing data; Human microbiome studies; Human microbiome composition; Different library sizes; Zeros; Predictive analysis; Model; Estimation; Algorithm; Simulations; Regression models; Framework
2017
Network Clustering Analysis Using Mixture Exponential-Family Random Graph Models and Its Application in Genetic Interaction Data
Wang Y, Fang H, Yang D, Zhao H, Deng M. Network Clustering Analysis Using Mixture Exponential-Family Random Graph Models and Its Application in Genetic Interaction Data. IEEE/ACM Transactions On Computational Biology And Bioinformatics 2017, 16: 1743-1752. PMID: 28858811, DOI: 10.1109/tcbb.2017.2743711. Peer-Reviewed Original Research
Concepts: Exponential-family random graph models; Random graph models; Graph model; Statistical network models; Heterogeneity of networks; Large-scale genetic interaction networks; Real social networks; ERGM parameters; Subset of nodes; Online graph; Statistical model; Data size; Observed network; EM algorithm; Network information; Graph nodes; Mixture problem; Social networks; Flexible way; Network model; Network clusters; Classical methods; Incredible set; Interaction data; Network
On Joint Estimation of Gaussian Graphical Models for Spatial and Temporal Data
Lin Z, Wang T, Yang C, Zhao H. On Joint Estimation of Gaussian Graphical Models for Spatial and Temporal Data. Biometrics 2017, 73: 769-779. PMID: 28099997, PMCID: PMC5515703, DOI: 10.1111/biom.12650. Peer-Reviewed Original Research
Concepts: Gaussian graphical models; Temporal data; Graphical models; Complex data structures; Joint estimation; Markov random field model; Random field model; Parallel computing; Selection consistency; Data structure; Statistical inference; Neighborhood selection method; Temporal dependencies; Efficient algorithm; Individual networks; Multiple groups; Spatial data; Model converges; Network estimation; Field model; Selection method; Network; Posterior probability; Simulation study; Improved estimation
2016
CCor: A Whole Genome Network-Based Similarity Measure Between Two Genes
Hu Y, Zhao H. CCor: A Whole Genome Network-Based Similarity Measure Between Two Genes. Biometrics 2016, 72: 1216-1225. PMID: 26953524, PMCID: PMC5016231, DOI: 10.1111/biom.12508. Peer-Reviewed Original Research
2013
Guilt by rewiring: gene prioritization through network rewiring in Genome Wide Association Studies
Hou L, Chen M, Zhang CK, Cho J, Zhao H. Guilt by rewiring: gene prioritization through network rewiring in Genome Wide Association Studies. Human Molecular Genetics 2013, 23: 2780-2790. PMID: 24381306, PMCID: PMC3990172, DOI: 10.1093/hmg/ddt668. Peer-Reviewed Original Research
Concepts: Genome-wide association studies; Wide association study; Disease-associated genes; GWAS signals; Network rewiring; Association studies; Functional genomic information; Gene expression networks; Co-expression network; Disease-associated pathways; Expression networks; Gene networks; Genomic information; Association signals; Gene prioritization; Disease genes; Disease locus; Susceptibility loci; Genes; Association principle; Rewiring; Disease associations; Loci; Millions of candidates; Disease conditions
2012
iFad: an integrative factor analysis model for drug-pathway association inference
Ma H, Zhao H. iFad: an integrative factor analysis model for drug-pathway association inference. Bioinformatics 2012, 28: 1911-1918. PMID: 22581178, PMCID: PMC3389771, DOI: 10.1093/bioinformatics/bts285. Peer-Reviewed Original Research
2011
Incorporating Biological Pathways via a Markov Random Field Model in Genome-Wide Association Studies
Chen M, Cho J, Zhao H. Incorporating Biological Pathways via a Markov Random Field Model in Genome-Wide Association Studies. PLOS Genetics 2011, 7: e1001353. PMID: 21490723, PMCID: PMC3072362, DOI: 10.1371/journal.pgen.1001353. Peer-Reviewed Original Research
Concepts: Genome-wide association studies; Association studies; Biological pathways; Single gene-based methods; Markov random field model; Gene-based methods; Prior biological knowledge; Random field model; GWAS analysis; Association signals; Multiple genes; Pathway topology; Gene associations; Association analysis; Genes; Biological knowledge; Field model; Genetic variants; Specific pathways; Real data example; Pathway; Statistical inference; Conditional modes algorithm; Exchangeable set; Regression form
2001
Multipoint Genetic Mapping with Trisomy Data
Li J, Sherman S, Lamb N, Zhao H. Multipoint Genetic Mapping with Trisomy Data. American Journal Of Human Genetics 2001, 69: 1255-1265. PMID: 11704925, PMCID: PMC1235537, DOI: 10.1086/324578. Peer-Reviewed Original Research
Concepts: Expectation-maximization algorithm; Multipoint genetic mapping; Amount of computation; Probability distribution; Trisomy data; Statistical methods; First approach; Markov model; Second approach; Probability; Crossover process; Computation; Large number; Set; Model; Approach; General relationship; Distribution; Algorithm; Number of markers
Comparisons of Two Methods for Haplotype Reconstruction and Haplotype Frequency Estimation from Population Data
Zhang S, Pakstis A, Kidd K, Zhao H. Comparisons of Two Methods for Haplotype Reconstruction and Haplotype Frequency Estimation from Population Data. American Journal Of Human Genetics 2001, 69: 906-912. PMID: 11536083, PMCID: PMC1226079, DOI: 10.1086/323622. Peer-Reviewed Original Research
2000
Assessing reliability of gene clusters from gene expression data
Zhang K, Zhao H. Assessing reliability of gene clusters from gene expression data. Functional & Integrative Genomics 2000, 1: 156-173. PMID: 11793234, DOI: 10.1007/s101420000019. Peer-Reviewed Original Research
Concepts: Statistical resampling methods; Hierarchical clustering method; Cluster identification method; Numerical algorithm; Gene expression data; Clustering method; Clustering trees; Resampling method; Hierarchical clustering algorithm; Expression data; Experiment design; Clustering algorithm; Algorithm; Challenging problem; Data sets; Measured gene expression levels; Effect of variation; Data analysis; Clusters; Uncertainty; Problem; Reliability
Multipoint Genetic Mapping with Uniparental Disomy Data
Zhao H, Li J, Robinson W. Multipoint Genetic Mapping with Uniparental Disomy Data. American Journal Of Human Genetics 2000, 67: 851-861. PMID: 10958760, PMCID: PMC1287890, DOI: 10.1086/303072. Peer-Reviewed Original Research