Featured Publications
Characterizing Spatiotemporal Transcriptome of the Human Brain Via Low-Rank Tensor Decomposition
Liu T, Yuan M, Zhao H. Characterizing Spatiotemporal Transcriptome of the Human Brain Via Low-Rank Tensor Decomposition. Statistics in Biosciences 2022, 14: 485-513. DOI: 10.1007/s12561-021-09331-5.
Peer-Reviewed Original Research
Concepts: Low-rank tensor decomposition, Tensor decomposition, Power iteration, Classical principal component analysis, Statistical performance, Numerical experiments, Tensor unfolding, Statistical methods, Gene expression data, Efficient algorithm, Data matrix, Expression data, Tensor principal components, Brain expression data, Principal component analysis, Iteration, Decomposition, Spatiotemporal transcriptome, Implicit assumption, Algorithm, Dynamics, Trajectories, Guarantees, Assumption, Spatial patterns
A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics
Zhou G, Zhao H. A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics. PLOS Genetics 2021, 17: e1009697. PMID: 34310601, PMCID: PMC8341714, DOI: 10.1371/journal.pgen.1009697.
Peer-Reviewed Original Research
Concepts: Bayesian nonparametric method, Parameter tuning, Nonparametric methods, External reference panel, Summary statistics, Computational resources, Parallel algorithm, Block structure, Explicit assumptions, Existing methods, Statistics, Separate validation data, Accurate risk prediction models, Assumption, Prediction model, Prediction, Algorithm
2022
A Manifold Proximal Linear Method for Sparse Spectral Clustering with Application to Single-Cell RNA Sequencing Data Analysis
Wang Z, Liu B, Chen S, Ma S, Xue L, Zhao H. A Manifold Proximal Linear Method for Sparse Spectral Clustering with Application to Single-Cell RNA Sequencing Data Analysis. INFORMS Journal on Optimization 2022, 4: 200-214. DOI: 10.1287/ijoo.2021.0064.
Peer-Reviewed Original Research
Concepts: Sparse spectral clustering, Optimization problem, Spectral clustering, Linear methods, Iteration complexity results, Nonconvex objective, Nonsmooth objective, Convex relaxation, Stiefel manifold, Single-cell RNA sequencing data sets, SSC problem, Complexity results, Smoothing techniques, RNA sequencing data analysis, Data sets, Original formulation, Unsupervised learning method, Data analysis, Nonsmooth, Problem, Algorithm, Formulation, Manifold, Clustering, Convergence
2020
A Set of Efficient Methods to Generate High-Dimensional Binary Data With Specified Correlation Structures
Jiang W, Song S, Hou L, Zhao H. A Set of Efficient Methods to Generate High-Dimensional Binary Data With Specified Correlation Structures. The American Statistician 2020, 75: 310-322. DOI: 10.1080/00031305.2020.1816213.
Peer-Reviewed Original Research
Concepts: High-dimensional binary data, Common correlation structures, Correlation structure, Time complexity, Unequal probability, Statistical methods, General correlation matrices, Correlated binary data, Binary data, Quadratic time complexity, Monte Carlo method, Correlation matrix, Data simulation, Linear time complexity, Increase of dimension, Carlo method, Data generation, Efficient algorithm, Time cost, Validity conditions, Complexity method, Binary variables, Simulation method, R package, Algorithm
2019
Prediction Analysis for Microbiome Sequencing Data
Wang T, Yang C, Zhao H. Prediction Analysis for Microbiome Sequencing Data. Biometrics 2019, 75: 875-884. PMID: 30994187, DOI: 10.1111/biom.13061.
Peer-Reviewed Original Research
Concepts: Monte Carlo expectation-maximization algorithm, Inverse regression model, Real data example, Types of covariates, New statistical framework, Maximum likelihood estimation, Expectation-maximization algorithm, Dimension reduction structure, Inverse regression, Statistical framework, Data examples, Statistical challenges, Likelihood estimation, Microbiome sequencing data, Human microbiome studies, Human microbiome composition, Different library sizes, Zeros, Predictive analysis, Model, Estimation, Algorithm, Simulations, Regression models, Framework
2014
Low-Rank Modeling and Its Applications in Image Analysis
Zhou X, Yang C, Zhao H, Yu W. Low-Rank Modeling and Its Applications in Image Analysis. ACM Computing Surveys 2014, 47: 1-33. DOI: 10.1145/2674559.
Peer-Reviewed Original Research
Concepts: Low-rank modeling, Low-rank matrix recovery, Exact low-rank matrix recovery, Image analysis, Matrix recovery, Class of methods, Computer vision, Low-rank matrix, Collaborative filtering, Data mining, Art algorithms, Convex programming, Rank modeling, Matrix completion, Numerical experiments, Algorithm, Great success, Related applications, Signal processing, Modeling, Variables of interest, Applications, Mining, Programming, More attention
2001
Multipoint Genetic Mapping with Trisomy Data
Li J, Sherman S, Lamb N, Zhao H. Multipoint Genetic Mapping with Trisomy Data. American Journal of Human Genetics 2001, 69: 1255-1265. PMID: 11704925, PMCID: PMC1235537, DOI: 10.1086/324578.
Peer-Reviewed Original Research
Concepts: Expectation-maximization algorithm, Multipoint genetic mapping, Amount of computation, Probability distribution, Trisomy data, Statistical methods, First approach, Markov model, Second approach, Probability, Crossover process, Computation, Large number, Set, Model, Approach, General relationship, Distribution, Algorithm, Number of markers
2000
Assessing reliability of gene clusters from gene expression data
Zhang K, Zhao H. Assessing reliability of gene clusters from gene expression data. Functional & Integrative Genomics 2000, 1: 156-173. PMID: 11793234, DOI: 10.1007/s101420000019.
Peer-Reviewed Original Research
Concepts: Statistical resampling methods, Hierarchical clustering method, Cluster identification method, Numerical algorithm, Gene expression data, Clustering method, Clustering trees, Resampling method, Hierarchical clustering algorithm, Expression data, Experiment design, Clustering algorithm, Algorithm, Challenging problem, Data sets, Measured gene expression levels, Effect of variation, Data analysis, Clusters, Uncertainty, Problem, Reliability
Multipoint Genetic Mapping with Uniparental Disomy Data
Zhao H, Li J, Robinson W. Multipoint Genetic Mapping with Uniparental Disomy Data. American Journal of Human Genetics 2000, 67: 851-861. PMID: 10958760, PMCID: PMC1287890, DOI: 10.1086/303072.
Peer-Reviewed Original Research