2022
A Manifold Proximal Linear Method for Sparse Spectral Clustering with Application to Single-Cell RNA Sequencing Data Analysis
Wang Z, Liu B, Chen S, Ma S, Xue L, Zhao H. A Manifold Proximal Linear Method for Sparse Spectral Clustering with Application to Single-Cell RNA Sequencing Data Analysis. INFORMS Journal On Optimization 2022, 4: 200-214. DOI: 10.1287/ijoo.2021.0064.
Peer-Reviewed Original Research
Concepts: Sparse spectral clustering, Optimization problem, Spectral clustering, Linear methods, Iteration complexity results, Nonconvex objective, Nonsmooth objective, Convex relaxation, Stiefel manifold, Single-cell RNA sequencing data sets, SSC problem, Complexity results, Smoothing techniques, RNA sequencing data analysis, Data sets, Original formulation, Unsupervised learning method, Data analysis, Nonsmooth, Problem, Algorithm, Formulation, Manifold, Clustering, Convergence
2013
Application of Bayesian Sparse Factor Analysis Models in Bioinformatics
Ma H, Zhao H. Application of Bayesian Sparse Factor Analysis Models in Bioinformatics. 2013, 350-365. DOI: 10.1017/cbo9781139226448.018.
Peer-Reviewed Original Research
Concepts: Factor analysis model, Classical factor analysis model, Latent variable model, Statistical methods, Inferential methods, Variable model, Computational biology, Large data sets, Geometrical procedure, Observed variables, Correlated variables, Analysis model, General approach, Latent variables, Factor modeling, Latent factors, Strong prior beliefs, Underlying structure, Data sets, Principal component analysis, Model, Variables, Regulatory networks, Large number, Prior beliefs
2012
Time course RNA-seq: A potential avenue with somewhat different approach in tandem of differential analysis
Oh S, Zhao H, Noonan J. Time course RNA-seq: A potential avenue with somewhat different approach in tandem of differential analysis. 2012, 1: 580-587. DOI: 10.1109/cisis.2012.204.
Peer-Reviewed Original Research
Concepts: Monte Carlo simulation study, Simulation study, Real data sets, Statistical framework, Differential expression methods, Statistical approach, Dependent data, Markov model approach, Inherent dependencies, Time series, Model approach, Hidden Markov Model Approach, Standard approach, Time-series RNA-seq data, Data sets, Intuitive solution, Biological systems, Trajectory index, Temporal complexity, Differential analysis, Different approaches, Approach, Considerable advantages, Solution
2011
A permutation test approach to the choice of size k for the nearest neighbors classifier
Lai Y, Wu B, Zhao H. A permutation test approach to the choice of size k for the nearest neighbors classifier. Journal Of Applied Statistics 2011, 38: 2289-2302. DOI: 10.1080/02664763.2010.547565.
Peer-Reviewed Original Research
Concepts: Nearest neighbor classifier, Neighbor classifier, Real-world data sets, Cross-validation approach, Prediction accuracy, Statistical pattern recognition, High prediction accuracy, Machine learning, Number of neighbors, Pattern recognition, Multiple sample groups, Informative features, Number of NNs, Size k, Data sets, Classifier, Popular method, Cross-validation procedure, Test approach, Classification, Accuracy, Learning, Neighbors, NN
2000
Assessing reliability of gene clusters from gene expression data
Zhang K, Zhao H. Assessing reliability of gene clusters from gene expression data. Functional & Integrative Genomics 2000, 1: 156-173. PMID: 11793234, DOI: 10.1007/s101420000019.
Peer-Reviewed Original Research
Concepts: Statistical resampling methods, Hierarchical clustering method, Cluster identification method, Numerical algorithm, Gene expression data, Clustering method, Clustering trees, Resampling method, Hierarchical clustering algorithm, Expression data, Experiment design, Clustering algorithm, Algorithm, Challenging problem, Data sets, Measured gene expression levels, Effect of variation, Data analysis, Clusters, Uncertainty, Problem, Reliability