2024
Privacy-Enhancing Technologies in Biomedical Data Science
Cho H, Froelicher D, Dokmai N, Nandi A, Sadhuka S, Hong M, Berger B. Privacy-Enhancing Technologies in Biomedical Data Science. Annual Review of Biomedical Data Science 2024, 7: 317-343. PMID: 39178425, PMCID: PMC11346580, DOI: 10.1146/annurev-biodatasci-120423-120107. Peer-Reviewed Original Research.
Concepts: Privacy-enhancing technologies; Adoption of privacy-enhancing technologies; Biomedical data science; Data science; Analyze sensitive data; Biomedical data repositories; Privacy protection; Sensitive data; Privacy concerns; Data silos; Protect privacy; Human subject data; Biomedical domain; Data repositories; Privacy; Subjective data; Conventional framework
2023
Reconstruction of private genomes through reference-based genotype imputation
Mosca M, Cho H. Reconstruction of private genomes through reference-based genotype imputation. Genome Biology 2023, 24: 271. PMID: 38053191, PMCID: PMC10698978, DOI: 10.1186/s13059-023-03105-6. Peer-Reviewed Original Research.
Assessing transcriptomic reidentification risks using discriminative sequence models
Sadhuka S, Fridman D, Berger B, Cho H. Assessing transcriptomic reidentification risks using discriminative sequence models. Genome Research 2023, 33: 1101-1112. PMID: 37541758, PMCID: PMC10538488, DOI: 10.1101/gr.277699.123. Peer-Reviewed Original Research.
Concepts: Expression quantitative trait loci; Gene expression data; Expression data; Quantitative trait loci; Omics data sets; Gene expression profiles; Trait loci; Genomic regions; Genetic variation; Gene expression; Expression profiles; Molecular insights; Linkage disequilibrium; Functional impact; Genotypes; Transcriptomics; Loci; Same individual; Disequilibrium; Sequence; Expression; Previous studies; Full extent; Data sets
2021
Truly privacy-preserving federated analytics for precision medicine with multiparty homomorphic encryption
Froelicher D, Troncoso-Pastoriza J, Raisaro J, Cuendet M, Sousa J, Cho H, Berger B, Fellay J, Hubaux J. Truly privacy-preserving federated analytics for precision medicine with multiparty homomorphic encryption. Nature Communications 2021, 12: 5910. PMID: 34635645, PMCID: PMC8505638, DOI: 10.1038/s41467-021-25972-y. Peer-Reviewed Original Research.
Concepts: Multiparty homomorphic encryption; Homomorphic encryption; Privacy-preserving analysis; Necessary key step; Multiple healthcare institutions; Federated analytics; Federated setting; Analysis tasks; Analytics system; Intermediate data; Encryption; Centralized studies; Patient data; Biomedical insights; Scientific collaboration; Accurate results; Indispensable complement; Analytics; Healthcare institutions; Dataset; Task; System; Biomedical research; Access; Collaboration
Assessing single-cell transcriptomic variability through density-preserving data visualization
Narayan A, Berger B, Cho H. Assessing single-cell transcriptomic variability through density-preserving data visualization. Nature Biotechnology 2021, 39: 765-774. PMID: 33462509, PMCID: PMC8195812, DOI: 10.1038/s41587-020-00801-7. Peer-Reviewed Original Research.
2020
Privacy-Preserving Biomedical Database Queries with Optimal Privacy-Utility Trade-Offs
Cho H, Simmons S, Kim R, Berger B. Privacy-Preserving Biomedical Database Queries with Optimal Privacy-Utility Trade-Offs. Cell Systems 2020, 10: 408-416.e9. PMID: 32359425, DOI: 10.1016/j.cels.2020.03.006. Peer-Reviewed Original Research.
Concepts: Differential privacy; Sensitive individual-level data; Formal privacy guarantees; Query-answering system; Privacy-utility trade; Privacy guarantees; Query answers; Count queries; Cohort discovery; Database queries; Utility function; Use cases; Proof of optimality; Research workflow; Aggregate insights; Biomedical databases; Accuracy improvement; Private information; Queries; Privacy; General utility function; Database; More general utility functions; New theoretical results; Lookup
2019
Emerging technologies towards enhancing privacy in genomic data sharing
Berger B, Cho H. Emerging technologies towards enhancing privacy in genomic data sharing. Genome Biology 2019, 20: 128. PMID: 31262363, PMCID: PMC6604426, DOI: 10.1186/s13059-019-1741-0. Commentaries, Editorials and Letters.
Geometric Sketching Compactly Summarizes the Single-Cell Transcriptomic Landscape
Hie B, Cho H, DeMeo B, Bryson B, Berger B. Geometric Sketching Compactly Summarizes the Single-Cell Transcriptomic Landscape. Cell Systems 2019, 8: 483-493.e7. PMID: 31176620, PMCID: PMC6597305, DOI: 10.1016/j.cels.2019.05.003. Peer-Reviewed Original Research.
Concepts: Single-cell transcriptomic landscape; Single-cell RNA sequencing studies; Single-cell omics; Cell types; Seq data integration; Single-cell data analysis; Rare cell types; RNA sequencing studies; scRNA-seq data; Transcriptional diversity; Transcriptomic landscape; Biological cell types; Transcriptomic heterogeneity; Sequencing studies; Rare subpopulation; Analysis pipeline; Cells; Umbilical cord blood; Essential step; Inflammatory macrophages; Omics; Comprehensive visualization; Diversity; Geometric sketch; Hundreds of thousands
2018
Realizing private and practical pharmacological collaboration
Hie B, Cho H, Berger B. Realizing private and practical pharmacological collaboration. Science 2018, 362: 347-350. PMID: 30337410, PMCID: PMC6519716, DOI: 10.1126/science.aat4807. Peer-Reviewed Original Research.
Concepts: Art DTI prediction methods; Drug-target interactions; DTI prediction methods; Intellectual property concerns; Cryptographic tools; Data privacy; Data sharing; Multiple entities; Real datasets; Open sharing; Property concerns; Prediction method; Sharing; Dataset; Predictive model; Privacy; Protocol; Confidentiality; Biomedical research; Collaboration; Tool; Data; Entities
Generalizable and Scalable Visualization of Single-Cell Data Using Neural Networks
Cho H, Berger B, Peng J. Generalizable and Scalable Visualization of Single-Cell Data Using Neural Networks. Cell Systems 2018, 7: 185-191.e4. PMID: 29936184, PMCID: PMC6469860, DOI: 10.1016/j.cels.2018.05.017. Peer-Reviewed Original Research.
Secure genome-wide association analysis using multiparty computation
Cho H, Wu D, Berger B. Secure genome-wide association analysis using multiparty computation. Nature Biotechnology 2018, 36: 547-551. PMID: 29734293, PMCID: PMC5990440, DOI: 10.1038/nbt.4108. Peer-Reviewed Original Research.
2016
Reconstructing Causal Biological Networks through Active Learning
Cho H, Berger B, Peng J. Reconstructing Causal Biological Networks through Active Learning. PLOS ONE 2016, 11: e0150611. PMID: 26930205, PMCID: PMC4773135, DOI: 10.1371/journal.pone.0150611. Peer-Reviewed Original Research.
Concepts: Gaussian Bayesian networks; Bayesian network; Continuous Bayesian networks; Discrete Bayesian networks; Biological networks; Fast convergence; Great practical interest; Gene regulatory networks; Graph structure; Practical interest; Quantitative properties; Significant runtime improvements; Complex biological systems; Runtime improvement; Causal biological networks; Systems biology; Central problem; Final distribution; Previous approaches; Network structure; Resource constraints; Data sets; Learning algorithm; Important model; Active learning algorithm
2015
Exploiting ontology graph for predicting sparsely annotated gene function
Wang S, Cho H, Zhai C, Berger B, Peng J. Exploiting ontology graph for predicting sparsely annotated gene function. Bioinformatics 2015, 31: i357-i364. PMID: 26072504, PMCID: PMC4542782, DOI: 10.1093/bioinformatics/btv260. Peer-Reviewed Original Research.
Concepts: Function prediction algorithms; Prediction algorithm; Variety of algorithms; Functional labels; Ontology graph; Cross-validation experiments; Overfitting problem; Graph structure; Previous state; Gene function; Algorithm; Annotation catalogs; Tens of thousands; Molecular interaction networks; Function prediction; Ontology database; Poor predictive performance; Annotation; Labels; Predictive performance; Gene Ontology database; Information; Large number; GO terms; Interaction networks
2014
High-Resolution Transcriptome Analysis with Long-Read RNA Sequencing
Cho H, Davis J, Li X, Smith K, Battle A, Montgomery S. High-Resolution Transcriptome Analysis with Long-Read RNA Sequencing. PLOS ONE 2014, 9: e108095. PMID: 25251678, PMCID: PMC4176000, DOI: 10.1371/journal.pone.0108095. Peer-Reviewed Original Research.
Concepts: Allele-specific expression; Transcriptome analysis; RNA sequencing; Alternative splicing patterns; Cell line GM12878; RNA-seq protocols; Sequencing of cDNA; RNA-seq datasets; Individual transcriptomes; Genomic elements; Alternative splicing; Splicing patterns; Allelic expression; Transcript quantification; Read length; mRNA transcripts; Mapping bias; Sequencing; Expression; Transcriptome; GM12878; Splicing; Technical hurdles; cDNA; Genes