2023
Simulating complex patient populations with hierarchical learning effects to support methods development for post-market surveillance
Davis S, Ssemaganda H, Koola J, Mao J, Westerman D, Speroff T, Govindarajulu U, Ramsay C, Sedrakyan A, Ohno-Machado L, Resnic F, Matheny M. Simulating complex patient populations with hierarchical learning effects to support methods development for post-market surveillance. BMC Medical Research Methodology 2023, 23: 89. PMID: 37041457, PMCID: PMC10088292, DOI: 10.1186/s12874-023-01913-9. Peer-Reviewed Original Research. Concepts: Synthetic datasets; Data characteristics; Feature distribution; Ground truth; MIMIC-III data; Real-world data; Data generation process; Complex simulation studies; Data relationships; User definition; Small datasets; Simulation requirements; Correlated features; World data; Customizable options; Real-world complexity; Synthetic patients; New algorithm; Dataset; Generation process; Learning; Algorithm; Data simulation techniques; Learning effect; Generalizable framework
2021
Calibrating predictive model estimates in a distributed network of patient data
Huang Y, Jiang X, Gabriel R, Ohno-Machado L. Calibrating predictive model estimates in a distributed network of patient data. Journal Of Biomedical Informatics 2021, 117: 103758. PMID: 33811986, DOI: 10.1016/j.jbi.2021.103758. Peer-Reviewed Original Research. Concepts: Data privacy; Recalibration model; High-performance predictive models; Integration of data; Patient data; Predictive model estimates; Distributed network; Expected calibration error; Maximum calibration error; Privacy; Clinical informatics; Calibration errors; Computational efficiency; Predictive analysis; Algorithm; Building models; Model building; Important issue; Performance measures; Predictive model; Multiple health systems; Large number; Isotonic regression; Informatics; System
2020
Efficient determination of equivalence for encrypted data
Doctor J, Vaidya J, Jiang X, Wang S, Schilling L, Ong T, Matheny M, Ohno-Machado L, Meeker D. Efficient determination of equivalence for encrypted data. Computers & Security 2020, 97: 101939. PMID: 33223585, PMCID: PMC7676425, DOI: 10.1016/j.cose.2020.101939. Peer-Reviewed Original Research
2016
Consensus Statement on Electronic Health Predictive Analytics: A Guiding Framework to Address Challenges
Amarasingham R, Audet A, Bates D, Glenn Cohen I, Entwistle M, Escobar G, Liu V, Etheredge L, Lo B, Ohno-Machado L, Ram S, Saria S, Schilling L, Shahi A, Stewart W, Steyerberg E, Xie B. Consensus Statement on Electronic Health Predictive Analytics: A Guiding Framework to Address Challenges. Healthcare 2016, 4: 1163. PMID: 27141516, PMCID: PMC4837887, DOI: 10.13063/2327-9214.1163. Commentaries, Editorials and Letters. Concepts: Predictive analytics applications; Analytics applications; Predictive analytics; Data sharing; Data barriers; Predictive model development; Electronic health record data; Certification framework; Real time; Efficient manner; Model development; Available electronic health record data; Health record data; List of recommendations; Systematic framework; Algorithm; Diverse expertise; Recent explosion; Earlier framework; Framework
2015
VERTIcal Grid lOgistic regression (VERTIGO)
Li Y, Jiang X, Wang S, Xiong H, Ohno-Machado L. VERTIcal Grid lOgistic regression (VERTIGO). Journal Of The American Medical Informatics Association 2015, 23: 570-579. PMID: 26554428, PMCID: PMC4901373, DOI: 10.1093/jamia/ocv146. Peer-Reviewed Original Research. Concepts: Federated data analysis; Real-world medical classification problems; Medical classification problems; Logistic regression algorithm; Accurate global model; Data sets; Real data sets; Classification problem; Exchange of information; LR problem; Time complexity; Computational complexity; Expensive operation; Regression algorithm; Computational cost; Data analysis; Algorithm; Dual optimization; Technical challenges; Large amount; Complexity; Patient records; LR model; Novel technique; Hessian matrix
2014
Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery
Zhao Y, Wang X, Jiang X, Ohno-Machado L, Tang H. Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery. Journal Of The American Medical Informatics Association 2014, 22: 100-108. PMID: 25352565, PMCID: PMC4433380, DOI: 10.1136/amiajnl-2014-003043. Peer-Reviewed Original Research. Concepts: Data owners; Data users; Human genomic datasets; Human genomic data; Patient privacy; Privacy; Generation approach; Users; Data selection; Real data; Dataset; Genomic datasets; Private solicitation; DNA datasets; Scientific discovery; New approach; Genomic data; High confidence; Pilot version; Evaluation method; Right choice; Owners; Algorithm; New technique; Disease marker discovery
Big Data In Health Care: Using Analytics To Identify And Manage High-Risk And High-Cost Patients
Bates D, Saria S, Ohno-Machado L, Shah A, Escobar G. Big Data In Health Care: Using Analytics To Identify And Manage High-Risk And High-Cost Patients. Health Affairs 2014, 33: 1123-1131. PMID: 25006137, DOI: 10.1377/hlthaff.2014.0041. Commentaries, Editorials and Letters. Concepts: Big data; Clinical analytics; Privacy concerns; Use cases; Electronic health records; Analytics; Types of data; Health records; Types of insights; Necessary analysis; Support of research; High-cost patients; Unprecedented opportunity; Monitoring devices; Cost; Health care; Algorithm; Multiple organ systems; Rapid progress; Infrastructure; US health care system; Health care system; System; Adverse events; Clinical data
Differentially private distributed logistic regression using private and public data
Ji Z, Jiang X, Wang S, Xiong L, Ohno-Machado L. Differentially private distributed logistic regression using private and public data. BMC Medical Genomics 2014, 7: s14. PMID: 25079786, PMCID: PMC4101668, DOI: 10.1186/1755-8794-7-s1-s14. Peer-Reviewed Original Research. Concepts: Private data; Differential privacy; Public datasets; Public data; Rigorous privacy guarantee; Data privacy research; Private data sets; Data mining models; Data sets; Provable privacy; Privacy guarantees; Mining model; Privacy research; Different data sets; Art frameworks; Medical informatics; Privacy; Amount of noise; Private methods; Auxiliary information; Better utility; New algorithm; Update step; Available public data; Algorithm
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads
Li P, Jiang X, Wang S, Kim J, Xiong H, Ohno-Machado L. HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads. Journal Of The American Medical Informatics Association 2014, 21: 363-373. PMID: 24368726, PMCID: PMC3932469, DOI: 10.1136/amiajnl-2013-002147. Peer-Reviewed Original Research. Concepts: Base quality values; Compression algorithm; Storage savings; Genome compression; Sequence Alignment/Map (SAM) format; Compression ratio; Novel compression algorithm; Comparable compression ratio; Compression mechanism; K-means clustering; Different reference genomes; Quality values; Decompression quality; Lossless compression; Execution time; Compression rate; Aligned reads; Map format; Algorithm; Biomedical community; Different quality values; Experimental datasets; Adaptive scheme; Storage capability; Archiving
2013
DNA-COMPACT: DNA COMpression Based on a Pattern-Aware Contextual Modeling Technique
Li P, Wang S, Kim J, Xiong H, Ohno-Machado L, Jiang X. DNA-COMPACT: DNA COMpression Based on a Pattern-Aware Contextual Modeling Technique. PLOS ONE 2013, 8: e80377. PMID: 24282536, PMCID: PMC3840021, DOI: 10.1371/journal.pone.0080377. Peer-Reviewed Original Research. Concepts: Reference-free compression; Disk storage capacity; Compression algorithm; Decompression cost; Data transferring; Art algorithms; Compression performance; File size; Genome compression; Compression rate; Bit rate; Algorithm; DNA compression; Biomedical researchers; Performance advantages; Genome data; Modeling techniques; Contextual model; Important concern; Research purposes; Compression; Performance; Storage capacity; Bits; Reference sequence
Natural language processing: algorithms and tools to extract computable information from EHRs and from the biomedical literature
Ohno-Machado L, Nadkarni P, Johnson K. Natural language processing: algorithms and tools to extract computable information from EHRs and from the biomedical literature. Journal Of The American Medical Informatics Association 2013, 20: 805-805. PMID: 23935077, PMCID: PMC3756279, DOI: 10.1136/amiajnl-2013-002214. Commentaries, Editorials and Letters
Biomedical CyberInfrastructure challenges
Farcas C, Balac N, Ohno-Machado L. Biomedical CyberInfrastructure challenges. 2013, 1-4. DOI: 10.1145/2484762.2484767. Peer-Reviewed Original Research
Genomes in the cloud: balancing privacy rights and the public good.
Ohno-Machado L, Farcas C, Kim J, Wang S, Jiang X. Genomes in the cloud: balancing privacy rights and the public good. AMIA Joint Summits On Translational Science Proceedings 2013, 2013: 128. PMID: 24303320. Peer-Reviewed Original Research
2012
Privacy-preserving heterogeneous health data sharing
Mohammed N, Jiang X, Chen R, Fung B, Ohno-Machado L. Privacy-preserving heterogeneous health data sharing. Journal Of The American Medical Informatics Association 2012, 20: 462-469. PMID: 23242630, PMCID: PMC3628047, DOI: 10.1136/amiajnl-2012-001027. Peer-Reviewed Original Research. Concepts: Set-valued data; Differential privacy; Noise addition; Privacy-preserving manner; Adversary's background knowledge; Strong privacy guarantees; Background knowledge; Health data sharing; Privacy model; Privacy guarantees; Sensitive data; Data sharing; Healthcare data; Private manner; Algorithm design; Privacy; Raw data; Synthetic data; Algorithm; Health data; Probabilistic way; Discriminative analysis; Experimental results; Useful information; Classification analysis
iDASH: integrating data for analysis, anonymization, and sharing
Ohno-Machado L, Bafna V, Boxwala A, Chapman B, Chapman W, Chaudhuri K, Day M, Farcas C, Heintzman N, Jiang X, Kim H, Kim J, Matheny M, Resnic F, Vinterbo S, team A. iDASH: integrating data for analysis, anonymization, and sharing. Journal Of The American Medical Informatics Association 2012, 19: 196-201. PMID: 22081224, PMCID: PMC3277627, DOI: 10.1136/amiajnl-2011-000538. Commentaries, Editorials and Letters. Concepts: High-performance computing environment; Privacy-preserving manner; Collaborative tool development; Data-sharing capabilities; Data owners; Computing environment; Data consumers; Biomedical computing; Health Insurance Portability; Technology research; Tool development; Accountability Act; Biological projects; Biological data; Insurance Portability; Anonymization; Computing; Portability; Behavioral researchers; Algorithm; Software; Cloud; New National Center; Data; Capability
2011
AnyExpress: Integrated toolkit for analysis of cross-platform gene expression data using a fast interval matching algorithm
Kim J, Patel K, Jung H, Kuo W, Ohno-Machado L. AnyExpress: Integrated toolkit for analysis of cross-platform gene expression data using a fast interval matching algorithm. BMC Bioinformatics 2011, 12: 75. PMID: 21410990, PMCID: PMC3076267, DOI: 10.1186/1471-2105-12-75. Peer-Reviewed Original Research
2008
Optimizing logistic regression coefficients for discrimination and calibration using estimation of distribution algorithms
Robles V, Bielza C, Larrañaga P, González S, Ohno-Machado L. Optimizing logistic regression coefficients for discrimination and calibration using estimation of distribution algorithms. TOP 2008, 16: 345. DOI: 10.1007/s11750-008-0054-3. Peer-Reviewed Original Research
2007
Effects of SVM parameter optimization on discrimination and calibration for post-procedural PCI mortality
Matheny M, Resnic F, Arora N, Ohno-Machado L. Effects of SVM parameter optimization on discrimination and calibration for post-procedural PCI mortality. Journal Of Biomedical Informatics 2007, 40: 688-697. PMID: 17600771, PMCID: PMC2170520, DOI: 10.1016/j.jbi.2007.05.008. Peer-Reviewed Original Research. Concepts: Support vector machine; Radial Basis Kernel Support Vector Machine; Kernel support vector machine; Cross-entropy error; SVM parameter optimization; Unseen test data; SVM kernel types; Training data; Vector machine; Evolutionary algorithm; Grid search; Mean squared error; Kernel type; Machine; Optimization method; Prediction model; Number of methods; Parameter optimization; Test data; Medical applications; Optimization parameters; Mortality prediction model; Algorithm; Best model; Applications
2006
Approximation properties of haplotype tagging
Vinterbo S, Dreiseitl S, Ohno-Machado L. Approximation properties of haplotype tagging. BMC Bioinformatics 2006, 7: 8. PMID: 16401341, PMCID: PMC1395335, DOI: 10.1186/1471-2105-7-8. Peer-Reviewed Original Research. Concepts: Approximation properties; Combinatorial optimization problems; Optimization problem; Implementable algorithm; Computational effort; Solution quality; Terms of complexity; Simple algorithm; Size m.; Population members; Single processor machine; Algorithm; Problem; Asymptotics; Approximation; Processor machine; Haplotype tagging; NPs; Unique identification
2005
Representation in stochastic search for phylogenetic tree reconstruction
Weber G, Ohno-Machado L, Shieber S. Representation in stochastic search for phylogenetic tree reconstruction. Journal Of Biomedical Informatics 2005, 39: 43-50. PMID: 16359929, DOI: 10.1016/j.jbi.2005.11.001. Peer-Reviewed Original Research