2019
Evaluating and sharing global genetic ancestry in biomedical datasets
Harismendy O, Kim J, Xu X, Ohno-Machado L. Evaluating and sharing global genetic ancestry in biomedical datasets. Journal of the American Medical Informatics Association 2019, 26: 457-461. PMID: 30869786, PMCID: PMC6433181, DOI: 10.1093/jamia/ocy194.
Peer-Reviewed Original Research
Concepts: Genetic diversity measurements, Genetic ancestry, Available molecular datasets, Human genetics research, Cancer Genome Atlas (TCGA) dataset, Continental resolution, Genetic diversity, Phenotype-genotype associations, Molecular datasets, Global genetic ancestry, Ancestry information, Genetic research, Atlas dataset, Diversity measurements, Ancestry, Traits, Global scale, Diversity, Biomedical datasets, Available datasets, Data repository, Disease risk, Access dataset, Dataset, Available cohorts
2018
A Scalable Privacy-preserving Data Generation Methodology for Exploratory Analysis.
Vaidya J, Shafiq B, Asani M, Adam N, Jiang X, Ohno-Machado L. A Scalable Privacy-preserving Data Generation Methodology for Exploratory Analysis. AMIA Annual Symposium Proceedings 2018, 2017: 1695-1704. PMID: 29854240, PMCID: PMC5977652.
Peer-Reviewed Original Research
Concepts: Privacy-preserving approach, Data management system, Big data, Biomedical datasets, Classification task, Biomedical data, Context of regression, Management system, Synthetic data, Generation methodology, Essential problem, Research tasks, Additional datasets, Dataset, Task, Significant efforts, Direct access, First-order approximation, Data, Particular type, Access, Precision medicine
DataMed – an open source discovery index for finding biomedical datasets
Chen X, Gururaj A, Ozyurt B, Liu R, Soysal E, Cohen T, Tiryaki F, Li Y, Zong N, Jiang M, Rogith D, Salimi M, Kim H, Rocca-Serra P, Gonzalez-Beltran A, Farcas C, Johnson T, Margolis R, Alter G, Sansone S, Fore I, Ohno-Machado L, Grethe J, Xu H. DataMed – an open source discovery index for finding biomedical datasets. Journal of the American Medical Informatics Association 2018, 25: 300-308. PMID: 29346583, PMCID: PMC7378878, DOI: 10.1093/jamia/ocx121.
Peer-Reviewed Original Research
Concepts: Ingestion pipeline, Biomedical datasets, Search engines, Biomedical domain, Advanced natural language processing, Relevant datasets, User-entered query, Data discovery system, Unified metadata model, Data ingestion pipelines, Natural language processing, Open-source package, Retrieval engine, Terminology services, Metadata model, Metadata information, Discovery system, Data reuse, DataMed, Benchmark datasets, Biomedical data, Data index, Average precision, Language processing, Source package
2017
Information retrieval for biomedical datasets: the 2016 bioCADDIE dataset retrieval challenge
Roberts K, Gururaj A, Chen X, Pournejati S, Hersh W, Demner-Fushman D, Ohno-Machado L, Cohen T, Xu H. Information retrieval for biomedical datasets: the 2016 bioCADDIE dataset retrieval challenge. Database 2017, 2017: bax068. DOI: 10.1093/database/bax068.
Peer-Reviewed Original Research
Concepts: Biomedical datasets, Retrieval challenges, Information retrieval techniques, Advanced query processing, Biomedical data repositories, Advanced retrieval methods, Query processing, Information retrieval, Test queries, Retrieval system, Rank framework, Retrieval approach, Retrieval techniques, Data repository, Retrieval method, Top precision, Dataset, Queries, Repository, Challenges, Retrieval, Task, Learning, System, Corpus