2023
The All of Us Data and Research Center: Creating a Secure, Scalable, and Sustainable Ecosystem for Biomedical Research
Mayo K, Basford M, Carroll R, Dillon M, Fullen H, Leung J, Master H, Rura S, Sulieman L, Kennedy N, Banks E, Bernick D, Gauchan A, Lichtenstein L, Mapes B, Marginean K, Nyemba S, Ramirez A, Rotundo C, Wolfe K, Xia W, Azuine R, Cronin R, Denny J, Kho A, Lunt C, Malin B, Natarajan K, Wilkins C, Xu H, Hripcsak G, Roden D, Philippakis A, Glazer D, Harris P. The All of Us Data and Research Center: Creating a Secure, Scalable, and Sustainable Ecosystem for Biomedical Research. Annual Review Of Biomedical Data Science 2023, 6: 443-464. PMID: 37561600, PMCID: PMC11157478, DOI: 10.1146/annurev-biodatasci-122120-104825.
Peer-Reviewed Original Research
A hierarchical strategy to minimize privacy risk when linking “De-identified” data in biomedical research consortia
Ohno-Machado L, Jiang X, Kuo T, Tao S, Chen L, Ram P, Zhang G, Xu H. A hierarchical strategy to minimize privacy risk when linking “De-identified” data in biomedical research consortia. Journal Of Biomedical Informatics 2023, 139: 104322. PMID: 36806328, PMCID: PMC10975485, DOI: 10.1016/j.jbi.2023.104322.
Peer-Reviewed Original Research
Concepts: Privacy of individuals; Appropriate privacy protection; Data-driven models; Privacy protection; Privacy risks; Data Coordination Center; Data hub; Data repository; Hierarchical strategy; Privacy; Biomedical discovery; Data sets; Record linkage; Data Coordinating Center; Repository; Complex strategies; Coordination center; Technology; Technique; Data; Parties; Set; Hierarchy
2021
The application of artificial intelligence and data integration in COVID-19 studies: a scoping review
Guo Y, Zhang Y, Lyu T, Prosperi M, Wang F, Xu H, Bian J. The application of artificial intelligence and data integration in COVID-19 studies: a scoping review. Journal Of The American Medical Informatics Association 2021, 28: 2050-2067. PMID: 34151987, PMCID: PMC8344463, DOI: 10.1093/jamia/ocab098.
Peer-Reviewed Original Research
Concepts: AI applications; Artificial intelligence; Data integration; Heterogeneous data; Social media data analysis; Most AI applications; Heterogeneous data sources; Media data analysis; Proteomics data analysis; AI algorithms; AI framework; Electronic health records; Heterogenous data; Biased algorithms; Health records; COVID-19 research; Data analysis; Single-source approach; Research topic; Data sources; Research area; Intelligence; Surveillance system; Different sources; Algorithm
2020
Coronavirus: indexed data speed up solutions
Ohno-Machado L, Xu H. Coronavirus: indexed data speed up solutions. Nature 2020, 584: 192-192. PMID: 32782375, DOI: 10.1038/d41586-020-02331-3.
Commentaries, Editorials and Letters
Achievability to Extract Specific Date Information for Cancer Research.
Wang L, Wampfler J, Dispenzieri A, Xu H, Yang P, Liu H. Achievability to Extract Specific Date Information for Cancer Research. AMIA Annual Symposium Proceedings 2020, 2019: 893-902. PMID: 32308886, PMCID: PMC7153063.
Peer-Reviewed Original Research
2019
Parsing clinical text using the state-of-the-art deep learning based parsers: a systematic comparison
Zhang Y, Tiryaki F, Jiang M, Xu H. Parsing clinical text using the state-of-the-art deep learning based parsers: a systematic comparison. BMC Medical Informatics And Decision Making 2019, 19: 77. PMID: 30943955, PMCID: PMC6448179, DOI: 10.1186/s12911-019-0783-2.
Peer-Reviewed Original Research
2017
CNN-based ranking for biomedical entity normalization
Li H, Chen Q, Tang B, Wang X, Xu H, Wang B, Huang D. CNN-based ranking for biomedical entity normalization. BMC Bioinformatics 2017, 18: 385. PMID: 28984180, PMCID: PMC5629610, DOI: 10.1186/s12859-017-1805-7.
Peer-Reviewed Original Research
MeSH Keywords: Algorithms; Biomedical Research; Databases as Topic; Humans; Neural Networks, Computer; Reference Standards; Semantics
Concepts: Biomedical entity normalization; Entity normalization; Semantic information; CNN architecture; Novel convolutional neural network architecture; Convolutional neural network architecture; Traditional rule-based methods; Neural network architecture; Rule-based system; Ranking method; Rule-based method; Network architecture; Biomedical entities; Benchmark datasets; Art performance; Entity mentions; Ranking problem; CNN; Normalization system; Architecture; Morphological information; Comparison results; Information; Dataset; System
Finding useful data across multiple biomedical data repositories using DataMed
Ohno-Machado L, Sansone S, Alter G, Fore I, Grethe J, Xu H, Gonzalez-Beltran A, Rocca-Serra P, Gururaj A, Bell E, Soysal E, Zong N, Kim H. Finding useful data across multiple biomedical data repositories using DataMed. Nature Genetics 2017, 49: 816-819. PMID: 28546571, PMCID: PMC6460922, DOI: 10.1038/ng.3864.
Peer-Reviewed Original Research
MeSH Keywords: Biological Ontologies; Biomedical Research; Computational Biology; Databases, Factual; Humans; Metadata; Software; Systems Integration
Concepts: Biomedical data repositories; Health big data; Data sets; Knowledge discovery; Big data; Multiple repositories; Search engines; Data index; FAIR principles; DataMed; Data repository; Service providers; Knowledge initiatives; Knowledge experts; Biomedical research community; Research community; Repository; Science landscape; Useful data; Interoperability; Metadata; Findability; Set; Engine; Data
CATTLE (CAncer treatment treasury with linked evidence): An integrated knowledge base for personalized oncology research and practice
Soysal E, Lee H, Zhang Y, Huang L, Chen X, Wei Q, Zheng W, Chang J, Cohen T, Sun J, Xu H. CATTLE (CAncer treatment treasury with linked evidence): An integrated knowledge base for personalized oncology research and practice. CPT Pharmacometrics & Systems Pharmacology 2017, 6: 188-196. PMID: 28296354, PMCID: PMC5351410, DOI: 10.1002/psp4.12174.
Peer-Reviewed Original Research
MeSH Keywords: Antineoplastic Agents; Biomedical Research; Clinical Trials as Topic; Databases, Factual; Drug Discovery; Humans; Knowledge Bases; Neoplasms; Precision Medicine
Literature-Based Discovery of Confounding in Observational Clinical Data.
Malec S, Wei P, Xu H, Bernstam E, Myneni S, Cohen T. Literature-Based Discovery of Confounding in Observational Clinical Data. AMIA Annual Symposium Proceedings 2017, 2016: 1920-1929. PMID: 28269951, PMCID: PMC5333204.
Peer-Reviewed Original Research
A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge
Cohen T, Roberts K, Gururaj A, Chen X, Pournejati S, Alter G, Hersh W, Demner-Fushman D, Ohno-Machado L, Xu H. A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge. Database 2017, 2017: bax061. PMID: 29220453, PMCID: PMC5737202, DOI: 10.1093/database/bax061.
Peer-Reviewed Original Research
2016
Leveraging syntactic and semantic graph kernels to extract pharmacokinetic drug drug interactions from biomedical literature
Zhang Y, Wu H, Xu J, Wang J, Soysal E, Li L, Xu H. Leveraging syntactic and semantic graph kernels to extract pharmacokinetic drug drug interactions from biomedical literature. BMC Systems Biology 2016, 10: 67. PMID: 27585838, PMCID: PMC5009562, DOI: 10.1186/s12918-016-0311-2.
Peer-Reviewed Original Research
MeSH Keywords: Biomedical Research; Computational Biology; Computer Graphics; Data Mining; Drug Interactions; Pharmacokinetics; Publications; Semantics
Concepts: Paths graph kernel; Graph kernels; Semantic classes; Semantic information; Biomedical literature; Shallow semantic representations; Text mining techniques; Best F-score; Automatic DDI extraction; Problem of sparseness; Dependency structure; Semantic graph; DDI detection; Knowledge bases; DDI corpus; F-score; DDI extraction; Semantic representation; Novel approach; Experimental results; Kernel; High precision; Information; Sparseness; Graph
2015
Education, collaboration, and innovation: intelligent biology and medicine in the era of big data
Ruan J, Jin V, Huang Y, Xu H, Edwards J, Chen Y, Zhao Z. Education, collaboration, and innovation: intelligent biology and medicine in the era of big data. BMC Genomics 2015, 16: s1. PMID: 26099197, PMCID: PMC4474420, DOI: 10.1186/1471-2164-16-s7-s1.
Peer-Reviewed Original Research
2014
Identifying plausible adverse drug reactions using knowledge extracted from the literature
Shang N, Xu H, Rindflesch T, Cohen T. Identifying plausible adverse drug reactions using knowledge extracted from the literature. Journal Of Biomedical Informatics 2014, 52: 293-310. PMID: 25046831, PMCID: PMC4261011, DOI: 10.1016/j.jbi.2014.07.011.
Peer-Reviewed Original Research
Concepts: Predication-based Semantic Indexing; Reflective Random Indexing; LBD methods; Natural language processing tools; Biomedical literature; Drug-adverse event associations; Language processing tools; Semantic indexing; Electronic health records; Random Indexing; Human review; Vast repository; Discovery methods; Volume of knowledge; Processing tools; Evaluation set; Health records; Data sources; Event associations; Indexing; Drug-effect relationships; Repository; Large volumes; ADR associations; Reasoning pathways
Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks
Tang B, Cao H, Wang X, Chen Q, Xu H. Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks. BioMed Research International 2014, 2014: 240403. PMID: 24729964, PMCID: PMC3963372, DOI: 10.1155/2014/240403.
Peer-Reviewed Original Research
Concepts: Biomedical Named Entity Recognition; Word representations; Named Entity Recognition (NER) task; Machine learning-based approach; Word representation features; Natural language processing; Learning-based approach; Entity recognition task; Named Entity Recognition; Cluster-based representation; JNLPBA corpus; Entity recognition; Biomedical domain; F-measure; Language processing; Representation features; Word embeddings; Recognition task; WR algorithm; Distributional representations; Task; Better performance; Algorithm; Representation; Different types
2012
Advances in systems biology: computational algorithms and applications
Huang Y, Zhao Z, Xu H, Shyr Y, Zhang B. Advances in systems biology: computational algorithms and applications. BMC Systems Biology 2012, 6: s1. PMID: 23281622, PMCID: PMC3524016, DOI: 10.1186/1752-0509-6-s3-s1.
Peer-Reviewed Original Research
2007
Using contextual and lexical features to restructure and validate the classification of biomedical concepts
Fan J, Xu H, Friedman C. Using contextual and lexical features to restructure and validate the classification of biomedical concepts. BMC Bioinformatics 2007, 8: 264. PMID: 17650333, PMCID: PMC2014782, DOI: 10.1186/1471-2105-8-264.
Peer-Reviewed Original Research
MeSH Keywords: Biomedical Research; Medical Informatics; Semantics; Software; Terminology as Topic; Unified Medical Language System
Concepts: Unified Medical Language System; String-based approaches; Mean reciprocal rank; Reciprocal rank; Natural language processing; Error rate; Contextual features; Lexical features; Integration of data; Low error rate; Reasoning system; Automatic approach; Complementary classifiers; Language processing; Classification approach; Biomedical terminologies; Classification error; Ontological concepts; Biomedical concepts; Ontological terms; Syntactic approach; Language system; Classifier; Syntactic features; Ontology
2004
Facilitating cancer research using natural language processing of pathology reports.
Xu H, Anderson K, Grann V, Friedman C. Facilitating cancer research using natural language processing of pathology reports. 2004, 107: 565-72. PMID: 15360876.
Peer-Reviewed Original Research