2021
From Tokenization to Self-Supervision: Building a High-Performance Information Extraction System for Chemical Reactions in Patents
Wang J, Ren Y, Zhang Z, Xu H, Zhang Y. From Tokenization to Self-Supervision: Building a High-Performance Information Extraction System for Chemical Reactions in Patents. Frontiers in Research Metrics and Analytics 2021, 6: 691105. PMID: 35005421, PMCID: PMC8727901, DOI: 10.3389/frma.2021.691105.
Peer-Reviewed Original Research
Event extraction, Entity recognition, Natural language processing techniques, Accurate information extraction, Information extraction system, Language processing techniques, Knowledge-based rules, Information extraction, Automatic tool, End system, Art results, Semantic roles, Language model, Self-Supervision, Free text, Chemical patents, Subtask 1, Reaction extraction, Different semantic roles, Hybrid approach, Event triggers, Processing techniques, Subtasks, Tokenization, High performance
2015
Using Ontology Fingerprints to disambiguate gene name entities in the biomedical literature
Chen G, Zhao J, Cohen T, Tao C, Sun J, Xu H, Bernstam E, Lawson A, Zeng J, Johnson A, Holla V, Bailey A, Lara-Guerra H, Litzenburger B, Meric-Bernstam F, Zheng W. Using Ontology Fingerprints to disambiguate gene name entities in the biomedical literature. Database 2015, 2015: bav034. PMID: 25858285, PMCID: PMC4390608, DOI: 10.1093/database/bav034.
Peer-Reviewed Original Research
Concepts: Biomedical literature, Ontology Fingerprints, Accurate information extraction, Ambiguous gene names, Gene names, MapReduce framework, Big data, Information extraction, Name entities, Core algorithm