Supervised Learning for Detection of Duplicates in Genomic Sequence Databases
Chen Q, Zobel J, Zhang X, Verspoor K. Supervised Learning for Detection of Duplicates in Genomic Sequence Databases. PLOS ONE 2016, 11: e0159644. PMID: 27489953, PMCID: PMC4973881, DOI: 10.1371/journal.pone.0159644.
Peer-Reviewed Original Research
Concepts: Multi-class model; Supervised learning; Machine learning; De-duplication; Genome sequence database; Detect duplicates; Duplicate detection methods; Automatic system; Amount of data; Detection of duplicates; Sequence databases; Ablation studies; Detection context; Meta-data; Detection method; Database records; Expert curation; Biological databases; Learning; Cross-validation; Record features; Binary model; Sequence identity; Machine; Database