2024
Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph
Cheng H, Asri M, Lucas J, Koren S, Li H. Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph. Nature Methods 2024, 21: 967-970. PMID: 38730258, PMCID: PMC11214949, DOI: 10.1038/s41592-024-02269-8.
Peer-Reviewed Original Research
MeSH Keywords: Algorithms; Diploidy; Genome, Human; Genome, Plant; High-Throughput Nucleotide Sequencing; Humans; Polyploidy; Sequence Analysis, DNA; Telomere
2023
De novo reconstruction of satellite repeat units from sequence data
Zhang Y, Chu J, Cheng H, Li H. De novo reconstruction of satellite repeat units from sequence data. Genome Research 2023, 33: 1994-2001. PMID: 37918962, PMCID: PMC10760446, DOI: 10.1101/gr.278005.123.
Peer-Reviewed Original Research
MeSH Keywords: Algorithms; Animals; DNA, Satellite; Humans; Repetitive Sequences, Nucleic Acid; Sequence Analysis, DNA
Concepts: Satellite repeat unit; Sequence data; Satellite repeats; Long tandem repeated sequences; Real sequencing data; Satellite DNA evolution; Tandem repeat sequences; De novo reconstruction; Repeat units; Genomic content; Genome sequence; Satellite DNA; DNA evolution; Model organisms; Genome; Complete assembly; Sequence; Repeats; Centromere; Assembly; DNA; Species; Annotation
A draft human pangenome reference
Liao W, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas J, Monlong J, Abel H, Buonaiuto S, Chang X, Cheng H, Chu J, Colonna V, Eizenga J, Feng X, Fischer C, Fulton R, Garg S, Groza C, Guarracino A, Harvey W, Heumos S, Howe K, Jain M, Lu T, Markello C, Martin F, Mitchell M, Munson K, Mwaniki M, Novak A, Olsen H, Pesout T, Porubsky D, Prins P, Sibbesen J, Sirén J, Tomlinson C, Villani F, Vollger M, Antonacci-Fulton L, Baid G, Baker C, Belyaeva A, Billis K, Carroll A, Chang P, Cody S, Cook D, Cook-Deegan R, Cornejo O, Diekhans M, Ebert P, Fairley S, Fedrigo O, Felsenfeld A, Formenti G, Frankish A, Gao Y, Garrison N, Giron C, Green R, Haggerty L, Hoekzema K, Hourlier T, Ji H, Kenny E, Koenig B, Kolesnikov A, Korbel J, Kordosky J, Koren S, Lee H, Lewis A, Magalhães H, Marco-Sola S, Marijon P, McCartney A, McDaniel J, Mountcastle J, Nattestad M, Nurk S, Olson N, Popejoy A, Puiu D, Rautiainen M, Regier A, Rhie A, Sacco S, Sanders A, Schneider V, Schultz B, Shafin K, Smith M, Sofia H, Abou Tayoun A, Thibaud-Nissen F, Tricomi F, Wagner J, Walenz B, Wood J, Zimin A, Bourque G, Chaisson M, Flicek P, Phillippy A, Zook J, Eichler E, Haussler D, Wang T, Jarvis E, Miga K, Garrison E, Marschall T, Hall I, Li H, Paten B. A draft human pangenome reference. Nature 2023, 617: 312-324. PMID: 37165242, PMCID: PMC10172123, DOI: 10.1038/s41586-023-05896-x.
Peer-Reviewed Original Research
Gaps and complex structurally variant loci in phased genome assemblies
Porubsky D, Vollger M, Harvey W, Rozanski A, Ebert P, Hickey G, Hasenfeld P, Sanders A, Stober C, Consortium H, Korbel J, Paten B, Marschall T, Eichler E, Abel H, Antonacci-Fulton L, Asri M, Baid G, Baker C, Belyaeva A, Billis K, Bourque G, Buonaiuto S, Carroll A, Chaisson M, Chang P, Chang X, Cheng H, Chu J, Cody S, Colonna V, Cook D, Cook-Deegan R, Cornejo O, Diekhans M, Doerr D, Ebert P, Ebler J, Eichler E, Eizenga J, Fairley S, Fedrigo O, Felsenfeld A, Feng X, Fischer C, Flicek P, Formenti G, Frankish A, Fulton R, Gao Y, Garg S, Garrison E, Garrison N, Giron C, Green R, Groza C, Guarracino A, Haggerty L, Hall I, Harvey W, Haukness M, Haussler D, Heumos S, Hickey G, Hoekzema K, Hourlier T, Howe K, Jain M, Jarvis E, Ji H, Kenny E, Koenig B, Kolesnikov A, Korbel J, Kordosky J, Koren S, Lee H, Lewis A, Li H, Liao W, Lu S, Lu T, Lucas J, Magalhães H, Marco-Sola S, Marijon P, Markello C, Marschall T, Martin F, McCartney A, McDaniel J, Miga K, Mitchell M, Monlong J, Mountcastle J, Munson K, Mwaniki M, Nattestad M, Novak A, Nurk S, Olsen H, Olson N, Paten B, Pesout T, Phillippy A, Popejoy A, Porubsky D, Prins P, Puiu D, Rautiainen M, Regier A, Rhie A, Sacco S, Sanders A, Schneider V, Schultz B, Shafin K, Sibbesen J, Sirén J, Smith M, Sofia H, Tayoun A, Thibaud-Nissen F, Tomlinson C, Tricomi F, Villani F, Vollger M, Wagner J, Walenz B, Wang T, Wood J, Zimin A, Zook J. Gaps and complex structurally variant loci in phased genome assemblies. Genome Research 2023, 33: 496-510. PMID: 37164484, PMCID: PMC10234299, DOI: 10.1101/gr.277334.122.
Peer-Reviewed Original Research
MeSH Keywords: DNA, Satellite; Haplotypes; Humans; Polymorphism, Genetic; Segmental Duplications, Genomic; Sequence Analysis, DNA
Concepts: Protein-coding genes; Genome assembly; Mbp of DNA; Linked-read data; Large segmental duplications; Strand-seq; Diversity panel; Inversion polymorphism; Haploid genome; Segmental duplications; Euchromatic DNA; More haplotypes; Identical repeats; Haploid assemblies; Variant loci; DNA; Haplotypes; Genes; Frequent expansion; Assembly gaps; Important target; Assembly; Human species; Human samples; MBP
2022
Semi-automated assembly of high-quality diploid human reference genomes
Jarvis E, Formenti G, Rhie A, Guarracino A, Yang C, Wood J, Tracey A, Thibaud-Nissen F, Vollger M, Porubsky D, Cheng H, Asri M, Logsdon G, Carnevali P, Chaisson M, Chin C, Cody S, Collins J, Ebert P, Escalona M, Fedrigo O, Fulton R, Fulton L, Garg S, Gerton J, Ghurye J, Granat A, Green R, Harvey W, Hasenfeld P, Hastie A, Haukness M, Jaeger E, Jain M, Kirsche M, Kolmogorov M, Korbel J, Koren S, Korlach J, Lee J, Li D, Lindsay T, Lucas J, Luo F, Marschall T, Mitchell M, McDaniel J, Nie F, Olsen H, Olson N, Pesout T, Potapova T, Puiu D, Regier A, Ruan J, Salzberg S, Sanders A, Schatz M, Schmitt A, Schneider V, Selvaraj S, Shafin K, Shumate A, Stitziel N, Stober C, Torrance J, Wagner J, Wang J, Wenger A, Xiao C, Zimin A, Zhang G, Wang T, Li H, Garrison E, Haussler D, Hall I, Zook J, Eichler E, Phillippy A, Paten B, Howe K, Miga K. Semi-automated assembly of high-quality diploid human reference genomes. Nature 2022, 611: 519-531. PMID: 36261518, PMCID: PMC9668749, DOI: 10.1038/s41586-022-05325-5.
Peer-Reviewed Original Research
Concepts: Diploid genome assembly; Genome assembly; Protein-coding genes; Global genetic variation; Current human reference genome; Diploid human genome; High-quality assembly; Accurate long reads; Non-synonymous amino acid changes; Human reference genome; Amino acid changes; Most chromosomes; Reference assembly; Reference genome; Human genome; Centromeric regions; Genetic variation; High diversity; Genome sequencing; Long reads; Single nucleotide; Genome; Acid changes; Manual curation; Biological genomes
Metagenome assembly of high-fidelity long reads with hifiasm-meta
Feng X, Cheng H, Portik D, Li H. Metagenome assembly of high-fidelity long reads with hifiasm-meta. Nature Methods 2022, 19: 671-674. PMID: 35534630, PMCID: PMC9343089, DOI: 10.1038/s41592-022-01478-3.
Peer-Reviewed Original Research
MeSH Keywords: Genome, Bacterial; High-Throughput Nucleotide Sequencing; Metagenome; Microbiota; Sequence Analysis, DNA; Software
The complete sequence of a human genome
Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A, Aganezov S, Hoyt SJ, Diekhans M, Logsdon GA, Alonge M, Antonarakis SE, Borchers M, Bouffard GG, Brooks SY, Caldas GV, Chen NC, Cheng H, Chin CS, Chow W, de Lima LG, Dishuck PC, Durbin R, Dvorkina T, Fiddes IT, Formenti G, Fulton RS, Fungtammasan A, Garrison E, Grady PGS, Graves-Lindsay TA, Hall IM, Hansen NF, Hartley GA, Haukness M, Howe K, Hunkapiller MW, Jain C, Jain M, Jarvis ED, Kerpedjiev P, Kirsche M, Kolmogorov M, Korlach J, Kremitzki M, Li H, Maduro VV, Marschall T, McCartney AM, McDaniel J, Miller DE, Mullikin JC, Myers EW, Olson ND, Paten B, Peluso P, Pevzner PA, Porubsky D, Potapova T, Rogaev EI, Rosenfeld JA, Salzberg SL, Schneider VA, Sedlazeck FJ, Shafin K, Shew CJ, Shumate A, Sims Y, Smit AFA, Soto DC, Sović I, Storer JM, Streets A, Sullivan BA, Thibaud-Nissen F, Torrance J, Wagner J, Walenz BP, Wenger A, Wood JMD, Xiao C, Yan SM, Young AC, Zarate S, Surti U, McCoy RC, Dennis MY, Alexandrov IA, Gerton JL, O’Neill R, Timp W, Zook JM, Schatz MC, Eichler EE, Miga KH, Phillippy AM. The complete sequence of a human genome. Science 2022, 376: 44-53. PMID: 35357919, PMCID: PMC9186530, DOI: 10.1126/science.abj6987.
Peer-Reviewed Original Research
MeSH Keywords: Cell Line; Chromosomes, Artificial, Bacterial; Chromosomes, Human; Genome, Human; Human Genome Project; Humans; Reference Values; Sequence Analysis, DNA
Concepts: Human genome; Recent segmental duplications; Human reference genome; Protein coding; Segmental duplications; Gapless assembly; Heterochromatic regions; Reference genome; Gene prediction; Satellite arrays; Complete sequence; Genome; Acrocentric chromosomes; Pair sequence; Base pairs; Short arm; Functional studies; Chromosomes; Sequence; Complex region; Telomeres; Duplication; Region; Assembly; Consortium
Haplotype-resolved assembly of diploid genomes without parental data
Cheng H, Jarvis E, Fedrigo O, Koepfli K, Urban L, Gemmell N, Li H. Haplotype-resolved assembly of diploid genomes without parental data. Nature Biotechnology 2022, 40: 1332-1335. PMID: 35332338, PMCID: PMC9464699, DOI: 10.1038/s41587-022-01261-x.
Peer-Reviewed Original Research
MeSH Keywords: Diploidy; Genome; Haplotypes; High-Throughput Nucleotide Sequencing; Humans; Parents; Sequence Analysis, DNA
Curated variation benchmarks for challenging medically relevant autosomal genes
Wagner J, Olson N, Harris L, McDaniel J, Cheng H, Fungtammasan A, Hwang Y, Gupta R, Wenger A, Rowell W, Khan Z, Farek J, Zhu Y, Pisupati A, Mahmoud M, Xiao C, Yoo B, Sahraeian S, Miller D, Jáspez D, Lorenzo-Salazar J, Muñoz-Barrera A, Rubio-Rodríguez L, Flores C, Narzisi G, Evani U, Clarke W, Lee J, Mason C, Lincoln S, Miga K, Ebbert M, Shumate A, Li H, Chin C, Zook J, Sedlazeck F. Curated variation benchmarks for challenging medically relevant autosomal genes. Nature Biotechnology 2022, 40: 672-680. PMID: 35132260, PMCID: PMC9117392, DOI: 10.1038/s41587-021-01158-1.
Peer-Reviewed Original Research
Concepts: Whole-genome assembly; Relevant genes; Autosomal genes; Long-read technologies; Single-nucleotide variations; Variant recall; Bottle Consortium; Whole genome; Single-nucleotide; Polymorphic complex; False duplications; Genes; GRCh38; GRCh37; Genome; Structural variations; Repetitive nature; Duplication; Assembly; Deletion; CRYAA; Variants; Clinical setting; CBS; Complex
2021
Fast alignment and preprocessing of chromatin profiles with Chromap
Zhang H, Song L, Wang X, Cheng H, Wang C, Meyer C, Liu T, Tang M, Aluru S, Yue F, Liu X, Li H. Fast alignment and preprocessing of chromatin profiles with Chromap. Nature Communications 2021, 12: 6566. PMID: 34772935, PMCID: PMC8589834, DOI: 10.1038/s41467-021-26865-w.
Peer-Reviewed Original Research
Real-time mapping of nanopore raw signals
Zhang H, Li H, Jain C, Cheng H, Au K, Li H, Aluru S. Real-time mapping of nanopore raw signals. Bioinformatics 2021, 37: i477-i483. PMID: 34252938, PMCID: PMC8336444, DOI: 10.1093/bioinformatics/btab264.
Peer-Reviewed Original Research
MeSH Keywords: Algorithms; Genome; High-Throughput Nucleotide Sequencing; Nanopores; Sequence Analysis, DNA; Software
Concepts: Raw signals; Base-calling procedure; Library preparation protocol; State-of-the-art; K-d tree; Seed selection strategy; Supplementary data; Green algae; Target sequence; Streaming method; Sequencing devices; Genome; Signal space; Adapter sequences; Selection strategy; Chain algorithm; Sequence; Bioinformatics; Real time; Mapping accuracy; Signal characteristics; Mapping method; Read signal; Seed; Yeast
Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm
Cheng H, Concepcion G, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods 2021, 18: 170-175. PMID: 33526886, PMCID: PMC7961889, DOI: 10.1038/s41592-020-01056-5.
Peer-Reviewed Original Research
Concepts: Haplotype-resolved de novo assemblies; Assembly graph; Study of sequence variation; Haplotype-resolved assemblies; De novo assembly; Graph-based assemblers; Consensus copy; Haplotype information; Sequence reads; Hexaploid genome; Trio binning; Sequence variation; Hifiasm; Heterozygous alleles; Haplotypes; Genome; California redwood; Binning algorithm; Assembly; Hexaploid; Alleles; Copy
2017
FMtree: a fast locating algorithm of FM-indexes for genomic data
Cheng H, Wu M, Xu Y. FMtree: a fast locating algorithm of FM-indexes for genomic data. Bioinformatics 2017, 34: 416-424. PMID: 28968761, DOI: 10.1093/bioinformatics/btx596.
Peer-Reviewed Original Research
Concepts: Full-text index; Genomic data; State-of-the-art algorithms; Multiway tree; Location algorithm; Location operations; State-of-the-art; FM-index; Tree-based algorithms; Position of patterns; Memory-efficient; Long text; Data locality; Supplementary data; Occurrence position; Suffix tree; Suffix array; Short patterns; Algorithm; Bioinformatics; Experimental results; Trees; Text; Operation; Task