2024
Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph
Cheng H, Asri M, Lucas J, Koren S, Li H. Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph. Nature Methods 2024, 21: 967-970. PMID: 38730258, PMCID: PMC11214949, DOI: 10.1038/s41592-024-02269-8. Peer-Reviewed Original Research
2023
De novo reconstruction of satellite repeat units from sequence data
Zhang Y, Chu J, Cheng H, Li H. De novo reconstruction of satellite repeat units from sequence data. Genome Research 2023, 33: 1994-2001. PMID: 37918962, PMCID: PMC10760446, DOI: 10.1101/gr.278005.123. Peer-Reviewed Original Research
Concepts: Satellite repeat unit; Sequence data; Satellite repeats; Long tandem repeated sequences; Real sequencing data; Satellite DNA evolution; Tandem repeat sequences; De novo reconstruction; Repeat units; Genomic content; Genome sequence; Satellite DNA; DNA evolution; Model organisms; Genome; Complete assembly; Sequence; Repeats; Centromere; Assembly; DNA; Species; Annotation
2022
Semi-automated assembly of high-quality diploid human reference genomes
Jarvis E, Formenti G, Rhie A, Guarracino A, Yang C, Wood J, Tracey A, Thibaud-Nissen F, Vollger M, Porubsky D, Cheng H, Asri M, Logsdon G, Carnevali P, Chaisson M, Chin C, Cody S, Collins J, Ebert P, Escalona M, Fedrigo O, Fulton R, Fulton L, Garg S, Gerton J, Ghurye J, Granat A, Green R, Harvey W, Hasenfeld P, Hastie A, Haukness M, Jaeger E, Jain M, Kirsche M, Kolmogorov M, Korbel J, Koren S, Korlach J, Lee J, Li D, Lindsay T, Lucas J, Luo F, Marschall T, Mitchell M, McDaniel J, Nie F, Olsen H, Olson N, Pesout T, Potapova T, Puiu D, Regier A, Ruan J, Salzberg S, Sanders A, Schatz M, Schmitt A, Schneider V, Selvaraj S, Shafin K, Shumate A, Stitziel N, Stober C, Torrance J, Wagner J, Wang J, Wenger A, Xiao C, Zimin A, Zhang G, Wang T, Li H, Garrison E, Haussler D, Hall I, Zook J, Eichler E, Phillippy A, Paten B, Howe K, Miga K. Semi-automated assembly of high-quality diploid human reference genomes. Nature 2022, 611: 519-531. PMID: 36261518, PMCID: PMC9668749, DOI: 10.1038/s41586-022-05325-5. Peer-Reviewed Original Research
Concepts: Diploid genome assembly; Genome assembly; Protein-coding genes; Global genetic variation; Current human reference genome; Diploid human genome; High-quality assembly; Accurate long reads; Non-synonymous amino acid changes; Human reference genome; Amino acid changes; Most chromosomes; Reference assembly; Reference genome; Human genome; Centromeric regions; Genetic variation; High diversity; Genome sequencing; Long reads; Single nucleotide; Genome; Acid changes; Manual curation; Biological genomes
Metagenome assembly of high-fidelity long reads with hifiasm-meta
Feng X, Cheng H, Portik D, Li H. Metagenome assembly of high-fidelity long reads with hifiasm-meta. Nature Methods 2022, 19: 671-674. PMID: 35534630, PMCID: PMC9343089, DOI: 10.1038/s41592-022-01478-3. Peer-Reviewed Original Research
The complete sequence of a human genome
Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A, Aganezov S, Hoyt SJ, Diekhans M, Logsdon GA, Alonge M, Antonarakis SE, Borchers M, Bouffard GG, Brooks SY, Caldas GV, Chen NC, Cheng H, Chin CS, Chow W, de Lima LG, Dishuck PC, Durbin R, Dvorkina T, Fiddes IT, Formenti G, Fulton RS, Fungtammasan A, Garrison E, Grady PGS, Graves-Lindsay TA, Hall IM, Hansen NF, Hartley GA, Haukness M, Howe K, Hunkapiller MW, Jain C, Jain M, Jarvis ED, Kerpedjiev P, Kirsche M, Kolmogorov M, Korlach J, Kremitzki M, Li H, Maduro VV, Marschall T, McCartney AM, McDaniel J, Miller DE, Mullikin JC, Myers EW, Olson ND, Paten B, Peluso P, Pevzner PA, Porubsky D, Potapova T, Rogaev EI, Rosenfeld JA, Salzberg SL, Schneider VA, Sedlazeck FJ, Shafin K, Shew CJ, Shumate A, Sims Y, Smit AFA, Soto DC, Sović I, Storer JM, Streets A, Sullivan BA, Thibaud-Nissen F, Torrance J, Wagner J, Walenz BP, Wenger A, Wood JMD, Xiao C, Yan SM, Young AC, Zarate S, Surti U, McCoy RC, Dennis MY, Alexandrov IA, Gerton JL, O’Neill R, Timp W, Zook JM, Schatz MC, Eichler EE, Miga KH, Phillippy AM. The complete sequence of a human genome. Science 2022, 376: 44-53. PMID: 35357919, PMCID: PMC9186530, DOI: 10.1126/science.abj6987. Peer-Reviewed Original Research
Concepts: Human genome; Recent segmental duplications; Human reference genome; Protein coding; Segmental duplications; Gapless assembly; Heterochromatic regions; Reference genome; Gene prediction; Satellite arrays; Complete sequence; Genome; Acrocentric chromosomes; Pair sequence; Base pairs; Short arm; Functional studies; Chromosomes; Sequence; Complex region; Telomeres; Duplication; Region; Assembly; Consortium
Haplotype-resolved assembly of diploid genomes without parental data
Cheng H, Jarvis E, Fedrigo O, Koepfli K, Urban L, Gemmell N, Li H. Haplotype-resolved assembly of diploid genomes without parental data. Nature Biotechnology 2022, 40: 1332-1335. PMID: 35332338, PMCID: PMC9464699, DOI: 10.1038/s41587-022-01261-x. Peer-Reviewed Original Research
Curated variation benchmarks for challenging medically relevant autosomal genes
Wagner J, Olson N, Harris L, McDaniel J, Cheng H, Fungtammasan A, Hwang Y, Gupta R, Wenger A, Rowell W, Khan Z, Farek J, Zhu Y, Pisupati A, Mahmoud M, Xiao C, Yoo B, Sahraeian S, Miller D, Jáspez D, Lorenzo-Salazar J, Muñoz-Barrera A, Rubio-Rodríguez L, Flores C, Narzisi G, Evani U, Clarke W, Lee J, Mason C, Lincoln S, Miga K, Ebbert M, Shumate A, Li H, Chin C, Zook J, Sedlazeck F. Curated variation benchmarks for challenging medically relevant autosomal genes. Nature Biotechnology 2022, 40: 672-680. PMID: 35132260, PMCID: PMC9117392, DOI: 10.1038/s41587-021-01158-1. Peer-Reviewed Original Research
Concepts: Whole-genome assembly; Relevant genes; Autosomal genes; Long-read technologies; Single-nucleotide variations; Variant recall; Bottle Consortium; Whole genome; Single-nucleotide; Polymorphic complex; False duplications; Genes; GRCh38; GRCh37; Genome; Structural variations; Repetitive nature; Duplication; Assembly; Deletion; CRYAA; Variants; Clinical setting; CBS; Complex
2021
Real-time mapping of nanopore raw signals
Zhang H, Li H, Jain C, Cheng H, Au K, Li H, Aluru S. Real-time mapping of nanopore raw signals. Bioinformatics 2021, 37: i477-i483. PMID: 34252938, PMCID: PMC8336444, DOI: 10.1093/bioinformatics/btab264. Peer-Reviewed Original Research
Concepts: Raw signals; Base-calling procedure; Library preparation protocol; State-of-the-art; K-d tree; Seed selection strategy; Supplementary data; Green algae; Target sequence; Streaming method; Sequencing devices; Genome; Signal space; Adapter sequences; Selection strategy; Chain algorithm; Sequence; Bioinformatics; Real time; Mapping accuracy; Signal characteristics; Mapping method; Read signal; Seed; Yeast
Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm
Cheng H, Concepcion G, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods 2021, 18: 170-175. PMID: 33526886, PMCID: PMC7961889, DOI: 10.1038/s41592-020-01056-5. Peer-Reviewed Original Research
Concepts: Haplotype-resolved de novo assemblies; Assembly graph; Study of sequence variation; Haplotype-resolved assemblies; De novo assembly; Graph-based assemblers; Consensus copy; Haplotype information; Sequence reads; Hexaploid genome; Trio binning; Sequence variation; Hifiasm; Heterozygous alleles; Haplotypes; Genome; California redwood; Binning algorithm; Assembly; Hexaploid; Alleles; Copy
2020
Chromosome-scale, haplotype-resolved assembly of human genomes
Garg S, Fungtammasan A, Carroll A, Chou M, Schmitt A, Zhou X, Mac S, Peluso P, Hatas E, Ghurye J, Maguire J, Mahmoud M, Cheng H, Heller D, Zook J, Moemke T, Marschall T, Sedlazeck F, Aach J, Chin C, Church G, Li H. Chromosome-scale, haplotype-resolved assembly of human genomes. Nature Biotechnology 2020, 39: 309-312. PMID: 33288905, PMCID: PMC7954703, DOI: 10.1038/s41587-020-0711-0. Peer-Reviewed Original Research
Concepts: Haplotype-resolved assemblies; Human genome; Structural variants; Assembly of human genomes; Discovery of structural variants; Chromosome-scale phasing; Complex genetic variation; Killer cell immunoglobulin-like receptors; Chromosome-scale; Diploid assembly; Haplotype-resolved; Contig length; Genome assembly; Heterozygous sites; Transposon insertion; Haplotype variation; Genetic variation; Pedigree information; Genome; Phase assembly; Precision medicine; Human leukocyte antigen; Immunoglobulin-like receptors; Assembly; Important regions
2017
An Efficient Filtration Method Based on Variable-Length Seeds for Sequence Alignment
Guo R, Cheng H, Xu Y. An Efficient Filtration Method Based on Variable-Length Seeds for Sequence Alignment. Communications In Computer And Information Science 2017, 729: 214-223. DOI: 10.1007/978-981-10-6442-5_19. Peer-Reviewed Original Research
2015
BitMapper: an efficient all-mapper based on bit-vector computing
Cheng H, Jiang H, Yang J, Xu Y, Shang Y. BitMapper: an efficient all-mapper based on bit-vector computing. BMC Bioinformatics 2015, 16: 192. PMID: 26063651, PMCID: PMC4462005, DOI: 10.1186/s12859-015-0626-9. Peer-Reviewed Original Research
Concepts: Next-generation sequencing; Mapping next-generation sequencing; State-of-the-art all-mappers; State-of-the-art; Reference genome; Raw reads; Bit-vector algorithm; Map location; Bit-vector; GPL license; Edit distance; Verification time; Genome; Running time; Data sets; Computational challenges; Experimental results; Indels; Http://home; Verification; Sequence; Algorithm; Multiple locations