Featured Publications
A draft human pangenome reference
Liao W, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas J, Monlong J, Abel H, Buonaiuto S, Chang X, Cheng H, Chu J, Colonna V, Eizenga J, Feng X, Fischer C, Fulton R, Garg S, Groza C, Guarracino A, Harvey W, Heumos S, Howe K, Jain M, Lu T, Markello C, Martin F, Mitchell M, Munson K, Mwaniki M, Novak A, Olsen H, Pesout T, Porubsky D, Prins P, Sibbesen J, Sirén J, Tomlinson C, Villani F, Vollger M, Antonacci-Fulton L, Baid G, Baker C, Belyaeva A, Billis K, Carroll A, Chang P, Cody S, Cook D, Cook-Deegan R, Cornejo O, Diekhans M, Ebert P, Fairley S, Fedrigo O, Felsenfeld A, Formenti G, Frankish A, Gao Y, Garrison N, Giron C, Green R, Haggerty L, Hoekzema K, Hourlier T, Ji H, Kenny E, Koenig B, Kolesnikov A, Korbel J, Kordosky J, Koren S, Lee H, Lewis A, Magalhães H, Marco-Sola S, Marijon P, McCartney A, McDaniel J, Mountcastle J, Nattestad M, Nurk S, Olson N, Popejoy A, Puiu D, Rautiainen M, Regier A, Rhie A, Sacco S, Sanders A, Schneider V, Schultz B, Shafin K, Smith M, Sofia H, Abou Tayoun A, Thibaud-Nissen F, Tricomi F, Wagner J, Walenz B, Wood J, Zimin A, Bourque G, Chaisson M, Flicek P, Phillippy A, Zook J, Eichler E, Haussler D, Wang T, Jarvis E, Miga K, Garrison E, Marschall T, Hall I, Li H, Paten B. A draft human pangenome reference. Nature 2023, 617: 312-324. PMID: 37165242, PMCID: PMC10172123, DOI: 10.1038/s41586-023-05896-x. Peer-Reviewed Original Research
The Human Pangenome Project: a global resource to map genomic diversity
Wang T, Antonacci-Fulton L, Howe K, Lawson HA, Lucas JK, Phillippy AM, Popejoy AB, Asri M, Carson C, Chaisson MJP, Chang X, Cook-Deegan R, Felsenfeld AL, Fulton RS, Garrison EP, Garrison N, Graves-Lindsay TA, Ji H, Kenny EE, Koenig BA, Li D, Marschall T, McMichael JF, Novak AM, Purushotham D, Schneider VA, Schultz BI, Smith MW, Sofia HJ, Weissman T, Flicek P, Li H, Miga KH, Paten B, Jarvis ED, Hall IM, Eichler EE, Haussler D. The Human Pangenome Project: a global resource to map genomic diversity. Nature 2022, 604: 437-446. PMID: 35444317, PMCID: PMC9402379, DOI: 10.1038/s41586-022-04601-8. Peer-Reviewed Original Research
Concepts: Human reference genome; Reference genome; Genomic diversity; Genomic variation; Human genomic variation; Global genomic diversity; Single nucleotide variants; Gene-disease associations; Diploid genome; Genetic resources; Genome; Genomic research; Future biomedical research; High-quality reference; Structural variants; Human genetics; Routine assembly; Common variants; Functional elements; Polymorphic region; Diversity; Biomedical research; Variants; Major update; Genetics
Mitochondrial genome copy number measured by DNA sequencing in human blood is strongly associated with metabolic traits via cell-type composition differences
Ganel L, Chen L, Christ R, Vangipurapu J, Young E, Das I, Kanchi K, Larson D, Regier A, Abel H, Kang CJ, Scott A, Havulinna A, Chiang CWK, Service S, Freimer N, Palotie A, Ripatti S, Kuusisto J, Boehnke M, Laakso M, Locke A, Stitziel NO, Hall IM. Mitochondrial genome copy number measured by DNA sequencing in human blood is strongly associated with metabolic traits via cell-type composition differences. Human Genomics 2021, 15: 34. PMID: 34099068, PMCID: PMC8185936, DOI: 10.1186/s40246-021-00335-2. Peer-Reviewed Original Research
MeSH Keywords: Adult; Aged; Apoptosis Regulatory Proteins; Cell Lineage; DNA Copy Number Variations; DNA, Mitochondrial; Exome Sequencing; Female; Genetic Predisposition to Disease; Genome, Mitochondrial; Genome-Wide Association Study; GTP-Binding Proteins; Humans; Male; Membrane Proteins; Mendelian Randomization Analysis; Middle Aged; Phenotype; Polymorphism, Single Nucleotide; Proto-Oncogene Proteins c-myb; Sequence Analysis, DNA
Concepts: Cell type composition; Genome copy number; Blood-derived DNA; Mitochondrial genome copy number; Combination of genomes; Copy number; Bulk DNA sequencing; DNA sequencing; Polygenic risk scores; Number of mitochondria; Exome sequencing data; Related traits; Sequencing data; Metabolic traits; Traits; Common variants; Loci; Rare variants; Sequencing; DNA; Finnish individuals; Mendelian randomization framework; UK Biobank; MetS traits; Genome
Association of structural variation with cardiometabolic traits in Finns
Chen L, Abel HJ, Das I, Larson DE, Ganel L, Kanchi KL, Regier AA, Young EP, Kang CJ, Scott AJ, Chiang C, Wang X, Lu S, Christ R, Service SK, Chiang CWK, Havulinna AS, Kuusisto J, Boehnke M, Laakso M, Palotie A, Ripatti S, Freimer NB, Locke AE, Stitziel NO, Hall IM. Association of structural variation with cardiometabolic traits in Finns. American Journal Of Human Genetics 2021, 108: 583-596. PMID: 33798444, PMCID: PMC8059371, DOI: 10.1016/j.ajhg.2021.03.008. Peer-Reviewed Original Research
MeSH Keywords: Alleles; Cardiovascular Diseases; Cholesterol; DNA Copy Number Variations; Female; Finland; Genome, Human; Genomic Structural Variation; Genotype; High-Throughput Nucleotide Sequencing; Humans; Male; Mitochondrial Proteins; Promoter Regions, Genetic; Pyruvate Dehydrogenase (Lipoamide)-Phosphatase; Pyruvic Acid; Serum Albumin, Human
Concepts: Single nucleotide variants; Copy number variants; Quantitative traits; Genome-wide significant association; Structural variations; Trait mapping studies; Deep whole-genome sequencing data; Genome structural variations; Whole-genome sequencing data; Strong phenotypic effects; Complex genomic regions; Cardiometabolic traits; Low-frequency structural variations; Evolutionary time; Genomic regions; Phenotypic effects; Sequencing data; Nucleotide variants; Genotype data; Gene deletion; Number variants; Traits; Genetic association; Candidate associations; Exome sequencing
Mapping and characterization of structural variation in 17,795 human genomes
Abel HJ, Larson DE, Regier AA, Chiang C, Das I, Kanchi KL, Layer RM, Neale BM, Salerno WJ, Reeves C, Buyske S, Matise T, Muzny D, Zody M, Lander E, Dutcher S, Stitziel N, Hall I. Mapping and characterization of structural variation in 17,795 human genomes. Nature 2020, 583: 83-89. PMID: 32460305, PMCID: PMC7547914, DOI: 10.1038/s41586-020-2371-0. Peer-Reviewed Original Research
Concepts: Structural variants; Whole-genome sequencing; Human genome; Ultra-rare structural variants; Rare structural variants; Such structural variants; Single nucleotide variants; Noncoding elements; Dosage sensitivity; Genome; Human genetics; Small insertions; Complex rearrangements; Deletion variants; Small variants; Structural variations; Genes; Sequencing; Alleles; Form of variation; Variants; Element classes; Site frequency data; Deleterious effects; Genetics
svtools: population-scale analysis of structural variation
Larson DE, Abel HJ, Chiang C, Badve A, Das I, Eldred JM, Layer RM, Hall IM. svtools: population-scale analysis of structural variation. Bioinformatics 2019, 35: 4782-4787. PMID: 31218349, PMCID: PMC6853660, DOI: 10.1093/bioinformatics/btz492. Peer-Reviewed Original Research
Genomic Analysis in the Age of Human Genome Sequencing
Lappalainen T, Scott AJ, Brandt M, Hall IM. Genomic Analysis in the Age of Human Genome Sequencing. Cell 2019, 177: 70-84. PMID: 30901550, PMCID: PMC6532068, DOI: 10.1016/j.cell.2019.02.032. Peer-Reviewed Original Research
MeSH Keywords: Biological Specimen Banks; Chromosome Mapping; Genetic Predisposition to Disease; Genetic Testing; Genetic Variation; Genome, Human; Genome-Wide Association Study; Genomics; High-Throughput Nucleotide Sequencing; Human Genome Project; Humans; Polymorphism, Single Nucleotide; Sequence Analysis, DNA; Whole Genome Sequencing
Concepts: Functional genomics approach; Allele frequency spectrum; Human genome sequencing; Gene mapping studies; Genome sequencing technologies; Rare human diseases; Whole-genome sequencing; Genomic approaches; Genetic variant discovery; Genome variation; Human genome; Genome analysis; Genomic analysis; Sequencing technologies; Genome sequencing; Variant discovery; Human diseases; Human genetics; Genome; Functional interpretation; Mapping studies; Functional effects; Sequencing; Germline variants; Genetics
Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects
Regier AA, Farjoun Y, Larson DE, Krasheninina O, Kang HM, Howrigan DP, Chen BJ, Kher M, Banks E, Ames DC, English AC, Li H, Xing J, Zhang Y, Matise T, Abecasis GR, Salerno W, Zody MC, Neale BM, Hall IM. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nature Communications 2018, 9: 4038. PMID: 30279509, PMCID: PMC6168605, DOI: 10.1038/s41467-018-06159-4. Peer-Reviewed Original Research
The impact of structural variation on human gene expression
Chiang C, Scott AJ, Davis JR, Tsang EK, Li X, Kim Y, Hadzic T, Damani FN, Ganel L, Montgomery S, Battle A, Conrad D, Hall I. The impact of structural variation on human gene expression. Nature Genetics 2017, 49: 692-699. PMID: 28369037, PMCID: PMC5406250, DOI: 10.1038/ng.3834. Peer-Reviewed Original Research
SVScore: an impact prediction tool for structural variation
Ganel L, Abel HJ, , Hall IM. SVScore: an impact prediction tool for structural variation. Bioinformatics 2017, 33: 1083-1085. PMID: 28031184, PMCID: PMC5408916, DOI: 10.1093/bioinformatics/btw789. Peer-Reviewed Original Research
The Complete Genome Sequences, Unique Mutational Spectra, and Developmental Potency of Adult Neurons Revealed by Cloning
Hazen JL, Faust GG, Rodriguez AR, Ferguson WC, Shumilina S, Clark RA, Boland MJ, Martin G, Chubukov P, Tsunemoto RK, Torkamani A, Kupriyanov S, Hall IM, Baldwin KK. The Complete Genome Sequences, Unique Mutational Spectra, and Developmental Potency of Adult Neurons Revealed by Cloning. Neuron 2016, 89: 1223-1236. PMID: 26948891, PMCID: PMC4795965, DOI: 10.1016/j.neuron.2016.02.004. Peer-Reviewed Original Research
MeSH Keywords: Age Factors; Animals; Animals, Newborn; Cadherin Related Proteins; Cadherins; Cell Division; Cloning, Molecular; DNA Transposable Elements; Embryo, Mammalian; Female; Humans; Ki-67 Antigen; Mice; Mice, Transgenic; Microsatellite Repeats; Mutation; Nerve Tissue Proteins; Neurons; Nuclear Transfer Techniques; Olfactory Bulb; Oocytes; Sequence Analysis, DNA
Concepts: Cell type diversification; Complete genome sequence; Mobile element insertions; Nuclear transfer method; Whole-genome sequencing; Neuronal genome; Gene-disrupting mutations; Neuronal mutations; Genome sequence; Unique mutational spectrum; Developmental potency; Comprehensive mutation detection; Element insertions; Genomic mutations; Recurrent rearrangements; Novel mechanism; Unique mutations; Mutations; Somatic mutations; Gene bias; Genome; Adult neurons; Mutational spectrum; Fertile mice; Mutation detection
SpeedSeq: ultra-fast personal genome analysis and interpretation
Chiang C, Layer RM, Faust GG, Lindberg MR, Rose DB, Garrison EP, Marth GT, Quinlan AR, Hall IM. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nature Methods 2015, 12: 966-968. PMID: 26258291, PMCID: PMC4589466, DOI: 10.1038/nmeth.3505. Peer-Reviewed Original Research
LUMPY: a probabilistic framework for structural variant discovery
Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for structural variant discovery. Genome Biology 2014, 15: r84. PMID: 24970577, PMCID: PMC4197822, DOI: 10.1186/gb-2014-15-6-r84. Peer-Reviewed Original Research
Mosaic Copy Number Variation in Human Neurons
McConnell MJ, Lindberg MR, Brennand KJ, Piper JC, Voet T, Cowing-Zitron C, Shumilina S, Lasken RS, Vermeesch JR, Hall IM, Gage FH. Mosaic Copy Number Variation in Human Neurons. Science 2013, 342: 632-637. PMID: 24179226, PMCID: PMC3975283, DOI: 10.1126/science.1243472. Peer-Reviewed Original Research
Concepts: Copy number variations; HiPSC-derived neurons; Single-cell genomic approaches; Number variations; DNA copy number variations; Single-cell sequencing; Human neurons; Large copy number variations; Stem cell lines; Neural progenitor cells; Novo copy-number variations; Pluripotent stem cell line; Aneuploid neurons; Genomic approaches; De novo copy-number variations; Subchromosomal copy number variations; Aberrant genomes; Frontal cortex neurons; Large deletions; Progenitor cells; Cell lines; Subset of neurons; Euploid neurons; Deletion; Multiple alterations
Breakpoint profiling of 64 cancer genomes reveals numerous complex rearrangements spawned by homology-independent mechanisms
Malhotra A, Lindberg M, Faust GG, Leibowitz ML, Clark RA, Layer RM, Quinlan AR, Hall IM. Breakpoint profiling of 64 cancer genomes reveals numerous complex rearrangements spawned by homology-independent mechanisms. Genome Research 2013, 23: 762-776. PMID: 23410887, PMCID: PMC3638133, DOI: 10.1101/gr.143677.112. Peer-Reviewed Original Research
Concepts: Complex genomic rearrangements; Single mutational event; Cancer genomes; Mutational events; Breakpoint cluster; DNA double-strand breaks; Homology-independent mechanisms; Complex rearrangements; Double-strand breaks; Large-scale rearrangements; Genome architecture; Genome rearrangements; Nonhomologous repair; Genomic rearrangements; Chromothripsis events; Selective advantage; More chromosomes; Tumor genomes; Genome; Glioblastoma samples; Templated insertions; State profiling; Punctuated change; Breakpoint sequences; Allele frequencies
Characterizing complex structural variation in germline and somatic genomes
Quinlan AR, Hall IM. Characterizing complex structural variation in germline and somatic genomes. Trends In Genetics 2011, 28: 43-53. PMID: 22094265, PMCID: PMC3249479, DOI: 10.1016/j.tig.2011.10.002. Peer-Reviewed Original Research
Concepts: Complex structural variations; Structural variations; Next-generation DNA sequencing; Hallmarks of cancer; Somatic genome; Genetic diversity; Multiple chromosomes; Single locus; Distinct loci; Recombination events; Complex variants; Single mutation; Mapping experiments; DNA sequencing; Complicated rearrangements; Mammals; Current knowledge; Mapping studies; Loci; Subtle alterations; Variants; Genome; Surprising number; Chromosomes; Germline
Genome Sequencing of Mouse Induced Pluripotent Stem Cells Reveals Retroelement Stability and Infrequent DNA Rearrangement during Reprogramming
Quinlan AR, Boland MJ, Leibowitz ML, Shumilina S, Pehrson SM, Baldwin KK, Hall IM. Genome Sequencing of Mouse Induced Pluripotent Stem Cells Reveals Retroelement Stability and Infrequent DNA Rearrangement during Reprogramming. Cell Stem Cell 2011, 9: 366-373. PMID: 21982236, PMCID: PMC3975295, DOI: 10.1016/j.stem.2011.07.018. Peer-Reviewed Original Research
MeSH Keywords: Animals; Base Sequence; Cell Lineage; Cellular Reprogramming; Chimera; DNA Copy Number Variations; False Negative Reactions; Gene Rearrangement; Gene Silencing; Genome; Genomic Instability; Humans; Induced Pluripotent Stem Cells; Mice; Molecular Sequence Data; Mutagenesis, Insertional; Organ Specificity; Retroelements; Sequence Analysis, DNA
Concepts: Pluripotent stem cells; Classes of SVs; Paired-end DNA sequencing; Stem cells; Genomic structural variation; Mouse Induced Pluripotent Stem Cells; Structural variations; DNA copy number variations; Embryonic stem cells; Most iPSC lines; Mouse iPSC lines; IPSC lines; Induced pluripotent stem cells; Copy number variations; Genome stability; Gene-disrupting mutations; Recent microarray studies; DNA rearrangements; Genome sequencing; Spontaneous mutations; Microarray studies; Deleterious genetic mutations; Number variations; DNA sequencing; Complex rearrangements
2023
Gaps and complex structurally variant loci in phased genome assemblies
Porubsky D, Vollger M, Harvey W, Rozanski A, Ebert P, Hickey G, Hasenfeld P, Sanders A, Stober C, Consortium H, Korbel J, Paten B, Marschall T, Eichler E, Abel H, Antonacci-Fulton L, Asri M, Baid G, Baker C, Belyaeva A, Billis K, Bourque G, Buonaiuto S, Carroll A, Chaisson M, Chang P, Chang X, Cheng H, Chu J, Cody S, Colonna V, Cook D, Cook-Deegan R, Cornejo O, Diekhans M, Doerr D, Ebert P, Ebler J, Eichler E, Eizenga J, Fairley S, Fedrigo O, Felsenfeld A, Feng X, Fischer C, Flicek P, Formenti G, Frankish A, Fulton R, Gao Y, Garg S, Garrison E, Garrison N, Giron C, Green R, Groza C, Guarracino A, Haggerty L, Hall I, Harvey W, Haukness M, Haussler D, Heumos S, Hickey G, Hoekzema K, Hourlier T, Howe K, Jain M, Jarvis E, Ji H, Kenny E, Koenig B, Kolesnikov A, Korbel J, Kordosky J, Koren S, Lee H, Lewis A, Li H, Liao W, Lu S, Lu T, Lucas J, Magalhães H, Marco-Sola S, Marijon P, Markello C, Marschall T, Martin F, McCartney A, McDaniel J, Miga K, Mitchell M, Monlong J, Mountcastle J, Munson K, Mwaniki M, Nattestad M, Novak A, Nurk S, Olsen H, Olson N, Paten B, Pesout T, Phillippy A, Popejoy A, Porubsky D, Prins P, Puiu D, Rautiainen M, Regier A, Rhie A, Sacco S, Sanders A, Schneider V, Schultz B, Shafin K, Sibbesen J, Sirén J, Smith M, Sofia H, Tayoun A, Thibaud-Nissen F, Tomlinson C, Tricomi F, Villani F, Vollger M, Wagner J, Walenz B, Wang T, Wood J, Zimin A, Zook J. Gaps and complex structurally variant loci in phased genome assemblies. Genome Research 2023, 33: 496-510. PMID: 37164484, PMCID: PMC10234299, DOI: 10.1101/gr.277334.122. Peer-Reviewed Original Research
Concepts: Protein-coding genes; Genome assembly; Mbp of DNA; Linked-read data; Large segmental duplications; Strand-seq; Diversity panel; Inversion polymorphism; Haploid genome; Segmental duplications; Euchromatic DNA; More haplotypes; Identical repeats; Haploid assemblies; Variant loci; DNA; Haplotypes; Genes; Frequent expansion; Assembly gaps; Important target; Assembly; Human species; Human samples; MBP
2022
Semi-automated assembly of high-quality diploid human reference genomes
Jarvis E, Formenti G, Rhie A, Guarracino A, Yang C, Wood J, Tracey A, Thibaud-Nissen F, Vollger M, Porubsky D, Cheng H, Asri M, Logsdon G, Carnevali P, Chaisson M, Chin C, Cody S, Collins J, Ebert P, Escalona M, Fedrigo O, Fulton R, Fulton L, Garg S, Gerton J, Ghurye J, Granat A, Green R, Harvey W, Hasenfeld P, Hastie A, Haukness M, Jaeger E, Jain M, Kirsche M, Kolmogorov M, Korbel J, Koren S, Korlach J, Lee J, Li D, Lindsay T, Lucas J, Luo F, Marschall T, Mitchell M, McDaniel J, Nie F, Olsen H, Olson N, Pesout T, Potapova T, Puiu D, Regier A, Ruan J, Salzberg S, Sanders A, Schatz M, Schmitt A, Schneider V, Selvaraj S, Shafin K, Shumate A, Stitziel N, Stober C, Torrance J, Wagner J, Wang J, Wenger A, Xiao C, Zimin A, Zhang G, Wang T, Li H, Garrison E, Haussler D, Hall I, Zook J, Eichler E, Phillippy A, Paten B, Howe K, Miga K. Semi-automated assembly of high-quality diploid human reference genomes. Nature 2022, 611: 519-531. PMID: 36261518, PMCID: PMC9668749, DOI: 10.1038/s41586-022-05325-5. Peer-Reviewed Original Research
Concepts: Diploid genome assembly; Genome assembly; Protein-coding genes; Global genetic variation; Current human reference genome; Diploid human genome; High-quality assembly; Accurate long reads; Non-synonymous amino acid changes; Human reference genome; Amino acid changes; Most chromosomes; Reference assembly; Reference genome; Human genome; Centromeric regions; Genetic variation; High diversity; Genome sequencing; Long reads; Single nucleotide; Genome; Acid changes; Manual curation; Biological genomes
Integrating transcriptomics, metabolomics, and GWAS helps reveal molecular mechanisms for metabolite levels and disease risk
Yin X, Bose D, Kwon A, Hanks S, Jackson A, Stringham H, Welch R, Oravilahti A, Silva L, FinnGen, Locke A, Fuchsberger C, Service S, Erdos M, Bonnycastle L, Kuusisto J, Stitziel N, Hall I, Morrison J, Ripatti S, Palotie A, Freimer N, Collins F, Mohlke K, Scott L, Fauman E, Burant C, Boehnke M, Laakso M, Wen X. Integrating transcriptomics, metabolomics, and GWAS helps reveal molecular mechanisms for metabolite levels and disease risk. American Journal Of Human Genetics 2022, 109: 1727-1741. PMID: 36055244, PMCID: PMC9606383, DOI: 10.1016/j.ajhg.2022.08.007. Peer-Reviewed Original Research
Concepts: Genome-wide association studies; Molecular mechanisms; GWAS results; Disease traits; Gene expression; Metabolic pathways; Transcriptome-wide association; Same causal variants; Metabolomics results; Transcriptomic results; Molecular traits; Transcriptomic data; GTEx project; Causal variants; Glycerophospholipid metabolic pathway; Transcriptomics; Association studies; Colocalization analysis; Metabolite levels; Distinct pathways; Putative causal effect; Genetic variants; Genes; UGT1A4 expression; Genetic association