2024
GENCODE 2025: reference gene annotation for human and mouse
Mudge J, Carbonell-Sala S, Diekhans M, Martinez J, Hunt T, Jungreis I, Loveland J, Arnan C, Barnes I, Bennett R, Berry A, Bignell A, Cerdán-Vélez D, Cochran K, Cortés L, Davidson C, Donaldson S, Dursun C, Fatima R, Hardy M, Hebbar P, Hollis Z, James B, Jiang Y, Johnson R, Kaur G, Kay M, Mangan R, Maquedano M, Gómez L, Mathlouthi N, Merritt R, Ni P, Palumbo E, Perteghella T, Pozo F, Raj S, Sisu C, Steed E, Sumathipala D, Suner M, Uszczynska-Ratajczak B, Wass E, Yang Y, Zhang D, Finn R, Gerstein M, Guigó R, Hubbard T, Kellis M, Kundaje A, Paten B, Tress M, Birney E, Martin F, Frankish A. GENCODE 2025: reference gene annotation for human and mouse. Nucleic Acids Research 2024, gkae1078. PMID: 39565199, DOI: 10.1093/nar/gkae1078.
Peer-Reviewed Original Research
Concepts: Gene annotation; Long-read transcriptome sequencing; Multi-genome alignments; Ribo-Seq experiments; UCSC Genome Browser; State-of-the-art proteomics; Genome browser; Ribo-seq; Species genomes; Mouse genome; Transcriptome sequencing; GENCODE; Genome; Annotation workflow; Annotation; Sequence; Pangenome; Mice; Genesets; State-of-the-art; UCSC; Proteomics; Transcription; Genes; Species
Transcriptional determinism and stochasticity contribute to the complexity of autism-associated SHANK family genes
Lu X, Ni P, Suarez-Meade P, Ma Y, Forrest E, Wang G, Wang Y, Quiñones-Hinojosa A, Gerstein M, Jiang Y. Transcriptional determinism and stochasticity contribute to the complexity of autism-associated SHANK family genes. Cell Reports 2024, 43: 114376. PMID: 38900637, PMCID: PMC11328446, DOI: 10.1016/j.celrep.2024.114376.
Peer-Reviewed Original Research
Concepts: SHANK family genes; Family genes; Long-read sequencing; CDNA capture; Transcript structure; Deleterious variants; Genomic studies; Abundant mRNAs; Transcriptional dysregulation; Stochastic transcription; Studies of neuropsychiatric disorders; Causative genes; Transcriptional profiles; Transcriptional determinants; Transcriptome; Postmortem brain tissue; Autism spectrum disorder; Shank3 transcripts; Transcription; Genes; Genome; SHANK3; Neuropsychiatric disorders; Spectrum disorder; Autism model
Single-cell genomics and regulatory networks for 388 human brains
Emani P, Liu J, Clarke D, Jensen M, Warrell J, Gupta C, Meng R, Lee C, Xu S, Dursun C, Lou S, Chen Y, Chu Z, Galeev T, Hwang A, Li Y, Ni P, Zhou X, Bakken T, Bendl J, Bicks L, Chatterjee T, Cheng L, Cheng Y, Dai Y, Duan Z, Flaherty M, Fullard J, Gancz M, Garrido-Martín D, Gaynor-Gillett S, Grundman J, Hawken N, Henry E, Hoffman G, Huang A, Jiang Y, Jin T, Jorstad N, Kawaguchi R, Khullar S, Liu J, Liu J, Liu S, Ma S, Margolis M, Mazariegos S, Moore J, Moran J, Nguyen E, Phalke N, Pjanic M, Pratt H, Quintero D, Rajagopalan A, Riesenmy T, Shedd N, Shi M, Spector M, Terwilliger R, Travaglini K, Wamsley B, Wang G, Xia Y, Xiao S, Yang A, Zheng S, Gandal M, Lee D, Lein E, Roussos P, Sestan N, Weng Z, White K, Won H, Girgenti M, Zhang J, Wang D, Geschwind D, Gerstein M, Akbarian S, Abyzov A, Ahituv N, Arasappan D, Almagro Armenteros J, Beliveau B, Berretta S, Bharadwaj R, Bhattacharya A, Brennand K, Capauto D, Champagne F, Chatzinakos C, Chen H, Cheng L, Chess A, Chien J, Clement A, Collado-Torres L, Cooper G, Crawford G, Dai R, Daskalakis N, Davila-Velderrain J, Deep-Soboslay A, Deng C, DiPietro C, Dracheva S, Drusinsky S, Duong D, Eagles N, Edelstein J, Galani K, Girdhar K, Goes F, Greenleaf W, Guo H, Guo Q, Hadas Y, Hallmayer J, Han X, Haroutunian V, He C, Hicks S, Ho M, Ho L, Huang Y, Huuki-Myers L, Hyde T, Iatrou A, Inoue F, Jajoo A, Jiang L, Jin P, Jops C, Jourdon A, Kellis M, Kleinman J, Kleopoulos S, Kozlenkov A, Kriegstein A, Kundaje A, Kundu S, Li J, Li M, Lin X, Liu S, Liu C, Loupe J, Lu D, Ma L, Mariani J, Martinowich K, Maynard K, Myers R, Micallef C, Mikhailova T, Ming G, Mohammadi S, Monte E, Montgomery K, Mukamel E, Nairn A, Nemeroff C, Norton S, Nowakowski T, Omberg L, Page S, Park S, Patowary A, Pattni R, Pertea G, Peters M, Pinto D, Pochareddy S, Pollard K, Pollen A, Przytycki P, Purmann C, Qin Z, Qu P, Raj T, Reach S, Reimonn T, Ressler K, Ross D, Rozowsky J, Ruth M, Ruzicka W, Sanders S, Schneider J, Scuderi S, Sebra R, Seyfried N, Shao Z, Shieh A, 
Shin J, Skarica M, Snijders C, Song H, State M, Stein J, Steyert M, Subburaju S, Sudhof T, Snyder M, Tao R, Therrien K, Tsai L, Urban A, Vaccarino F, van Bakel H, Vo D, Voloudakis G, Wang T, Wang S, Wang Y, Wei Y, Weimer A, Weinberger D, Wen C, Whalen S, Willsey A, Wong W, Wu H, Wu F, Wuchty S, Wylie D, Yap C, Zeng B, Zhang P, Zhang C, Zhang B, Zhang Y, Ziffra R, Zeier Z, Zintel T. Single-cell genomics and regulatory networks for 388 human brains. Science 2024, 384: eadi5199. PMID: 38781369, PMCID: PMC11365579, DOI: 10.1126/science.adi5199.
Peer-Reviewed Original Research
Concepts: Single-cell genomics; Single-cell expression quantitative trait locus; Expression quantitative trait loci; Drug targets; Quantitative trait loci; Population-level variation; Single-cell expression; Cell types; Disease-risk genes; Trait loci; Gene family; Regulatory networks; Gene expression; Cell-type; Multiomics datasets; Single-nuclei; Genome; Genes; Cellular changes; Heterogeneous tissues; Expression; Cells; Chromatin; Loci; Multiomics
scENCORE: leveraging single-cell epigenetic data to predict chromatin conformation using graph embedding
Duan Z, Xu S, Srinivasan S, Hwang A, Lee C, Yue F, Gerstein M, Luan Y, Girgenti M, Zhang J. scENCORE: leveraging single-cell epigenetic data to predict chromatin conformation using graph embedding. Briefings In Bioinformatics 2024, 25: bbae096. PMID: 38493342, PMCID: PMC10944576, DOI: 10.1093/bib/bbae096.
Peer-Reviewed Original Research
Concepts: A/B compartments; Epigenetic data; Chromatin interaction frequencies; Cell type-specific manner; Chromatin conformational changes; Genome bins; Genomic regions; Chromatin conformation; Eukaryotic DNA; Chromatin compartments; Dynamic compartmentalization; Repressed state; Genetic blueprint; Transcriptional programs; Transcriptional changes; Chromatin; Conformational changes; Complex tissues; Interaction frequency; Compartment; Genome; Chromosome; Structural heterogeneity; DNA; A/B
FAVOR-GPT: a generative natural language interface to whole genome variant functional annotations
Li T, Zhou H, Verma V, Tang X, Shao Y, Van Buren E, Weng Z, Gerstein M, Neale B, Sunyaev S, Lin X. FAVOR-GPT: a generative natural language interface to whole genome variant functional annotations. Bioinformatics Advances 2024, 4: vbae143. PMID: 39387060, PMCID: PMC11461909, DOI: 10.1093/bioadv/vbae143.
Peer-Reviewed Original Research
Concepts: Variant functional annotation; Functional annotation; Natural language interface; Functional annotation data; Disease-associated variants; Language interface; Whole genome; Functional prioritization; Genome; User prompts; Retrieval framework; Language model; Raw annotations; Annotated data; Annotation; Users; Retrieval; Online resources; Chatbot; Information interpretation; Usability; Variants; Database
2020
Data Sanitization to Reduce Private Information Leakage from Functional Genomics
Gürsoy G, Emani P, Brannon CM, Jolanki OA, Harmanci A, Strattan JS, Cherry JM, Miranker AD, Gerstein M. Data Sanitization to Reduce Private Information Leakage from Functional Genomics. Cell 2020, 183: 905-917.e16. PMID: 33186529, PMCID: PMC7672785, DOI: 10.1016/j.cell.2020.09.036.
Peer-Reviewed Original Research
Concepts: Functional genomics; Single-cell RNA sequencing; Accurate reference genomes; Functional genomics datasets; Functional genomics experiments; Organismal phenotypes; Gene regulation; Reference genome; Next-generation sequencing; Raw reads; Genomics experiments; RNA sequencing; Genomic datasets; Genetic variants; Genomics; Known individuals; Sequencing; Reads; Environmental samples; Genome; Illumina; Phenotype; Good statistical power; Regulation; Statistical power
2014
Comparative analysis of pseudogenes across three phyla
Sisu C, Pei B, Leng J, Frankish A, Zhang Y, Balasubramanian S, Harte R, Wang D, Rutenberg-Schoenberg M, Clark W, Diekhans M, Rozowsky J, Hubbard T, Harrow J, Gerstein MB. Comparative analysis of pseudogenes across three phyla. Proceedings Of The National Academy Of Sciences Of The United States Of America 2014, 111: 13361-13366. PMID: 25157146, PMCID: PMC4169933, DOI: 10.1073/pnas.1407293111.
Peer-Reviewed Original Research
Concepts: Genome evolution; Large effective population sizes; Numerous duplication events; Protein-coding genes; Number of pseudogenes; Effective population size; Potential regulatory role; Duplication events; Selective sweeps; Gene family; Human pseudogenes; Primate lineage; Different remodeling processes; Pseudogenes; Upstream sequences; High deletion rate; Promoter activity; Biochemical activity; Regulatory role; Population size; Partial activity; Genome; Phyla; Deletion rate; Lineages
2010
Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks
Yan KK, Fang G, Bhardwaj N, Alexander RP, Gerstein M. Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks. Proceedings Of The National Academy Of Sciences Of The United States Of America 2010, 107: 9186-9191. PMID: 20439753, PMCID: PMC2889091, DOI: 10.1073/pnas.0914771107.
Peer-Reviewed Original Research
Concepts: Transcriptional regulatory networks; Regulatory networks; Cellular design principles; Call graph; Evolutionary rates; Global regulator; Operating system; Random mutations; Software systems; Living organism; Biological evolution; Rapid evolution; Subsequent selection; Functional modules; Computer operating systems; Regulator; Network hubs; Biological systems; Design principles; Control network; Generic components; Hierarchical layout; Genome; Evolution; Terms of topology