2024
Single-cell genomics and regulatory networks for 388 human brains
Emani P, Liu J, Clarke D, Jensen M, Warrell J, Gupta C, Meng R, Lee C, Xu S, Dursun C, Lou S, Chen Y, Chu Z, Galeev T, Hwang A, Li Y, Ni P, Zhou X, Bakken T, Bendl J, Bicks L, Chatterjee T, Cheng L, Cheng Y, Dai Y, Duan Z, Flaherty M, Fullard J, Gancz M, Garrido-Martín D, Gaynor-Gillett S, Grundman J, Hawken N, Henry E, Hoffman G, Huang A, Jiang Y, Jin T, Jorstad N, Kawaguchi R, Khullar S, Liu J, Liu J, Liu S, Ma S, Margolis M, Mazariegos S, Moore J, Moran J, Nguyen E, Phalke N, Pjanic M, Pratt H, Quintero D, Rajagopalan A, Riesenmy T, Shedd N, Shi M, Spector M, Terwilliger R, Travaglini K, Wamsley B, Wang G, Xia Y, Xiao S, Yang A, Zheng S, Gandal M, Lee D, Lein E, Roussos P, Sestan N, Weng Z, White K, Won H, Girgenti M, Zhang J, Wang D, Geschwind D, Gerstein M, Akbarian S, Abyzov A, Ahituv N, Arasappan D, Almagro Armenteros J, Beliveau B, Berretta S, Bharadwaj R, Bhattacharya A, Brennand K, Capauto D, Champagne F, Chatzinakos C, Chen H, Cheng L, Chess A, Chien J, Clement A, Collado-Torres L, Cooper G, Crawford G, Dai R, Daskalakis N, Davila-Velderrain J, Deep-Soboslay A, Deng C, DiPietro C, Dracheva S, Drusinsky S, Duong D, Eagles N, Edelstein J, Galani K, Girdhar K, Goes F, Greenleaf W, Guo H, Guo Q, Hadas Y, Hallmayer J, Han X, Haroutunian V, He C, Hicks S, Ho M, Ho L, Huang Y, Huuki-Myers L, Hyde T, Iatrou A, Inoue F, Jajoo A, Jiang L, Jin P, Jops C, Jourdon A, Kellis M, Kleinman J, Kleopoulos S, Kozlenkov A, Kriegstein A, Kundaje A, Kundu S, Li J, Li M, Lin X, Liu S, Liu C, Loupe J, Lu D, Ma L, Mariani J, Martinowich K, Maynard K, Myers R, Micallef C, Mikhailova T, Ming G, Mohammadi S, Monte E, Montgomery K, Mukamel E, Nairn A, Nemeroff C, Norton S, Nowakowski T, Omberg L, Page S, Park S, Patowary A, Pattni R, Pertea G, Peters M, Pinto D, Pochareddy S, Pollard K, Pollen A, Przytycki P, Purmann C, Qin Z, Qu P, Raj T, Reach S, Reimonn T, Ressler K, Ross D, Rozowsky J, Ruth M, Ruzicka W, Sanders S, Schneider J, Scuderi S, Sebra R, Seyfried N, Shao Z, Shieh A, 
Shin J, Skarica M, Snijders C, Song H, State M, Stein J, Steyert M, Subburaju S, Sudhof T, Snyder M, Tao R, Therrien K, Tsai L, Urban A, Vaccarino F, van Bakel H, Vo D, Voloudakis G, Wang T, Wang S, Wang Y, Wei Y, Weimer A, Weinberger D, Wen C, Whalen S, Willsey A, Wong W, Wu H, Wu F, Wuchty S, Wylie D, Yap C, Zeng B, Zhang P, Zhang C, Zhang B, Zhang Y, Ziffra R, Zeier Z, Zintel T. Single-cell genomics and regulatory networks for 388 human brains. Science 2024, 384: eadi5199. PMID: 38781369, PMCID: PMC11365579, DOI: 10.1126/science.adi5199. Peer-Reviewed Original Research
Concepts: Single-cell genomics; Single-cell expression quantitative trait locus; Expression quantitative trait loci; Drug targets; Quantitative trait loci; Population-level variation; Single-cell expression; Cell types; Disease-risk genes; Trait loci; Gene family; Regulatory networks; Gene expression; Cell-type; Multiomics datasets; Single-nuclei; Genome; Genes; Cellular changes; Heterogeneous tissues; Expression; Cells; Chromatin; Loci; Multiomics
Latent evolutionary signatures: a general framework for analysing music and cultural evolution
Warrell J, Salichos L, Gancz M, Gerstein M. Latent evolutionary signatures: a general framework for analysing music and cultural evolution. Journal of the Royal Society Interface 2024, 21: 20230647. PMID: 38503341, PMCID: PMC10950459, DOI: 10.1098/rsif.2023.0647. Peer-Reviewed Original Research
Concepts: Domain of music; Modeling musical style; Cultural processes of change; Chord transitions; Musical corpora; Musical pieces; Musical styles; Genre prediction; Principles of organization; Song; Music; Cultural processes; Cultural evolution; Process of change; Representation; Genre; Deep generative architecture; Pieces; Style; Harmony; Generator architecture; Evolutionary space; Latent embeddings; Latent space; Variational autoencoder
Less-is-more: selecting transcription factor binding regions informative for motif inference
Xu J, Gao J, Ni P, Gerstein M. Less-is-more: selecting transcription factor binding regions informative for motif inference. Nucleic Acids Research 2024, 52: e20-e20. PMID: 38214231, PMCID: PMC10899791, DOI: 10.1093/nar/gkad1240. Peer-Reviewed Original Research
Concepts: ChIP-seq signals; ChIP-seq; Genomic regions; Motif inference; Transcription factors; Targeting motif; Transcription factor binding regions; ChIP-seq datasets; Non-specific interactions; C-score; DNA motifs; Binding region; Motif; Transcription; TF signaling; Accurate inference; Stronger signal; Signal; DNA; Region; Target; Interaction
2020
Data Sanitization to Reduce Private Information Leakage from Functional Genomics
Gürsoy G, Emani P, Brannon CM, Jolanki OA, Harmanci A, Strattan JS, Cherry JM, Miranker AD, Gerstein M. Data Sanitization to Reduce Private Information Leakage from Functional Genomics. Cell 2020, 183: 905-917.e16. PMID: 33186529, PMCID: PMC7672785, DOI: 10.1016/j.cell.2020.09.036. Peer-Reviewed Original Research
Concepts: Functional genomics; Single-cell RNA sequencing; Accurate reference genomes; Functional genomics datasets; Functional genomics experiments; Organismal phenotypes; Gene regulation; Reference genome; Next-generation sequencing; Raw reads; Genomics experiments; RNA sequencing; Genomic datasets; Genetic variants; Genomics; Known individuals; Sequencing; Reads; Environmental samples; Genome; Illumina; Phenotype; Good statistical power; Regulation; Statistical power
Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences
Kumar S, Warrell J, Li S, McGillivray PD, Meyerson W, Salichos L, Harmanci A, Martinez-Fundichely A, Chan CWY, Nielsen MM, Lochovsky L, Zhang Y, Li X, Lou S, Pedersen JS, Herrmann C, Getz G, Khurana E, Gerstein MB. Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences. Cell 2020, 180: 915-927.e16. PMID: 32084333, PMCID: PMC7210002, DOI: 10.1016/j.cell.2020.01.032. Peer-Reviewed Original Research
2016
The real cost of sequencing: scaling computation to keep pace with data generation
Muir P, Li S, Lou S, Wang D, Spakowicz DJ, Salichos L, Zhang J, Weinstock GM, Isaacs F, Rozowsky J, Gerstein M. The real cost of sequencing: scaling computation to keep pace with data generation. Genome Biology 2016, 17: 53. PMID: 27009100, PMCID: PMC4806511, DOI: 10.1186/s13059-016-0917-0. Peer-Reviewed Original Research
2013
Integrative Annotation of Variants from 1092 Humans: Application to Cancer Genomics
Khurana E, Fu Y, Colonna V, Mu XJ, Kang HM, Lappalainen T, Sboner A, Lochovsky L, Chen J, Harmanci A, Das J, Abyzov A, Balasubramanian S, Beal K, Chakravarty D, Challis D, Chen Y, Clarke D, Clarke L, Cunningham F, Evani US, Flicek P, Fragoza R, Garrison E, Gibbs R, Gümüş ZH, Herrero J, Kitabayashi N, Kong Y, Lage K, Liluashvili V, Lipkin SM, MacArthur DG, Marth G, Muzny D, Pers TH, Ritchie GRS, Rosenfeld JA, Sisu C, Wei X, Wilson M, Xue Y, Yu F, Consortium 1, Dermitzakis ET, Yu H, Rubin MA, Tyler-Smith C, Gerstein M. Integrative Annotation of Variants from 1092 Humans: Application to Cancer Genomics. Science 2013, 342: 1235587. PMID: 24092746, PMCID: PMC3947637, DOI: 10.1126/science.1235587. Peer-Reviewed Original Research
2012
Architecture of the human regulatory network derived from ENCODE data
Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J, Alexander R, Min R, Alves P, Abyzov A, Addleman N, Bhardwaj N, Boyle AP, Cayting P, Charos A, Chen DZ, Cheng Y, Clarke D, Eastman C, Euskirchen G, Frietze S, Fu Y, Gertz J, Grubert F, Harmanci A, Jain P, Kasowski M, Lacroute P, Leng J, Lian J, Monahan H, O’Geen H, Ouyang Z, Partridge EC, Patacsil D, Pauli F, Raha D, Ramirez L, Reddy TE, Reed B, Shi M, Slifer T, Wang J, Wu L, Yang X, Yip KY, Zilberman-Schapira G, Batzoglou S, Sidow A, Farnham PJ, Myers RM, Weissman SM, Snyder M. Architecture of the human regulatory network derived from ENCODE data. Nature 2012, 489: 91-100. PMID: 22955619, PMCID: PMC4154057, DOI: 10.1038/nature11245. Peer-Reviewed Original Research
MeSH Keywords: Alleles; Cell Line; DNA; Encyclopedias as Topic; GATA1 Transcription Factor; Gene Expression Profiling; Gene Regulatory Networks; Genome, Human; Genomics; Humans; K562 Cells; Molecular Sequence Annotation; Organ Specificity; Phosphorylation; Polymorphism, Single Nucleotide; Protein Interaction Maps; Regulatory Sequences, Nucleic Acid; RNA, Untranslated; Selection, Genetic; Transcription Factors; Transcription Initiation Site
Concepts: Transcription factors; Regulatory networks; Human transcriptional regulatory network; Human regulatory network; Specific genomic locations; Transcription-related factors; State of genes; Transcriptional regulatory networks; Allele-specific activity; Personal genome sequences; Genomic location; Strong selection; Genome sequence; ENCODE data; Genomic information; Information-flow bottlenecks; Regulatory information; Connected network components; Combinatorial fashion; Influences expression; Human biology; Binding information; Network motifs; Co-association; Genes
2011
Genomics and Privacy: Implications of the New Reality of Closed Data for the Field
Greenbaum D, Sboner A, Mu XJ, Gerstein M. Genomics and Privacy: Implications of the New Reality of Closed Data for the Field. PLOS Computational Biology 2011, 7: e1002278. PMID: 22144881, PMCID: PMC3228779, DOI: 10.1371/journal.pcbi.1002278. Peer-Reviewed Original Research
Concepts: Privacy issues; Genomic privacy; Secure cloud computing environment; Cloud computing environment; Personal genomic data; Computing environment; Privacy problems; Private data; Data security; Privacy concerns; Open source; Open data; Privacy; Large datasets; Important data sets; Variant information; Closed data; Data sets; Future access; Small labs; Dataset; Genome Center; Genomic data; Computational strategy; Large scale
2010
Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project
Gerstein MB, Lu ZJ, Van Nostrand EL, Cheng C, Arshinoff BI, Liu T, Yip KY, Robilotto R, Rechtsteiner A, Ikegami K, Alves P, Chateigner A, Perry M, Morris M, Auerbach RK, Feng X, Leng J, Vielle A, Niu W, Rhrissorrakrai K, Agarwal A, Alexander RP, Barber G, Brdlik CM, Brennan J, Brouillet JJ, Carr A, Cheung MS, Clawson H, Contrino S, Dannenberg LO, Dernburg AF, Desai A, Dick L, Dosé AC, Du J, Egelhofer T, Ercan S, Euskirchen G, Ewing B, Feingold EA, Gassmann R, Good PJ, Green P, Gullier F, Gutwein M, Guyer MS, Habegger L, Han T, Henikoff JG, Henz SR, Hinrichs A, Holster H, Hyman T, Iniguez AL, Janette J, Jensen M, Kato M, Kent WJ, Kephart E, Khivansara V, Khurana E, Kim JK, Kolasinska-Zwierz P, Lai EC, Latorre I, Leahey A, Lewis S, Lloyd P, Lochovsky L, Lowdon RF, Lubling Y, Lyne R, MacCoss M, Mackowiak SD, Mangone M, McKay S, Mecenas D, Merrihew G, Miller DM, Muroyama A, Murray JI, Ooi SL, Pham H, Phippen T, Preston EA, Rajewsky N, Rätsch G, Rosenbaum H, Rozowsky J, Rutherford K, Ruzanov P, Sarov M, Sasidharan R, Sboner A, Scheid P, Segal E, Shin H, Shou C, Slack FJ, Slightam C, Smith R, Spencer WC, Stinson EO, Taing S, Takasaki T, Vafeados D, Voronina K, Wang G, Washington NL, Whittle CM, Wu B, Yan KK, Zeller G, Zha Z, Zhong M, Zhou X, Consortium M, Ahringer J, Strome S, Gunsalus KC, Micklem G, Liu XS, Reinke V, Kim SK, Hillier LW, Henikoff S, Piano F, Snyder M, Stein L, Lieb JD, Waterston RH. Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project. Science 2010, 330: 1775-1787. 
PMID: 21177976, PMCID: PMC3142569, DOI: 10.1126/science.1196914. Peer-Reviewed Original Research
MeSH Keywords: Animals; Caenorhabditis elegans; Caenorhabditis elegans Proteins; Chromatin; Chromosomes; Computational Biology; Conserved Sequence; Evolution, Molecular; Gene Expression Profiling; Gene Expression Regulation; Gene Regulatory Networks; Genes, Helminth; Genome, Helminth; Genomics; Histones; Models, Genetic; Molecular Sequence Annotation; Regulatory Sequences, Nucleic Acid; RNA, Helminth; RNA, Untranslated; Transcription Factors
Concepts: Accurate gene models; Genome-wide identification; Transcription factor-binding sites; Key model organism; Transcription factor binding; Alternative splice forms; Factor-binding sites; Chromatin composition; ModENCODE project; Chromatin organization; Histone modifications; Genome annotation; Model organisms; Nematode Caenorhabditis; Chromosomal location; Putative functions; Gene models; Transcriptome profiling; Chromosome arms; Transcription factors; Noncoding RNAs; Factor binding; Splice forms; X chromosome; Gene expression
2009
PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls
Rozowsky J, Euskirchen G, Auerbach RK, Zhang ZD, Gibson T, Bjornson R, Carriero N, Snyder M, Gerstein MB. PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nature Biotechnology 2009, 27: 66-75. PMID: 19122651, PMCID: PMC2924752, DOI: 10.1038/nbt.1518. Peer-Reviewed Original Research
2004
Genomic analysis of regulatory network dynamics reveals large topological changes
Luscombe NM, Madan Babu M, Yu H, Snyder M, Teichmann SA, Gerstein M. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 2004, 431: 308-312. PMID: 15372033, DOI: 10.1038/nature02782. Peer-Reviewed Original Research
Concepts: Transcription factors; Active transcription factor; Regulatory network dynamics; Biological networks; Higher eukaryotes; Large-scale topological changes; Genomic scale; Genomic analysis; Cell cycle; Diverse stimuli; Environmental responses; Molecular biology; Fast signal propagation; Two-tiered hierarchy; Network analysis; Global topological measures; Local motifs; Sub-network structure; Eukaryotes; Temporal progression