Gur Yaari
Associate Professor of Pathology
About
Titles
Associate Professor of Pathology
Biography
Gur Yaari is an Associate Professor in the Department of Pathology, Yale School of Medicine. He received his B.Sc. in physics and mathematics, M.Sc. in high-energy physics, and Ph.D. in interdisciplinary physics, all from the Hebrew University of Jerusalem (HUJI). He was a postdoctoral fellow at Yale University, and served as an assistant, associate, and full professor at Bar-Ilan University from 2013 to 2025. His current research centers on developing computational and statistical tools to process and analyze high-throughput biological data, with particular emphasis on the adaptive immune system.
Appointments
Pathology
Associate Professor on Term (Primary)
Research
Overview
Our lab develops computational and statistical tools to process and analyze high-throughput biological data. The research is multidisciplinary, drawing on mathematics, statistics, physics, computer science, biology, and medicine. Our main focus is studying the adaptive immune system from a systems-level (repertoire) perspective. In particular, we are interested in understanding lymphocyte (T and B cell) repertoire dynamics in healthy individuals as well as in conditions such as infection, autoimmune disease, aging, and cancer. We apply advanced molecular biology methods to produce large sequencing data sets of human lymphocyte receptors and analyze them with dedicated computational pipelines to obtain meaningful biological insights into the adaptive immune system.
ORCID
0000-0001-9311-9884
Research at a Glance
Yale Co-Authors
Steven Kleinstein, PhD
Ayelet Peres
David A. Hafler, MD, FANA, MSc
Anita Huttner, MD
Hailong Meng, PhD
Kevin C O'Connor, PhD
Publications
Featured Publications
The current landscape of adaptive immune receptor genomic and repertoire data: OGRDB and VDJbase
Lees W, Peres A, Klein V, Amos N, Jana U, Engelbrecht E, Vanwinkle Z, Malach Y, Konstantinovsky T, Polak P, Watson C, Yaari G. The current landscape of adaptive immune receptor genomic and repertoire data: OGRDB and VDJbase. Nucleic Acids Research 2025, 54: d932-d937. PMID: 41206474, PMCID: PMC12807691, DOI: 10.1093/nar/gkaf1094.
Peer-Reviewed Original Research
Concepts: T-cell receptor loci, High-throughput sequencing, Adaptive immune receptor repertoire, Adaptive immune receptors, Precision medicine applications, Immune receptor repertoires, Reference sequence, Allelic variation, Receptor locus, Immune receptors, Repertoire data, Antibody engineering, Receptor diversity, Sequence, Disease biomarkers, Evidence-based analysis, Receptor repertoire, Immune response, Structural complexity, Vaccine design, Repertoire, Loci, Immunological research, Medicine applications, Species
Enhancing sequence alignment of adaptive immune receptors through multi-task deep learning
Konstantinovsky T, Peres A, Eisenberg R, Polak P, Lindenbaum O, Yaari G. Enhancing sequence alignment of adaptive immune receptors through multi-task deep learning. Nucleic Acids Research 2025, 53: gkaf651. PMID: 40650972, PMCID: PMC12255302, DOI: 10.1093/nar/gkaf651.
Peer-Reviewed Original Research
Concepts: Sequence alignment, AIRR-seq, Sequence segments, Efficient local processing, Assignment accuracy, Somatic hypermutation, State-of-the-art results, Multi-task learning framework, Adaptive immune receptor repertoire sequencing, Millions of sequences, Multi-task deep learning, Adaptive immune receptor repertoire, Model’s latent space, Adaptive immune receptors, Antibody engineering, Immunoglobulin (Ig) sequences, Immune receptor repertoires, Sequence variability, Computational analysis, V(D)J recombination, Container images, Latent space, Learning framework, Deep learning, Immune receptors
IGHV allele similarity clustering improves genotype inference from adaptive immune receptor repertoire sequencing data
Peres A, Lees W, Rodriguez O, Lee N, Polak P, Hope R, Kedmi M, Collins A, Ohlin M, Kleinstein S, Watson C, Yaari G. IGHV allele similarity clustering improves genotype inference from adaptive immune receptor repertoire sequencing data. Nucleic Acids Research 2023, 51: e86-e86. PMID: 37548401, PMCID: PMC10484671, DOI: 10.1093/nar/gkad603.
Peer-Reviewed Original Research
Immune2vec: Embedding B/T Cell Receptor Sequences in ℝᴺ Using Natural Language Processing
Ostrovsky-Berman M, Frankel B, Polak P, Yaari G. Immune2vec: Embedding B/T Cell Receptor Sequences in ℝᴺ Using Natural Language Processing. Frontiers In Immunology 2021, 12: 680687. PMID: 34367141, PMCID: PMC8340020, DOI: 10.3389/fimmu.2021.680687.
Peer-Reviewed Original Research
Concepts: Sequence data, Gene family classification, Diverse repertoire, Natural language processing, Low-dimensional representation, Multi-dimensional data, N-gram, Embedding spaces, Relevant information, Vector-space, B cell receptor, Language processing, N-gram properties, Embedding technique, Exploratory data analysis, BCR sequences, Biological information, Family classification, Pathogenic patterns, Receptor sequences, Sequence, Cell receptors, Textual sequence, Immune repertoire, Repertoire of T
Mosaic deletion patterns of the human antibody heavy chain gene locus shown by Bayesian haplotyping
Gidoni M, Snir O, Peres A, Polak P, Lindeman I, Mikocziova I, Sarna V, Lundin K, Clouser C, Vigneault F, Collins A, Sollid L, Yaari G. Mosaic deletion patterns of the human antibody heavy chain gene locus shown by Bayesian haplotyping. Nature Communications 2019, 10: 628. PMID: 30733445, PMCID: PMC6367474, DOI: 10.1038/s41467-019-08489-3.
Peer-Reviewed Original Research
Concepts: Antibody repertoire sequencing data, Analysis of antibody repertoires, Copy number variations, High-throughput sequencing, Naive B cell repertoire, Heavy chain gene locus, Usage bias, Gene assignment, Genomic loci, Haplotype inference, Genetic disease predisposition, Sequence data, Gene locus, Number variations, Disease predisposition, Haplotypes, Deletion patterns, Immunoglobulin genes, Antibody repertoire, Loci, Adaptive immune responses, Genes, B cell repertoire, IGHV genes, VDJ
Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations
Yaari G, Bolen CR, Thakar J, Kleinstein SH. Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations. Nucleic Acids Research 2013, 41: e170-e170. PMID: 23921631, PMCID: PMC3794608, DOI: 10.1093/nar/gkt660.
Peer-Reviewed Original Research
Models of Somatic Hypermutation Targeting and Substitution Based on Synonymous Mutations from High-Throughput Immunoglobulin Sequencing Data
Yaari G, Vander Heiden J, Uduman M, Gadala-Maria D, Gupta N, Stern JN, O’Connor K, Hafler DA, Laserson U, Vigneault F, Kleinstein SH. Models of Somatic Hypermutation Targeting and Substitution Based on Synonymous Mutations from High-Throughput Immunoglobulin Sequencing Data. Frontiers In Immunology 2013, 4: 358. PMID: 24298272, PMCID: PMC3828525, DOI: 10.3389/fimmu.2013.00358.
Peer-Reviewed Original Research
Concepts: Accurate background model, Synonymous mutations, Non-coding regions, Particular codon usage, Non-functional sequences, Computational analysis methods, Observed mutation pattern, Existing models, Background model, Influence of selection, Codon usage, SHM targeting, Base composition, Improved model, Sequencing data, Nucleotide substitutions, Analysis method, Statistical analysis, Functional sequences, Mutation targeting, B-cell cancers, Model, Somatic hypermutation patterns, Mutations, Hypermutation patterns
2026
Revealing the inherent design principles of the genetic code via an error correcting code representation
Aharon A, Polak P, Yaari G. Revealing the inherent design principles of the genetic code via an error correcting code representation. Scientific Reports 2026. PMID: 41741536, DOI: 10.1038/s41598-026-39862-0.
Peer-Reviewed Original Research
Concepts: Error-correcting codes, Genetic code, Nucleotide substitutions, Communication systems, Mutation patterns, Amino acid properties, Communication system elements, Codon level, Code representation, Point mutations, Living species, Codon, Amino acids, Coding theory, Mutations, Biological importance, Acid mapping, Nucleotide, Communication engineering, Reverse-engineering approach, Code, Amino
Population-level genomic analysis of immunoglobulin loci variation in rhesus macaques reveals extensive germline diversity
Peres A, Upadhyay A, Klein V, Saha S, Rodriguez O, Vanwinkle Z, Karunakaran K, Metz A, Lauer W, Lin M, Melton T, Granholm L, Polak P, Peterson S, Peterson E, Raju N, Shields K, Schultze S, Ton T, Ericsen A, Lapp S, Villinger F, Ohlin M, Cottrell C, Amara R, Derdeyn C, Crotty S, Schief W, Karlsson Hedestam G, Smith M, Lees W, Watson C, Yaari G, Bosinger S. Population-level genomic analysis of immunoglobulin loci variation in rhesus macaques reveals extensive germline diversity. Immunity 2026, 59: 213-228.e6. PMID: 41494536, DOI: 10.1016/j.immuni.2025.12.002.
Peer-Reviewed Original Research
Concepts: Recombination signal sequences, Immunoglobulin (IG) loci, Genome sequence data, Antibody-encoding genes, Sequence data, Ig alleles, Germline diversity, Signal sequence, Ig loci, Locus variation, Genetic resources, Human diseases, Alleles, Antibody repertoire, Loci, Indian-origin rhesus macaques, Rhesus macaques, Ig heavy-, Diversity, Neutralizing antibody responses, Genes, Germline, Antibody response, Vaccine discovery, Sequence
2025
The gremlin in the works: why T cell receptor researchers need to pay more attention to germline reference sequences
Heather J, Peres A, Yaari G, Lees W. The gremlin in the works: why T cell receptor researchers need to pay more attention to germline reference sequences. ImmunoInformatics 2025, 20: 100058. DOI: 10.1016/j.immuno.2025.100058.
Peer-Reviewed Original Research
Academic Achievements & Community Involvement
Activities
Immunoinformatics
Journal Service: Editor-in-Chief, 2022 - Present
Get In Touch
Contacts
Locations
Academic Office
300 George Street
5th Floor, Suite 505
New Haven, CT 06511