Mapping Risk Genes for Psychiatric Disorders in Large Samples to Understand Biology
April 30, 2021 (ID 6554)
- 00:00Our next speaker is Doctor Joel Gelernter.
- 00:03Briefly, he graduated from Yale College,
- 00:05went to medical school at SUNY Downstate,
- 00:08and then did a residency in psychiatry at
- 00:11Western Psychiatric Institute and Clinic,
- 00:13followed by a fellowship in
- 00:15psychiatric genetics at NIMH.
- 00:17He is currently the Foundations
- 00:19Fund Professor of Psychiatry,
- 00:20Professor of Genetics and Neuroscience,
- 00:22and director of the Division of Human
- 00:25Genetics in Psychiatry here at Yale,
- 00:27where he studies the genetics of
- 00:30psychiatric and substance use disorders.
- 00:32Thank you Doctor Gordon, thank
- 00:33you very much and I don't have
- 00:36relevant conflicts of interest.
- 00:37So I'm going to talk about
- 00:40where we stand with genome-wide
- 00:42association studies in psychiatry.
- 00:44And it's just been an enormously
- 00:47eventful few years as large studies
- 00:50have taken hold in the field.
- 00:52This has been due to the advent
- 00:55of large biobanks available for
- 00:57research, like the UK Biobank and the Million
- 01:00Veteran Program, and to mega-analysis;
- 01:02in the forefront of those is the work from
- 01:06the Psychiatric Genomics Consortium.
- 01:08It's also the case that what's considered
- 01:10a large study is evolving rapidly,
- 01:13so studies of over a million subjects
- 01:16are not at all uncommon now.
- 01:19So we like to identify lots of
- 01:21genes because that's interesting and
- 01:23we want to see what they encode.
- 01:26But we also, once we have genome-wide data,
- 01:29want to go beyond that and see what we
- 01:32can use that data for, to learn about
- 01:35the biology that comes out of well-powered
- 01:38GWAS. We could look at pleiotropy,
- 01:40shared genetics with other disorders,
- 01:43pathways,
- 01:43mechanisms and also better understanding
- 01:45of the meaning of diagnosis.
- 01:48If we look at the current state of the
- 01:51world of genome-wide association analysis,
- 01:55this is a
- 01:57diagram that's updated regularly at the
- 02:00National Human Genome Research Institute website.
- 02:03Now there are so many significantly
- 02:06mapped genome-wide associated loci that
- 02:09you can only fit a few chromosomes
- 02:12on a single slide.
- 02:14But going back only 16 years,
- 02:17this is the first important genome-wide
- 02:20association study, which came out in
- 02:22Science, and we can focus in and look
- 02:25at the sample size, which was only 96
- 02:28cases and 50 controls, which amazingly
- 02:31still worked for a somewhat complex trait.
- 02:34For years after that,
- 02:36there wasn't a whole lot of success
- 02:38in genome-wide association analysis
- 02:40of complex traits, and all psychiatric
- 02:43disorders are complex traits,
- 02:45and there was discussion about whether
- 02:47complex traits just weren't amenable
- 02:49to genome-wide association analysis,
- 02:51whether there was some other problem, that
- 02:53they were just too heterogeneous.
- 02:55It turns out that it was more than
- 02:58anything else, a problem of power,
- 03:01and that with lower sample size
- 03:04studies, up to a few thousand cases
- 03:06for complex traits,
- 03:08you have to keep adding cases
- 03:10until you come to what appears to
- 03:13be an inflection point,
- 03:15and once you hit the inflection point,
- 03:18which is about 10,000 cases for schizophrenia,
- 03:21for every set of subjects you add,
- 03:23you discover more risk loci regularly,
- 03:26and this inflection point was simply
- 03:29much earlier on for complex traits
- 03:32outside of the realm of psychiatry.
- 03:35This was put forward by Pat
- 03:37Sullivan from PGC a few years back,
- 03:41and it's continued to hold.
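The inflection-point argument is, at bottom, a statement about statistical power. A minimal sketch, assuming a quantitative trait and a single variant that explains a hypothetical fraction of trait variance (the value 0.0001 and the function name `gwas_power` are illustrative assumptions, not figures from the talk): the 1-df association statistic is approximately noncentral chi-square with noncentrality equal to sample size times variance explained, so power at the genome-wide threshold can be computed from the normal distribution alone.

```python
from math import sqrt
from statistics import NormalDist


def gwas_power(n, variance_explained, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test.

    Assumes the test statistic is (Z + shift)^2 with Z standard normal and
    shift = sqrt(n * variance_explained), a common large-sample approximation.
    """
    std_normal = NormalDist()
    z_crit = -std_normal.inv_cdf(alpha / 2)      # critical value, about 5.45
    shift = sqrt(n * variance_explained)         # square root of the noncentrality
    # Probability that |Z + shift| exceeds the critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical small effect: the variant explains 0.01% of trait variance.
for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"N={n:>9,}  power={gwas_power(n, 0.0001):.4f}")
```

Under these assumptions power stays near zero across the small sample sizes and then climbs steeply, which is one way to read the inflection point described above.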
- 03:43Here's an example from depression work,
- 03:46where until a few years ago there
- 03:48was nothing from depression,
- 03:51and now we've gone up 278 risk loci,
- 03:54but that took a very large sample size,
- 03:58over 300,000.
- 04:01So most of the progress for
- 04:03psychiatric traits comes from massive
- 04:05meta-analysis and biobank samples.
- 04:08The UK Biobank is very well known.
- 04:11It has half a million subjects,
- 04:13self report surveys on a subset of them,
- 04:17electronic health record linkage and a
- 04:19very open design where anyone can obtain
- 04:22access to the data and do analysis
- 04:25and this has generated huge numbers
- 04:27of very important research papers.
- 04:30There's 23andMe, the direct-to-consumer
- 04:32testing company, which has
- 04:35millions of subjects and is growing.
- 04:38Despite shallow diagnostic ascertainment,
- 04:41it has still proved to be enormously
- 04:45useful for common traits. And then
- 04:49the Psychiatric Genomics Consortium.
- 04:52In addition, there are some important
- 04:55academic-center-related biobanks,
- 04:56like the one at Harvard Partners,
- 04:59the BioVU one led by Nancy Cox,
- 05:02who previously was at Yale, and soon Yale.
- 05:06The Million Veteran Program until
- 05:08recently was somewhat less well known,
- 05:10but in some ways it's the most
- 05:13useful of the biobanks.
- 05:15So we can look at some of the characteristics
- 05:17of the MVP sample. Large sample size:
- 05:20they've already recruited over 800,000.
- 05:22The original plan was to
- 05:24recruit a million participants,
- 05:26but they are now planning to
- 05:28go well beyond that.
- 05:30Most of the larger samples that
- 05:33are available for study are very
- 05:35predominantly European ancestry samples.
- 05:38The MVP has very good representation
- 05:40from non-European ancestry subjects,
- 05:43especially African American and Latinx,
- 05:45and I'm going to give an example of why
- 05:49that's important for gene discovery.
- 05:52It's mostly male.
- 05:53That's a disadvantage in that we
- 05:56can't say much about females from MVP,
- 05:59but it does increase
- 06:01the homogeneity of the sample and increase
- 06:04our power for discovery among males.
- 06:07There is EHR linkage.
- 06:08The VA EHR is one of the oldest worldwide.
- 06:12It goes back to the 90s, and so there is
- 06:16longitudinal repeated-measures data from the EHR,
- 06:19depending on the phenotype.
- 06:22And finally,
- 06:23the MVP sample is relatively old and sick.
- 06:27These are individuals who have
- 06:29used VA health services and this
- 06:32is in contrast to other large
- 06:34biobanks like the UK Biobank,
- 06:36which is relatively higher socioeconomic
- 06:39status and not quite as sick.
- 06:43So I'm going to give examples for
- 06:45gene mapping in two psychiatric areas:
- 06:48first, post-traumatic stress disorder,
- 06:51and then alcohol use disorder.
- 06:54So one of the goals of collecting the
- 06:57MVP sample was to map disorders of
- 07:00relevance to the veteran population.
- 07:03And although PTSD is not specific
- 07:05to military populations,
- 07:07it's very important in those populations.
- 07:11We have EHR data relating to
- 07:13PTSD in the MVP, and we also have PCL
- 07:17checklist data that we can use as a
- 07:21quantitative measure of PTSD symptoms.
- 07:24PCL is often used clinically, and it
- 07:27breaks down into three different
- 07:29subphenotypes: re-experiencing,
- 07:32avoidance, and hyperarousal,
- 07:34which in the original clinical
- 07:36conception of PTSD
- 07:37were thought to be important
- 07:39for diagnosis of the trait,
- 07:41even though phenomenologically
- 07:42they appear to be quite different.
- 07:45So I'm going to come back to that.
- 07:48But first I'm going to talk about PTSD
- 07:52re-experiencing, which was the subject
- 07:54of our first paper from MVP about PTSD.
- 07:57So a sample re-experiencing
- 07:59item is given below:
- 08:01"How much have you been bothered
- 08:03by repeated, disturbing,
- 08:04and unwanted memories of
- 08:07the stressful experience?"
- 08:08Here's the Manhattan plot:
- 08:10eight distinct common-variant genome-wide
- 08:13significant regions identified,
- 08:16three with very high
- 08:18statistical significance.
- 08:18And I've already described
- 08:20this as a Manhattan plot.
- 08:22This is why: in this Manhattan plot, only
- 08:25the World Trade Center is significant,
- 08:28over 5 × 10 to the minus 8.
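The significance line on a Manhattan plot is conventionally motivated as a Bonferroni correction of a 0.05 family-wise error rate for roughly one million independent common-variant tests. A small sketch of that arithmetic follows; the function names are illustrative, and the count of one million is the usual convention rather than something stated in the talk.

```python
import math


def genome_wide_threshold(family_wise_alpha=0.05, independent_tests=1_000_000):
    """Bonferroni-style per-variant p-value threshold."""
    return family_wise_alpha / independent_tests


def minus_log10(p_value):
    """Height of a p-value on the y-axis of a Manhattan plot."""
    return -math.log10(p_value)


threshold = genome_wide_threshold()
print(f"threshold = {threshold:.1e}")
print(f"line height on the plot = {minus_log10(threshold):.2f}")
```

Any peak that rises above that height is what the talk calls a genome-wide significant region.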
- 08:32So I'm going to focus in on only
- 08:35one of those results from our
- 08:38re-experiencing GWAS, and that is the
- 08:41CRHR1 region. In the upper plot,
- 08:44this is called a regional Manhattan
- 08:45plot; it's zooming in on the Manhattan
- 08:48plot to show a chromosomal region.
- 08:51We have a region of high
- 08:53statistical significance,
- 08:54but we can't pick out a risk gene,
- 08:57much less a risk locus that
- 08:59maps higher than the others.
- 09:02We have an African ancestry sample
- 09:05below that's about 20% the size
- 09:08of the European ancestry sample.
- 09:11Not large enough to show
- 09:14statistical significance.
- 09:16But European Americans have higher LD
- 09:18and a common inversion in this region.
- 09:21African Americans have lower
- 09:23linkage disequilibrium genome-wide,
- 09:25and this particular inversion in
- 09:27this region is much less common,
- 09:29and the result is that when we
- 09:32meta-analyze the data from
- 09:35European and African Americans,
- 09:37we can now localize the signal in
- 09:39this region to a specific gene
- 09:42with about two orders of magnitude,
- 09:45two log units of support.
- 09:47So it's a pretty good localization.
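For a single variant, combining the European and African ancestry results can be sketched as a fixed-effect, inverse-variance-weighted meta-analysis. This is one common approach; the talk does not say which method was used, and the effect sizes and standard errors below are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def inverse_variance_meta(estimates):
    """Fixed-effect meta-analysis.

    estimates: list of (beta, standard_error) pairs, one per ancestry group.
    Returns the pooled beta, its standard error, the z-score, and a
    two-sided p-value.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = sqrt(1.0 / total_weight)
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return beta, se, z, p


# Hypothetical per-variant summary statistics (beta, SE).
european = (0.030, 0.005)
african = (0.028, 0.011)
beta, se, z, p = inverse_variance_meta([european, african])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

Because linkage disequilibrium is lower in the African ancestry sample, neighboring variants that are indistinguishable in Europeans receive different amounts of support there, so the pooled p-values across a region can separate one gene from its neighbors.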
- 09:51So I want to illustrate two points.
- 09:53First of all,
- 09:54the gene that it localizes
- 09:56to, CRHR1, corticotropin-releasing
- 09:57hormone receptor 1,
- 10:00is a gene that participates in a
- 10:02pathway that's long been known to
- 10:04be important in stress response:
- 10:06steroid signaling, cortisol signaling.
- 10:08In fact,
- 10:08it was such a strong candidate that it
- 10:11was studied in prior candidate gene
- 10:14studies picked out based on its biology.
- 10:17The other thing I want to point
- 10:19out is the importance of having
- 10:21populations other than European
- 10:23ancestry populations because absent the
- 10:26contribution from African ancestry,
- 10:27then any of the genes in this region
- 10:30could really be considered to have
- 10:32equal statistical support and
- 10:33some of them are also interesting
- 10:36biological candidates,
- 10:37and this was the cover that we
- 10:40proposed for the journal, which
- 10:42was turned down.
- 10:43But it's a guy re-experiencing
- 10:46the regional association plot.
- 10:49Our follow-up paper considered all
- 10:52three of the PTSD subphenotypes,
- 10:55avoidance, re-experiencing, and
- 10:58hyperarousal, and these were analyses
- 11:01that were done mostly by Dan Levey in my lab.
- 11:06This came out in Nature Genetics,
- 11:08I think in January.
- 11:10So this study included many more
- 11:13subjects; we're up to over 250,000,
- 11:16and we considered both case-control
- 11:18models and quantitative trait models,
- 11:21and we also looked at the three
- 11:24subphenotypes separately.
- 11:25So if we look at total PCL score,
- 11:29that's the overall quantitative
- 11:31measure used to diagnose PTSD.
- 11:33We now have 15 independent associated
- 11:36regions, with a much larger sample.
- 11:38And in this very nice graph,
- 11:40which was made by Gita Pathak
- 11:43in Renato Polimanti's lab,
- 11:45we can see the localization
- 11:47of risk loci by trait:
- 11:49re-experiencing, avoidance, and hyperarousal,
- 11:51or PCL total, or EHR-defined case control.
- 11:55You can see that we do have some
- 11:58genome-wide significant findings
- 12:00for African Americans now, that
- 12:02they tend not to localize with the
- 12:06genome-wide significant findings
- 12:08for European American subjects, and
- 12:11that the different subphenotypes have
- 12:14genomic regions, some of which coincide
- 12:17and some of which are specific
- 12:20to a specific subphenotype.
- 12:26So now we can start to get some
- 12:29interesting biology out of that.
- 12:32This graph shows correlation
- 12:34phenotypically between case control,
- 12:36total PCL, and subphenotypes, and
- 12:39below, genetic correlation, or rg.
- 12:41And what I want to point out is that
- 12:44genetic correlation between
- 12:46each of the three subphenotype groups
- 12:50taken individually is very high.
- 12:53In each case it's over 0.9.
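For reference, the genetic correlation between two traits is usually defined as the genetic covariance scaled by the genetic standard deviations of each trait. The notation below is a standard textbook form and an assumption here, not taken from the slides:

```latex
r_g(X, Y) \;=\; \frac{\sigma_g(X, Y)}{\sqrt{\sigma_g^2(X)\,\sigma_g^2(Y)}}
```

Here $\sigma_g(X, Y)$ is the genetic covariance between traits $X$ and $Y$, and $\sigma_g^2(X)$ and $\sigma_g^2(Y)$ are their additive genetic variances. A value above 0.9 means the two traits draw on nearly the same common-variant effects.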
- 12:56So the implication is that although
- 12:59they are phenomenologically different,
- 13:01the clinical insight that they
- 13:03formed part of the same diagnostic
- 13:06construct was absolutely correct,
- 13:08because genetically they
- 13:10share similar origins.
- 13:13So this allows us to use
- 13:16genome-wide genetic data to reflect
- 13:19back on diagnosis and nosology.
- 13:24So now we have a large GWAS of PTSD.
- 13:28We've discovered more risk loci.
- 13:30Genomic structural equation
- 13:33modeling also supported
- 13:36a single common factor underlying the
- 13:39three phenomenologically different PTSD
- 13:42subphenotype factors, validating the
- 13:45biological coherence of the PTSD syndrome.
- 13:48The other trait I'm going to talk
- 13:51about is alcohol use, or alcohol use
- 13:54disorder. Risk genes that map to
- 13:56alcohol-metabolizing pathways have
- 13:58been well known for decades and are
- 14:01very easy to understand biologically.
- 14:03If you have a variant that interferes with
- 14:06the metabolism of ethanol in a way that results
- 14:09in high levels of circulating acetaldehyde,
- 14:12the first metabolite on the most
- 14:14important metabolic pathway,
- 14:15which results in dysphoric properties,
- 14:18you're relatively protected from
- 14:20alcohol use disorder.
- 14:21The first PGC GWAS for alcohol use
- 14:25disorder, which came out in late 2018,
- 14:29found, despite a good sample size with
- 14:3246,000 European ancestry subjects, only
- 14:35this one risk locus in European
- 14:38ancestry subjects.
- 14:39Nevertheless, it made the cover of Nature
- 14:42Neuroscience.
- 14:43The name of this cover diagram is plot glasses.
- 14:49Our follow-up study from the
- 14:52MVP was again much larger,
- 14:54and we considered both alcohol
- 14:56use disorder per se and AUDIT-C,
- 14:59which is a quantity-frequency
- 15:01measure of alcohol use without capturing
- 15:05biological physiological dependence.
- 15:07So previously these were thought
- 15:09to index closely related traits,
- 15:11so if you drink a lot of alcohol,
- 15:13we might have assumed that the
- 15:16genetic background was similar to
- 15:18alcohol use disorder per se.
- 15:20Here's the Manhattan plot for the AUDIT-C
- 15:22quantity-frequency measure.
- 15:26These analyses were led by Hang Zhou,
- 15:28an associate research scientist
- 15:30now in my lab.
- 15:31This is the meta analysis across population
- 15:35groups for alcohol use disorder.
- 15:37I won't spend a lot of time on those.
- 15:41What I want to spend some time
- 15:43on is the genetic correlation
- 15:44patterns by linkage disequilibrium
- 15:47score regression with the AUDIT-C
- 15:49quantity-frequency measure
- 15:50on the one hand and alcohol
- 15:53use disorder on the other.
- 15:54And what we can see is that
- 15:57their patterns of genetic correlation
- 15:59are very different in important ways.
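As background on the method named here, cross-trait linkage disequilibrium score regression estimates genetic covariance from GWAS summary statistics by regressing the product of the two traits' z-scores at each variant on that variant's LD score. A sketch of the standard regression equation, in notation that is an assumption here rather than taken from the slides:

```latex
\mathbb{E}\left[z_{1j}\,z_{2j}\right]
  \;=\; \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j
  \;+\; \frac{\rho\,N_s}{\sqrt{N_1 N_2}},
\qquad
r_g \;=\; \frac{\rho_g}{\sqrt{h_1^2\,h_2^2}}
```

Here $z_{1j}$ and $z_{2j}$ are the association z-scores for variant $j$ in the two studies, $\ell_j$ is its LD score, $N_1$ and $N_2$ are the sample sizes, $M$ is the number of variants, $\rho_g$ is the genetic covariance, $N_s$ is the number of overlapping samples, $\rho$ is the phenotypic correlation among them, and $h_1^2$, $h_2^2$ are the SNP heritabilities. The slope carries the genetic covariance, while sample overlap only shifts the intercept.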
- 16:02So if we look at anthropometric
- 16:04measures, we see many significant
- 16:05negative correlations with AUDIT-C.
- 16:08So higher AUDIT-C,
- 16:11so drinking more frequently and
- 16:13in higher quantity, is associated
- 16:16with lower body mass index,
- 16:18lower obesity,
- 16:19apparent decreased risk for
- 16:20coronary artery disease.
- 16:22This is not seen for alcohol use disorder.
- 16:27But if we look at psychiatric traits, AUDIT-C
- 16:30appears to be, in a minor way,
- 16:33protective against major depression.
- 16:35But AUD is associated with
- 16:38risk for smoking, neuroticism,
- 16:40bipolar disorder,
- 16:41schizophrenia,
- 16:42down the gamut of psychiatric traits
- 16:45for which we have sufficient information.
- 16:49But once we have genome-wide
- 16:52association data,
- 16:52we can do all sorts of post-GWAS
- 16:55in silico analysis to learn more
- 16:57about the biology of the disorder.
- 17:00I've focused only on pleiotropy analysis,
- 17:02but there are a range of things you can
- 17:05do to learn about the biology of the trait
- 17:09once you have a genome-wide association
- 17:12analysis with good enough power.
- 17:14So finally,
- 17:15for gene identification studies like GWAS,
- 17:18larger samples are much better,
- 17:20provided that careful
- 17:21attention is paid to phenotype.
- 17:23The amount of alcohol consumed
- 17:25turns out to be a different genetic
- 17:28trait from alcohol dependence.
- 17:30Most biobank samples so far
- 17:32are mostly European ancestry.
- 17:34This is a problem not just in
- 17:37terms of fairness and equity,
- 17:39but in terms of science: differences
- 17:42in linkage disequilibrium
- 17:43structure by population
- 17:44are very important in helping us
- 17:47improve fine mapping and therefore
- 17:49identify what variants might
- 17:51really be associated to disease,
- 17:54and even sometimes what genes are associated.
- 17:58And finally,
- 17:59the MVP, the Million Veteran Program: comparatively
- 18:01ill, comparatively low SES, presently a
- 18:03uniquely valuable gene mapping resource.
- 18:05And with that I want to acknowledge funding
- 18:09from the VA and from NIDA and NIAAA,
- 18:13my collaborators in the PTSD study,
- 18:17and collaborators in the alcohol
- 18:19use study, and thank you very
- 18:21much for your attention.
- 18:22I'd be happy to take questions by email.