Using Large-Scale Clinical Data for Discovery In Multiple Sclerosis And Epilepsy
April 30, 2021 (ID 6551)
- 00:00It's my pleasure to now introduce our
- 00:03next speaker, Dr. Chris Cotsapas.
- 00:05Dr. Cotsapas
- 00:07graduated from the Imperial College
- 00:09London and earned his PhD at the
- 00:12University of New South Wales in Sydney.
- 00:15After postdoctoral training at
- 00:16the Broad Institute and MGH,
- 00:18where he undertook some of the
- 00:21first genome-wide association
- 00:22studies in autoimmune disease,
- 00:24he joined Yale in 2010.
- 00:26His laboratory uses genetics,
- 00:28genomics and epidemiological
- 00:29approaches to identify the
- 00:31biology underlying autoimmune
- 00:32and neurological diseases.
- 00:33The floor is yours,
- 00:35Dr. Cotsapas.
- 00:39Thank you Nicole.
- 00:40I'm afraid I can't start my video
- 00:42so if the tech people would like
- 00:44to start that, but it's fine.
- 00:48There we go. Hello everyone,
- 00:49I also have no disclosures and I
- 00:51would like to take the next few
- 00:53minutes to tell you a little bit
- 00:55about some of the things we've
- 00:57been thinking about in my lab.
- 01:02Specifically, how to use large-scale
- 01:04data, both clinical and
- 01:07genetic, to make discoveries in
- 01:10diseases that we're interested in,
- 01:12specifically multiple sclerosis and epilepsy.
- 01:16And I chose these two projects
- 01:18to talk about because they are
- 01:20quite early on in their inception,
- 01:22so we're still not 100% sure
- 01:25what the story is,
- 01:26but I think it is instructive to
- 01:29look at what we can do with data.
- 01:33So the first story is multiple sclerosis.
- 01:36This is a large-scale project
- 01:39out of the EU,
- 01:41primarily funded by one of
- 01:44the EU Horizon programs.
- 01:45It's led by colleagues at the
- 01:48Karolinska Institute in Sweden,
- 01:50and it covers 10 countries,
- 01:52including a site at Yale and a site at UCSF,
- 01:56and all the other partners are
- 02:00in Europe.
- 02:03Our main focus is on multiple sclerosis,
- 02:08which is a predominantly autoimmune
- 02:10disease of the brain where the immune
- 02:14system basically decides that it does not
- 02:17like the myelin sheath around white matter
- 02:21neurons. You get
- 02:23infiltration of immune cells,
- 02:25stereotypically T cells, that
- 02:27cause myelin stripping around
- 02:30blood vessels in the brain, and
- 02:32you get these large lesions in
- 02:35the brain and progressively get
- 02:37more and more lesions over time.
- 02:39And that leads to a usually relapsing-remitting
- 02:42mode of disease,
- 02:43where there is both physical and
- 02:45cognitive decline and eventually
- 02:47this becomes permanent and
- 02:49patients experience ongoing and
- 02:51progressive disability.
- 02:52It is a lifelong disease.
- 02:54There are disease modifying therapies,
- 02:56but there exists no cure and
- 02:58it is one of the more common
- 03:02neurological diseases out there.
- 03:04And like most such diseases,
- 03:06you can find some families where the
- 03:09disease appears to run in the families,
- 03:12but most cases are sporadic.
- 03:14It looks like familial MS
- 03:16is not a single-gene form of MS.
- 03:19It is exactly like the sporadic
- 03:21form: it is polygenic.
- 03:23It is extremely complex.
- 03:24We have at least 200 loci mapped from large-scale
- 03:28genome-wide association studies.
- 03:31We estimate there's probably another
- 03:33800 to 1000 out there in the genome,
- 03:36and a large effort across the
- 03:39world has now been initiated to try
- 03:42and figure out what those genes do.
- 03:45But also to see how we can use
- 03:47some of this information. One
- 03:50of the problems has been that
- 03:52this disease is quite common,
- 03:54but it's not type 2 diabetes common.
- 03:57It's about one in 1,000
- 03:59in European populations,
- 04:00so it's fairly common,
- 04:01but no one really has a cohort of
- 04:0420 or 30,000 patients who have all
- 04:07been seen for a very long time in one
- 04:10clinic where data have been collected
- 04:13in the same way,
- 04:15by the same people.
- 04:16And so what you have to do is unite
- 04:19data across many centers,
- 04:22often with differing practices,
- 04:24with differing EHRs,
- 04:25or before that just paper records,
- 04:29and try and put these data
- 04:32together in some meaningful way
- 04:36so you can make large-scale inferences,
- 04:38and this goes back to what Ira
- 04:41said initially about how even as
- 04:43a biobank we need to be one of
- 04:45a network of biobanks, and this is
- 04:48very much what we've been trying
- 04:50to do in a disease-focused way,
- 04:52and so this project has been
- 04:54aiming to do exactly that,
- 04:56and then from these large-scale data,
- 04:58try and see if there are subsets
- 05:00of patients who seem to respond
- 05:02differently to therapy, who seem to have
- 05:05different outcomes that are predictable
- 05:07and that may be mechanistic. Because
- 05:10the problem is, like most complex diseases,
- 05:13MS is extremely heterogeneous at diagnosis.
- 05:16There is effectively no prognosis that
- 05:18one can give to a patient 'cause they may
- 05:22be severely disabled within five years,
- 05:25or they may be just fine 20
- 05:28years down the line.
- 05:30It's very hard
- 05:32to tell a patient anything,
- 05:35and that is a major issue.
- 05:38And so what we've been doing is we have
- 05:41been warehousing both clinical and
- 05:43genetic data across these collections.
- 05:46And what I'm showing you so far
- 05:49is the progress
- 05:52we've made with the genetic data,
- 05:54which is about 45,000 MS patients across
- 05:5810 centers and 26,000 controls to date.
- 06:01And this by itself has been
- 06:03a fairly major nightmare,
- 06:05not least of which has been the paperwork,
- 06:08because the GDPR, the privacy law
- 06:10that has come into effect in Europe,
- 06:13has really done a number
- 06:16on this.
- 06:19It took us a year and a half to unwind
- 06:23the legal implications of that,
- 06:26but these are real issues that
- 06:28will have to be faced when we think
- 06:32about federations of biobanks,
- 06:34or of case-control cohorts across places.
- 06:37We're also trying to unite clinical data.
- 06:41We have about 60,000 patients' worth of
- 06:44clinical data with different amounts
- 06:46of data for different patients.
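Editorial sketch, not from the talk: one way to picture uniting clinical records from centers that use different field names, and then measuring how much data each patient actually has. The site names, field names (including `edss`), and records below are all hypothetical; the project's real schema is not described in the talk.

```python
# Hypothetical per-site field names mapped onto one shared schema.
FIELD_MAPS = {
    "site_a": {"pid": "patient_id", "age_yrs": "age", "smoker": "smoking"},
    "site_b": {"id": "patient_id", "age": "age", "edss_score": "edss"},
}
COMMON_FIELDS = ["patient_id", "age", "smoking", "edss"]


def harmonize(site, record):
    """Rename a site's fields to the shared schema; absent fields become None."""
    mapping = FIELD_MAPS[site]
    out = {field: None for field in COMMON_FIELDS}
    for local_name, value in record.items():
        common_name = mapping.get(local_name)
        if common_name is not None:
            out[common_name] = value
    out["site"] = site
    return out


def completeness(records):
    """Fraction of records with a non-missing value, per shared field."""
    n = len(records)
    return {
        field: sum(r[field] is not None for r in records) / n
        for field in COMMON_FIELDS
    }


# Invented example records: each site captures a different subset of fields.
raw = [
    ("site_a", {"pid": "A1", "age_yrs": 34, "smoker": "never"}),
    ("site_a", {"pid": "A2", "age_yrs": 51, "smoker": "current"}),
    ("site_b", {"id": "B1", "age": 47, "edss_score": 3.5}),
    ("site_b", {"id": "B2", "edss_score": 6.0}),
]
merged = [harmonize(site, rec) for site, rec in raw]
coverage = completeness(merged)
print(coverage)
```

A per-field coverage table like this is a cheap first check on which variables are available often enough to be usable across the whole collection.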
- 06:49And we are still trying to resolve those,
- 06:51and ultimately what we want to be
- 06:54able to do is to build predictors
- 06:56of outcomes which we have captured
- 06:58in the clinical data, using both
- 07:01other data that we have
- 07:03on the patients and the genetic data.
- 07:07Just the genetic data,
- 07:09which is a fairly standard platform:
- 07:13genotyping.
- 07:14The vast majority of this is
- 07:16genotyping rather than sequencing.
- 07:17There are different platforms
- 07:19on which one can genotype,
- 07:21but they're fairly standard.
- 07:22It's a fairly homogeneous data type.
- 07:24It has taken us about a year to put
- 07:27these data together because there
- 07:29is a pretty significant amount of
- 07:31work involved in actually QC'ing
- 07:34and processing data,
- 07:35and so just that has been
- 07:37a nontrivial challenge.
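Editorial sketch, not from the talk: the kind of basic genotype quality control that typically precedes merging cohorts. It filters samples and variants by call rate and drops variants with a low minor allele frequency. The thresholds, the 0/1/2 coding, and the toy data are assumptions for illustration; the project's actual QC pipeline is not described in the talk.

```python
def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_freq(calls):
    """Minor allele frequency from 0/1/2 alternate-allele counts, ignoring missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq)


def qc(genotypes, min_sample_call=0.9, min_variant_call=0.9, min_maf=0.01):
    """genotypes maps sample id -> list of calls, one per variant.

    Returns (kept_samples, kept_variant_indices). Thresholds are illustrative.
    """
    kept_samples = [s for s, calls in genotypes.items()
                    if call_rate(calls) >= min_sample_call]
    n_variants = len(next(iter(genotypes.values())))
    kept_variants = []
    for j in range(n_variants):
        column = [genotypes[s][j] for s in kept_samples]
        if (call_rate(column) >= min_variant_call
                and minor_allele_freq(column) >= min_maf):
            kept_variants.append(j)
    return kept_samples, kept_variants


# Toy data: s3 is a poorly genotyped sample, variant 2 is monomorphic,
# and variant 3 has too many missing calls among the retained samples.
genotypes = {
    "s1": [0, 1, 0, 2],
    "s2": [1, 1, 0, None],
    "s3": [None, None, None, 1],
    "s4": [2, 0, 0, 1],
}
samples, variants = qc(genotypes, min_sample_call=0.75,
                       min_variant_call=0.75, min_maf=0.05)
print(samples, variants)
```

Filtering samples before variants matters here: a few badly genotyped samples would otherwise drag down every variant's call rate.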
- 07:39We have now overcome this.
- 07:40We now have this unified collection.
- 07:43Unlike most case-control cohorts where
- 07:45we do genome-wide association studies,
- 07:47we actually have deeper information
- 07:49rather than just whether someone
- 07:51is a case or a control,
- 07:52and we're now trying to put these
- 07:55data together, so the next couple
- 07:57of years I think are going to be
- 08:00very exciting here as we try and
- 08:02figure out if there are predictors
- 08:04for both outcomes
- 08:07and treatment
- 08:08responses in these patients.
- 08:10What we have so far in the clinical data,
- 08:13I will show you very briefly.
- 08:16These are all sorts of lifestyle and clinical
- 08:19data that seem to segregate patients.
- 08:21This is a principal components analysis
- 08:23of our entire phenotype matrix,
- 08:25and you can see that there are
- 08:30phenotypes that seem to correlate with age.
- 08:32In the top left you can see that the
- 08:36dominant trend in our patients is
- 08:38actually age and that kind of makes sense.
- 08:42It's a progressive disease.
- 08:43It's a lifelong disease.
- 08:45Older individuals tend to have more symptoms,
- 08:48and you can definitely see things like that,
- 08:51but that's an important confounder as well.
- 08:54Age is an important aspect of disease
- 08:57that we often don't talk about.
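Editorial sketch, not from the talk: how a principal components analysis of a phenotype matrix can surface age as the dominant trend. The data are synthetic, with three invented phenotypes that all drift with age, and the leading component is found by plain power iteration so the example needs only the standard library. It does not handle constant columns.

```python
import math
import random


def standardize(matrix):
    """Center each column and scale it to unit variance."""
    n = len(matrix)
    out_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        out_cols.append([(x - mean) / sd for x in col])
    return [list(row) for row in zip(*out_cols)]


def first_component(matrix, iterations=200):
    """Leading principal axis via power iteration on X^T X (X standardized)."""
    x = standardize(matrix)
    p = len(x[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        scores = [sum(xi * vi for xi, vi in zip(row, v)) for row in x]
        w = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(xi * vi for xi, vi in zip(row, v)) for row in x]
    return v, scores


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic patients: three made-up phenotypes that all increase with age, plus noise.
rng = random.Random(0)
ages = [rng.uniform(20, 80) for _ in range(200)]
phenotypes = [
    [0.10 * a + rng.gauss(0, 1), 0.05 * a + rng.gauss(0, 1), 0.08 * a + rng.gauss(0, 1)]
    for a in ages
]
axis, pc1 = first_component(phenotypes)
r = pearson(pc1, ages)
print(f"correlation of PC1 with age: {r:.2f}")
```

When one variable such as age drives many phenotypes at once, it tends to dominate the first component, which is why it has to be treated as a confounder before looking for subtler structure.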
- 09:00We see more interesting things
- 09:02if you look at that second panel
- 09:05from the left on the top.
- 09:08There's a correlation with
- 09:10natural UV exposure,
- 09:11because it turns out that vitamin D is actually
- 09:13an important component of MS pathology.
- 09:16It is a risk factor.
- 09:17It appears to be causal in ways
- 09:19that we don't really understand.
- 09:22But there are lifestyle exposures
- 09:23like that as well and they are
- 09:26definitely coming out of the data.
- 09:28We also see smoking behaviors.
- 09:30Gender is an important component and so on.
- 09:32So as we start pulling all of
- 09:34these clinical data together,
- 09:36we are getting patterns even
- 09:37from very simple
- 09:40views of data. This is a very
- 09:44naive exploratory way to look at data,
- 09:46but we're seeing patterns even in that way.
- 09:49Just for the remainder of the time,
- 09:52I'd like to switch for a second and tell
- 09:55you about a different project that is
- 09:58still quite similar in flavor, I think,
- 10:01which is about epilepsy.
- 10:03Identifying predictors of
- 10:04psychiatric disease in epilepsy.
- 10:05This is funded by the NINDS and it
- 10:08is a collaboration between Yale,
- 10:10Aarhus University,
- 10:11Helsinki University and the
- 10:13Broad Institute. Epilepsy is
- 10:18basically a disease
- 10:20of seizures in the brain.
- 10:22It is abnormal electrical activity
- 10:26that is often repeated.
- 10:27You see two broad types of seizures on EEG.
- 10:31You see either a generalized seizure
- 10:34pattern that takes up a large portion
- 10:37of a hemisphere or the entire brain,
- 10:40or you see very focal abnormal electrical
- 10:44activity in one area of the brain.
- 10:48Again,
- 10:49it is a common neurological disease:
- 10:52about one in 26 people in the
- 10:55US have a diagnosis of epilepsy.
- 10:58It is a complex disease.
- 11:00There exist certain single-gene forms
- 11:03of it that explain about 14% of cases,
- 11:07but the other 85 to 86% is this more common
- 11:11complex form: again polygenic, many genes,
- 11:14heritable,
- 11:14but not simply heritable.
- 11:19And what we've been doing has
- 11:21been working with some colleagues
- 11:23at Aarhus University in Denmark.
- 11:26Like many of the Nordic countries,
- 11:28Denmark has an integrated
- 11:31health care system for which records
- 11:35are completely available for research.
- 11:37The population of Denmark
- 11:40is about 5 million people.
- 11:42There are records for roughly
- 11:442 million people who have
- 11:46interactions with the hospital system.
- 11:48We tend to limit this to people
- 11:52who've had interactions recently,
- 11:55by which I mean after 1981,
- 11:59because people born after 1981
- 12:01also have blood spots stored
- 12:04in the Statens Serum Institut,
- 12:06from which we can extract DNA.
- 12:09So you can do population level
- 12:11genetics based on the hospital
- 12:13registers across the entire population.
- 12:16And so we limited this to that group,
- 12:19and one of the things that we
- 12:23observed about four years ago now
- 12:26is that if you look at individuals
- 12:30with a diagnosis of epilepsy,
- 12:34you find a strong overrepresentation
- 12:38of mental illness diagnoses
- 12:41in that population.
- 12:42So if you look right at the top,
- 12:45there are about 1.3 million people who
- 12:47do not have a diagnosis of epilepsy,
- 12:50and they are the reference.
- 12:51And there are about ten and a half
- 12:54thousand people who do have a
- 12:56diagnosis of epilepsy, and they have
- 12:59somewhere between 1.4- and 1.6-fold
- 13:01higher rates of psychiatric
- 13:03illness diagnoses.
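Editorial sketch, not from the talk: how a fold difference of this kind can be computed from register counts as a risk ratio with a 95% Wald confidence interval on the log scale. The group sizes (about 10,500 with epilepsy and 1.3 million without) come from the talk; the numbers of psychiatric diagnoses in each group are invented for illustration and are not results from the study.

```python
import math


def risk_ratio(exposed_cases, exposed_total, ref_cases, ref_total):
    """Risk ratio with a 95% Wald interval computed on the log scale."""
    risk_exposed = exposed_cases / exposed_total
    risk_ref = ref_cases / ref_total
    rr = risk_exposed / risk_ref
    se_log = math.sqrt(
        1 / exposed_cases - 1 / exposed_total + 1 / ref_cases - 1 / ref_total
    )
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper


# Group sizes as quoted in the talk; diagnosis counts are hypothetical.
rr, lower, upper = risk_ratio(
    exposed_cases=1_575,      # hypothetical: psychiatric diagnoses among people with epilepsy
    exposed_total=10_500,
    ref_cases=130_000,        # hypothetical: psychiatric diagnoses in the reference group
    ref_total=1_300_000,
)
print(f"risk ratio {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With a reference group this large, the width of the interval is driven almost entirely by the smaller epilepsy group.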
- 13:03These are all diagnoses
- 13:05from hospital registers.
- 13:06They're not necessarily
- 13:08strongly followed by an individual physician,
- 13:10so this is not a cohort; these are
- 13:13medical records, and that's worth
- 13:16highlighting. However,
- 13:19psychiatric illness is itself genetic.
- 13:21Again, it is complex.
- 13:23There have been many
- 13:25genetic studies of that, and so is epilepsy.
- 13:28And so what we are trying to
- 13:30figure out is if we can ask:
- 13:33does the epilepsy cause psychiatric illness,
- 13:36or are these either independent
- 13:39effects or both effects of a
- 13:43shared underlying pathology?
- 13:45That's an interesting question,
- 13:47because we think we can then develop
- 13:49predictors: given
- 13:54that you have a diagnosis of epilepsy,
- 13:56what is the probability that
- 13:59you actually develop
- 14:00psychiatric illness post-epilepsy?
- 14:02We can see in the data
- 14:07that not everyone is at equal risk,
- 14:10but we do not yet understand
- 14:12who is at higher risk,
- 14:14and so we're taking everything
- 14:16from school records, which exist
- 14:18in separate registers which can be
- 14:20cross-referenced to the hospital
- 14:23registers, to genetic profiles,
- 14:25to prescription reimbursements, to see
- 14:28whether people have refractory disease,
- 14:31or whether they've cycled through
- 14:34many antiepileptic medications.
- 14:37And we're trying to build these predictors,
- 14:39because it seems rather important
- 14:41to know who is at substantial
- 14:43additional risk given a diagnosis
- 14:46of epilepsy relative to the others.
- 14:48I will just finish with a small
- 14:51vignette of an almost accidental
- 14:53finding that we've seen, again looking
- 14:55in these registers because we have
- 14:58an entire population to look at.
- 15:02We observe that people with epilepsy
- 15:07are more likely to have a mother with
- 15:11epilepsy than a father with epilepsy.
- 15:15For reasons that we do not understand,
- 15:18there exists this maternal effect.
- 15:19If you just look in red,
- 15:21there's about a one-and-a-half-fold
- 15:24enrichment of
- 15:26maternal epilepsy over paternal epilepsy.
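Editorial sketch, not from the talk: one simple way to quantify a maternal excess is to compare how many people with epilepsy have an affected mother versus an affected father, and to test that split against 50/50 with an exact binomial test. The counts are invented, and the 50/50 null assumes epilepsy is equally common in mothers and fathers, which is a simplification.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of outcomes
    no more likely than the observed count k. Suitable for small n only."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# Hypothetical counts of people with epilepsy who have exactly one affected parent.
affected_mother = 150
affected_father = 100

ratio = affected_mother / affected_father
p_value = binomial_two_sided_p(affected_mother, affected_mother + affected_father)
print(f"maternal:paternal ratio {ratio:.2f}, two-sided p = {p_value:.4f}")
```

A population-scale analysis would more likely adjust for parental sex differences in prevalence and diagnosis, which this toy test ignores.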
- 15:29We're not sure if this is genetic
- 15:31or if something else is going on.
- 15:33This is a very unexpected finding.
- 15:36It's been reported at least once before
- 15:40in much smaller cohorts,
- 15:41and there's been a long-standing dispute
- 15:45in the field about whether this is true,
- 15:48and across about 1.75 million
- 15:50people in Denmark we can
- 15:52unequivocally see there is a
- 15:55fairly meaningful increase in this risk.
- 15:57Quite what underlies this maternal effect
- 15:59we don't know, but it ties into our
- 16:02interest in sex bias in disease,
- 16:04and we're looking forward
- 16:06to following this up.
- 16:07So I want to just leave that there,
- 16:10and I'll hand back to Nicole,
- 16:12who I think will introduce the
- 16:14next speaker or handle questions.