Simon Eickoff “Technical, conceptual and practical considerations on neuroimaging-based precision medicine”

Name: Simon Eickoff “Technical, conceptual and practical considerations on neuroimaging-based precision medicine”
Uploaded: 2023-03-07T20:27:04.8033333Z
Duration: 23 min 1 s

March 07, 2023

Information

ID: 9603
To Cite: DCA Citation Guide


00:06Next try. Where were we?
00:10Oh, we were at the topic of connectome-based
00:12predictive modeling,
00:14and just to get an idea how many
00:17of you have used it or are using it,
00:19just raise hands.
00:20OK, so it's a very popular topic.
00:23That's great.
00:24So I don't really need to explain why,
00:28since it was really pioneered here by there.
00:31But yeah guys,
00:32it's been a matter of steam and speed,
00:36one of the most exciting fields that we have.
00:39OK. If you look at the science,
00:40I mean, it's amazing what you can do,
00:41right, starting from this paradigmatic
00:44brain age, where I think most of the
00:48development is happening and we
00:51see fantastic results and we have
00:53something that robustly works with
00:55a lot of different imaging modalities.
00:58And in fact what I always find most
01:01intriguing is, and I think from the
01:04medical perspective that's interesting,
01:06I mean, with age you can kind of, if
01:08you are an experienced radiologist
01:10or just somebody who looks at
01:11MR images quite a bit,
01:13you can more or less tell by eye who is
01:15the old and who is the young person.
01:18But what about these two scans? Can you, by
01:21eye, distinguish what's
01:24different between these two people?
01:26Well, one is a male, one is a female brain,
01:28and you all, even the most experienced of you,
01:32just have a guessing chance.
01:34Well, that's not the case
01:36for image-based prediction, where you can
01:40quite easily and quite nicely
01:42assign sex or gender to the
01:45images with very good probabilities.
01:48And obviously we are all pushing
01:51towards clinical application: behaviors,
01:53the subtyping of psychiatric patients,
01:57individual prediction of symptom loads,
02:00and the big holy grail at the end, which is
02:04obviously predicting even into the
02:07future, for example therapeutic success. So
02:10It's no wonder that this
02:12is used by so many people.
02:14It just has an amazing capability
02:17and it's such an exciting field.
02:20Let me start with thrush,
02:23but a little bit of that already
02:25onto the picture.
02:26And we did this with a survey that was
02:30published last year on phenotype prediction.
02:33And if you look over the years,
02:35yes,
02:35it's a quickly exploding field and
02:38the sample sizes do grow quite a bit,
02:42but the accuracies actually start
02:45to go down a bit over the years.
02:48In fact, they go down quite a bit
02:51over the years,
02:52and that's particularly true if people
02:54use an external validation set.
02:57Oh well, we can hopefully live with that.
03:01But what do you think about this?
03:03We just had a look at how various
03:06factors of the study design influence
03:09prediction accuracy.
03:10And it seems, in fact, yeah,
03:13that it does make more of a difference
03:15than you would really like, right?
03:18So there's a little bit of clouds
03:21coming up over the mountain.
03:24And now I come to the main part
03:26of my talk, which is then the big.
03:30No install.
03:31No, well, not that bad,
03:33but I just want to throw out
03:36a few thoughts and I'm very
03:38happy to discuss them with you.
03:40Then after this presentation and
03:42during the rest of the conference
03:44and we start at the very bottom,
03:47what, if anything, is my data?
03:50So here we are looking at
03:52voxel-based morphometry,
03:54something that has been around for
03:56about 20 years, and there are I think
03:59about 1000 to 2000 VBM studies that are
04:02already published, and you would certainly
04:04guess it's a very well established,
04:07solid standard technique, right?
04:09Well, we had a little bit of a
04:12closer look and just set up a couple
04:15of pipelines that differ in whether
04:18you use skull stripping using ANTs
04:20or BET or the CAT version,
04:24what kind of segmentation you use,
04:26which template and registration
04:28mode and that gives us in the
04:30end this sort of roughly
04:32a dozen different pipeline versions,
04:34and what I'm showing you here now
04:38is for each region in the brain,
04:40so these are the Schaefer parcels,
04:44the average correlation between
04:47individual gray matter volumes
04:50per subject across pipelines.
04:53So you see that for regions,
04:55for example here the bright yellow,
04:58across different VBM pipelines,
05:01individual gray matter volumes
05:04correlate in the range of about 0.3.
05:09Even in the best regions,
05:12which are here sort of around the
05:14cingulate and the medial temporal lobe,
05:16we reach something about 0.7
05:20correlation between pipelines.
05:22Same subjects, and it's all just simple VBM.
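A minimal sketch of the kind of comparison described here, assuming the regional gray matter volumes are stored as one array with shape (pipelines, subjects, regions). The function name, array layout and toy data are illustrative assumptions, not taken from the talk or the underlying study.

```python
import numpy as np

def cross_pipeline_correlation(volumes):
    """volumes: array of shape (n_pipelines, n_subjects, n_regions).

    Returns, for each region, the mean Pearson correlation of the
    subject-wise volumes over all distinct pairs of pipelines.
    """
    n_pipelines, _, n_regions = volumes.shape
    result = np.zeros(n_regions)
    for r in range(n_regions):
        # Rows are pipelines, columns are subjects.
        corr = np.corrcoef(volumes[:, :, r])
        # The upper triangle holds every distinct pipeline pair once.
        upper = corr[np.triu_indices(n_pipelines, k=1)]
        result[r] = upper.mean()
    return result

# Hypothetical toy data: 3 pipelines, 200 subjects, 2 regions.
rng = np.random.default_rng(0)
true_volume = rng.normal(size=(200, 2))
noise_scale = np.array([0.2, 1.5])  # region 0 agrees well, region 1 poorly
volumes = np.stack(
    [true_volume + rng.normal(size=(200, 2)) * noise_scale for _ in range(3)]
)
agreement = cross_pipeline_correlation(volumes)
print(agreement)
```

In this toy setting the second region plays the role of a parcel where pipelines disagree: the same subjects are measured, yet the pipeline-specific noise pulls the correlation down.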
05:28Now, that's a small change,
05:31and in fact it may actually not
05:33even be that much of a small change.
05:36But how much of an effect does it have?
05:39Well,
05:41Small changes,
05:42even small changes can have
05:44rather big effects and that's
05:46illustrated here in a paper from
05:48Shaman that's just being published.
05:50We actually tried different
05:52processing pipeline,
05:53different prediction pipelines for brain
05:55age prediction and brain age really
05:58is about the easiest test case, right.
06:00We have a lot of subjects.
06:02It's a fairly non ambiguous target measure.
06:06So what we've seen in the last slide
06:09is that our data itself is rather
06:12shaky. Now, to make things easy,
06:15we use the same data,
06:17so we're not having the problem
06:19from the last slide.
06:20And there are just different
06:22prediction pipelines,
06:23all standard, good and validated.
06:26And that's what you see in terms
06:28of brain age prediction accuracy
06:31across different pipelines.
06:33And there's really just minor differences.
06:36It's not like we're using completely
06:39different architectures.
06:39It's all of the standard stuff,
06:41but you, from the same data,
06:43can have a mean absolute error in
06:46cross-validation from less than five
06:48years, which is good, to about six
06:51years, which is not so good, and easily
06:53above 7 years as well with other pipelines.
06:57And this is already in a setting
06:59where it's always the same data,
07:01the same subjects, and in particular it's
07:04a very easy case because we're using a
07:08very non-ambiguous target measure,
07:10but unfortunately most of the
07:12cases that we are really
07:14interested in are not so easy.
07:18In particular, most of the psychological
07:21targets we're looking at actually
07:24do not have such fantastic
07:28reliability and, in some cases,
07:30objectivity and validity.
07:32And thanks to Martin sitting there,
07:35who will explain in a lot more detail in
07:37his poster tonight at the reception,
07:40we just made the whole thing a bit
07:44worse, because we're now looking at the
07:47reliability of your target measure
07:49and the influence on accuracy.
07:51So remember your data is already
07:53problematic because it depends
07:55on the processing pipeline.
07:57Small changes in the prediction pipeline,
07:59even in a perfect setting,
08:00can introduce quite some
08:02differences in accuracy.
08:04And this is what happens if we go
08:06into the dirty reality with not
08:09particularly reliable target measures.
08:12And we can see that from a
08:14base of perfect reliability.
08:17So you see the accuracies for different
08:19training set sizes with perfect
08:20reliability, and they're not too bad,
08:23right?
08:23They're kind of in the range
08:25of what you see in many papers.
08:27And in fact, if the
08:29reliability drops to about 0.5,
08:32then basically your accuracy
08:34goes away completely.
08:36And conversely,
08:38the less reliable your target measure is,
08:44the larger the training size you
08:47need for some useful accuracy.
08:50And if you put this together,
08:53then really anything that
08:56is in the typical range of
09:02self-reported data is already
09:04rather problematic.
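One part of this effect can be sketched with the classical attenuation formula: if a prediction correlates with the error-free target at `true_r`, its expected correlation with a noisy measurement of that target is `true_r * sqrt(reliability)`. The simulation below is an illustrative assumption, not the analysis shown in the talk; it covers only the ceiling imposed at evaluation time, and not the additional cost of training on unreliable targets with limited sample sizes.

```python
import numpy as np

def attenuated_correlation(true_r, target_reliability):
    """Expected correlation with a noisy target (classical attenuation)."""
    return true_r * np.sqrt(target_reliability)

def simulate(true_r, reliability, n=200_000, seed=0):
    """Empirical correlation between a prediction and a noisy target."""
    rng = np.random.default_rng(seed)
    target = rng.normal(size=n)  # error-free target, unit variance
    # Prediction that correlates with the error-free target at true_r.
    prediction = true_r * target + np.sqrt(1 - true_r**2) * rng.normal(size=n)
    # Measurement noise chosen so that var(true) / var(observed) = reliability.
    noise_sd = np.sqrt((1 - reliability) / reliability)
    observed = target + noise_sd * rng.normal(size=n)
    return np.corrcoef(prediction, observed)[0, 1]

for rel in (1.0, 0.8, 0.5):
    print(rel, round(simulate(0.5, rel), 3), round(attenuated_correlation(0.5, rel), 3))
```

Even before any training effects, a target with reliability 0.5 caps the observable accuracy at roughly 71% of what the same model would reach against an error-free target.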
09:07OK, let's move on and say we need
09:10big data and in fact HCP was a
09:14good start but it's too small.
09:16But now we have UK Biobank,
09:19we now can see how things scale up
09:21and they should scale up well, right?
09:23And we hope, really hope it
09:25gets better with more data.
09:27That's what I meant.
09:29Looked at and we looked at the
09:33estimation of individual cognitive
09:34processing speed from resting state MRI.
09:37Something quite solid; in fact, these cognitive
09:40measures have fairly good reliability.
09:44And if you start to work
09:47with psychopathology scales,
09:49if you start to work with personality scores,
09:52you wish you would have stuck
09:54to cognitive measures.
09:55OK, now let's see what, with
09:57all of these standardized and
10:00good resting-state-based predictions
10:03of processing speed from fMRI,
10:05what do we get?
10:07Well, we get a lot of jumping and
10:09jumping up to about 1000 subjects,
10:12which is kind of expected.
10:14And then and I think that's something
10:18quite relieving and I would say
10:21very reassuring that from about
10:231000 subjects we do see a monotonic
10:27increase in prediction accuracy and in
10:30fact also the order of the different
10:34models, of the different ways of
10:37how we quantify resting state,
10:39they actually stay quite constant.
10:41So is that something that most
10:43of you would be happy with?
10:45Yeah, right. How about this?
10:48That's the prediction accuracy
10:49purely from the confounds age,
10:51sex and intracranial volume.
10:57Right, that's exactly the point.
10:59It's a big ouch, and in fact if
11:01you want to look at the preprint,
11:03we did this for a lot of different behaviors
11:07and, you guessed it, it is a very consistent
11:10pattern no matter what we look at.
11:12In fact, those of you who have
11:14been around for a bit,
11:16do you remember the ADHD-200
11:19competition at OHBM, where the top prediction
11:28accuracy was about 65% and
11:32the second place, the runner-up with 64%,
11:35was only using the confounds?
11:36So we're doing actually quite well with
11:40the confounds, which is interesting.
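The sanity check implied here is to run the same cross-validated model once on the brain features and once on the confounds alone. The sketch below uses ordinary least squares and synthetic data in which the brain features carry nothing beyond a noisy copy of the confounds; the helper name, the three stand-in confounds and all numbers are assumptions made for illustration.

```python
import numpy as np

def cv_correlation(X, y, n_folds=5, seed=0):
    """Correlation between y and out-of-sample least-squares predictions."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(idx, test)
        # Add an intercept column and fit on the training fold only.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        pred[test] = A_test @ beta
    return np.corrcoef(pred, y)[0, 1]

# Hypothetical toy data: the target depends on the confounds only.
rng = np.random.default_rng(1)
n = 1000
confounds = rng.normal(size=(n, 3))  # stand-ins for age, sex, brain volume
brain = confounds @ rng.normal(size=(3, 20)) + 3.0 * rng.normal(size=(n, 20))
y = confounds @ np.array([0.5, 0.3, 0.4]) + rng.normal(size=n)

r_brain = cv_correlation(brain, y)
r_confounds = cv_correlation(confounds, y)
print(f"brain features: r = {r_brain:.2f}, confounds only: r = {r_confounds:.2f}")
```

In this constructed case the brain-based model looks respectable on its own, and only the confound-only baseline reveals that it adds nothing.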
11:43Now let's revisit some of the data
11:45I thought was very cool earlier on,
11:47which is about the sex prediction.
11:52Because remember there was
11:53this sort of proud saying,
11:55we could do more than any radiologist.
11:57And in fact, yes,
11:58we do get a good classification rate.
12:00But the problem is, if you look
12:02closer at the whole thing,
12:03you see that we actually just really
12:06classify on total brain volume, to be honest.
12:09So again, this is something that is
12:13strongly driven by confounds.
12:15Now many people could say,
12:16well, in that case you could
12:19just do confound removal.
12:21Yes, we did a confound removal, and what
12:24happens is that in this case we're
12:27not predicting on total brain volume
12:29anymore, obviously, because we removed it,
12:32but we're also not predicting
12:33very much else anymore.
12:34So prediction massively drops, and
12:36then, sort of a glimmer of hope, if
12:39you start to work with things like
12:41matching and confound regression,
12:43you can get somewhat better,
12:46but you're always staying worse than
12:49what you can get with the confounds.
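Confound regression, as mentioned here, is commonly implemented by fitting a linear model from the confounds to each feature and keeping the residuals. A minimal sketch follows, assuming a linear confound model that is estimated on the training subjects only and then applied unchanged to the test subjects; function names and data are hypothetical, and this is not necessarily the exact procedure used in the work discussed.

```python
import numpy as np

def fit_confound_model(features_train, confounds_train):
    """Least-squares weights mapping confounds (plus intercept) to features."""
    A = np.column_stack([np.ones(len(confounds_train)), confounds_train])
    beta, *_ = np.linalg.lstsq(A, features_train, rcond=None)
    return beta

def remove_confounds(features, confounds, beta):
    """Subtract the confound-predicted part from every feature."""
    A = np.column_stack([np.ones(len(confounds)), confounds])
    return features - A @ beta

# Hypothetical toy data: 5 features that partly reflect 2 confounds.
rng = np.random.default_rng(0)
n_train, n_test = 400, 100
confounds = rng.normal(size=(n_train + n_test, 2))
features = confounds @ rng.normal(size=(2, 5)) + rng.normal(size=(n_train + n_test, 5))
train, test = slice(0, n_train), slice(n_train, None)

# Estimate on training data only, then apply the same weights to the test data.
beta = fit_confound_model(features[train], confounds[train])
clean_train = remove_confounds(features[train], confounds[train], beta)
clean_test = remove_confounds(features[test], confounds[test], beta)

residual_r = np.corrcoef(clean_train[:, 0], confounds[train][:, 0])[0, 1]
print(f"train correlation with confound after removal: {residual_r:.1e}")
```

Fitting the confound model inside the training fold avoids leaking test information, and it is also why whatever predictive signal was shared with the confounds disappears along with them.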
12:52Is that specific to binary classifications?
12:55Well by now you should probably know,
12:57but I think this is a very
12:59illustrative case here.
13:00It's not brain based,
13:01it's just based on behavioral data.
13:04From Vera Komeyer, looking at hand grip
13:07strength. Sorry, it's from T1 MRI.
13:10So if you predict handgrip strength
13:13from T1 MRI, we're getting quite a nice
13:16prediction. If you
13:20go to sex-specific models, because you know,
13:22there's sort of this fundamental
13:24difference between male and female,
13:26then yeah, that's still not bad.
13:28And a regression of 0.27 you can probably
13:31live with. Now, if you remove all
13:35the other confounds such as size,
13:38body weight and so on,
13:41let's say the prediction becomes
13:43a little bit less important.
13:45In fact,
13:46there is nothing left anymore that you
13:49can predict when accounting for all
13:53these kind of anthropometric confounds.
13:59But is all of this really
14:03surprising? And we talked about
14:05this last night with Thomas.
14:07There is some positive in that: all
14:11of our behavior, all of our biology,
14:15all of us really live in a
14:18rather low-dimensional space.
14:20And those of you who dealt with
14:22the UK Biobank in some detail know it:
14:25we can get thousands,
14:27thousands and thousands of measures on
14:30every one of you, right?
14:32All of these phenotypical, medical,
14:35anthropometric and so on details.
14:39But we as people don't vary in
14:43100,000 different dimensions,
14:45but rather interindividual variability
14:48is rather low dimensional.
14:51And this has a positive in some
14:55way as familiar and in state,
14:57because then you can use things like
15:01transfer learning to actually
15:03train the model on one behavior and
15:05then also co-predict other behaviors.
15:08That's good.
15:09But in another way,
15:10that means that we do need to
15:13rethink what we consider a confound.
15:16I think we all,
15:18most of us apart from the youngest ones here,
15:21we've all grown up with the sort
15:24of more simplistic view:
15:26that's the thing I want to predict,
15:28and these are the two confounds,
15:31and I remove them or adjust for
15:33them, and life is good.
15:35Well,
15:36once you hit the stage
15:38where you have hundreds or
15:40thousands of pieces of information on each subject,
15:43then this old, simple and convenient
15:46truth does not hold anymore.
15:49Now basically everything is a confound
15:52and most likely no feature has any
15:56information that goes beyond all of
15:58the confounds that are available.
16:02Why exactly?
16:03Because of what I said earlier:
16:06the dimensions of variation are rather few.
16:09So there is nothing, or likely
16:12nothing, that can be specifically
16:14predicted by one particular feature
16:16that is not also captured by
16:19some combination of confounds.
16:20So human variability is likely low
16:23dimensional, and hence I think we need
16:27to reconsider our ideas of confounds.
16:29And rather we now feel that confounds
16:32should be seen as a gradient,
16:35starting from something that is very
16:38avoidable, like bad sampling:
16:41all the patients are scanned on
16:43one scanner and the controls are
16:45scanned on the other scanner.
16:47That's something that's sort
16:49of very simple and avoidable.
16:51Then you have these things that
16:53are somewhat implausible but
16:55may have a biological link.
16:57So one of my favorites from the UK
16:59Biobank is that there are GWAS hits for how
17:02often you have bacon for breakfast.
17:05Yeah,
17:05it seems implausible, you kind of laugh,
17:07but then you think about,
17:08well,
17:08maybe it has something to do
17:10with some sensitivity of saline
17:12receptors, and so there
17:13could be some biological link.
17:15It's just quite implausible.
17:17Then there are things that are likely
17:21reflections of the same latent dimensions.
17:24A lot of cognitive scores,
17:25for example,
17:26are intercorrelated because there's like
17:29this one big factor of cognitive
17:32speed that really underlies virtually
17:34anything that has a reaction time.
17:37Then you probably have variables
17:39that are driven by a
17:42common factor and last but not least,
17:45things that are effectively
17:47measurement of the same biology.
17:49Now what is important is that this
17:53confound gradient is independent of the
17:56statistical strength of the confound.
17:59So this is what Vera
18:01Komeyer is at the moment writing up,
18:03and we hope we have the preprint
18:05out very soon: it is really to
18:07consider confounds as a 2D continuum,
18:11as the conceptual continuum from,
18:14as I said, bad sampling to you're
18:16in fact measuring
18:18more or less the same thing, and
18:20the statistical continuum of how
18:23strongly something is associated.
18:25And just to complicate things a bit
18:27more, likely these things
18:28are even nested, like gender,
18:30for example,
18:31as a sort of super-confound that
18:34influences a lot of other ones downstream.
18:37Now, why is that important?
18:40Why is it not just?
18:41Well,
18:42as long as we get good prediction accuracies,
18:44we should be happy.
18:46Well, it quickly becomes a problem
18:50if your confound structure
18:52differs between training and test,
18:55or differs between different
18:57subpopulations of the test set.
18:59So there's work,
19:03for example,
19:05that shows that prediction
19:07accuracy systematically varies between
19:09African and white American participants
19:11from some of the larger US databases,
19:14and in fact it does so with respect
19:17to offset and slope and so on.
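A simple way to look for this kind of subgroup difference is to fit, within each group, a line relating predicted to observed scores and compare slope, offset and error. The sketch below is a hedged illustration with made-up data and a hypothetical helper; it does not reproduce the cited analysis.

```python
import numpy as np

def calibration_by_group(y_true, y_pred, groups):
    """Per-group slope, offset and mean absolute error of the predictions."""
    report = {}
    for g in np.unique(groups):
        mask = groups == g
        # Degree-1 fit of predicted on observed: returns (slope, offset).
        slope, offset = np.polyfit(y_true[mask], y_pred[mask], deg=1)
        mae = np.mean(np.abs(y_pred[mask] - y_true[mask]))
        report[g.item()] = {
            "slope": float(slope),
            "offset": float(offset),
            "mae": float(mae),
        }
    return report

# Hypothetical toy data: the model is well calibrated for group "a" only.
rng = np.random.default_rng(0)
y_true = rng.normal(size=1000)
groups = np.repeat(["a", "b"], 500)
y_pred = np.where(groups == "a", y_true, 0.5 * y_true + 2.0)
y_pred = y_pred + 0.5 * rng.normal(size=1000)

report = calibration_by_group(y_true, y_pred, groups)
for name, stats in report.items():
    print(name, stats)
```

A pooled accuracy figure would hide the fact that, in this toy case, group "b" receives systematically shifted and compressed predictions.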
19:20So basically, in some way you
19:24could argue that this confound
19:26continuum, this 2D confound continuum,
19:28is a problem if you want to do
19:31some mechanistic causal inference,
19:33but you wouldn't need to worry so much
19:35if it's just prediction accuracy. And I
19:38hope this provides the counterargument.
19:41Because we cannot assume that the
19:44confound structure and causal structure
19:46is the same between training and test,
19:50in particular within subpopulations
19:52of later applications,
19:54this becomes a super big
19:57problem when it comes to bias,
19:59fairness and discrimination.
20:01And this then will quickly lead,
20:05when it comes to application, to the
20:08question of why, and what we called the
20:12"show me the evidence" problem,
20:15and here we have something.
20:16When we then particularly think about
20:19the medical application and we'll
20:21just briefly scratch that topic,
20:23we can talk about this more.
20:26For example,
20:27at the reception we have a
20:29particular problem when it comes
20:31to medical applications.
20:32The thing is that here we need
20:36to have an explanation towards
20:41that lady. It's not just the developer,
20:46it's not just the regulation bodies,
20:49it's not just the physicians.
20:51But in the end, for medical application
20:54you need to be able to, at least to some
20:59degree, also convey your evidence to this,
21:02so to speak, end customer.
21:04And they will be the same skeptic,
21:07right?
21:09Because.
21:10They don't really know what
21:12to do with this information.
21:13Now, what's the difference between
21:16this very precise and very verifiable
21:19information you see here and
21:22the physician saying give you higher
21:25powers and what you're complaining
21:27about, I would recommend X?
21:30The key difference here is what
21:33is termed the connection into
21:35the web of beliefs.
21:37So this does not resonate with
21:40any experience, any feeling,
21:43any introspection that the patient has.
21:47It is an abstract information
21:49that has no connection with the
21:51existing web of beliefs, and hence
21:54it will not be seen as evidence.
21:57Whereas this, even if it's unreliable, if
22:00it's maybe even more or less made up,
22:03this resonates, and this is something
22:06that will make the patient feel confident
22:08that your treatment is sound.
22:11And then we have this huge gap there.
22:14And I think this is just something,
22:16together with some of the other things
22:19that we built with friends who are more
22:22from philosophy, that I think is
22:25important to curb our enthusiasm a bit:
22:28just having a good prediction, even if
22:32we could overcome all of the other problems,
22:34will likely not result
22:37in something that is actually usable
22:41in practice, because you do meet
22:44resistance that is not technical.
22:46And with that,
22:47I would like to close and hope I set
22:51the stage for the upcoming
22:53talks on connectome predictive modeling.
22:56I thank everybody who's been involved,
22:58in particular the groups by Saginaw and.