Saige Rutherford “Value-based Machine Learning: Optimizing for Utility Over Intelligence”
March 07, 2023
Transcript
- 00:06Sorry, the next speaker is Saige
- 00:07Rutherford from Donders.
- 00:08She's going to talk about value based
- 00:11machine learning and she's got a little
- 00:13bit of negativity in her talk too.
- 00:19OK, can people hear me?
- 00:23Please.
- 00:27OK. Good morning.
- 00:28I'm excited to be here talking to you
- 00:31guys about a similar but slightly,
- 00:34you know, different topic.
- 00:35I'll take a little bit more of a
- 00:38philosophical, big-picture view.
- 00:42So today I'm going to
- 00:43introduce a couple things.
- 00:44I hope to go over some of the
- 00:47goals and values of our field.
- 00:49I want to talk about kind of the
- 00:51current state of the field and use my
- 00:52own journey in the field as an example.
- 00:54I want to go over a few definitions.
- 00:57I'm using words that have a
- 00:58lot of different definitions,
- 00:59things like optimization,
- 01:01accuracy, and utility.
- 01:02So I want to make sure
- 01:03we're on the same page,
- 01:04then want to dive into what I mean
- 01:06by intelligence and accuracy.
- 01:07I want to talk about how we're measuring it,
- 01:09how we optimize for it,
- 01:11and what some of the limitations
- 01:12of this kind of
- 01:13framework that we've been focusing
- 01:14on are, then dive into utility,
- 01:16talk about again how we measure it,
- 01:18how we might optimize for it, and what
- 01:20I see as the benefits of thinking
- 01:22from this perspective, and then talk
- 01:24a little bit about what's next:
- 01:26what I see as some of the roadblocks,
- 01:28future directions, and take-home messages.
- 01:31So I'd like to begin with a warning.
- 01:33I already kind of mentioned this
- 01:34talk is going to be a little
- 01:36bit different than others.
- 01:37If you were reading it as a paper,
- 01:39it would probably be like an
- 01:40opinion piece or something.
- 01:41It's very philosophical,
- 01:42even though I come from, like, statistics
- 01:44and computer science. Over the last year,
- 01:46this has really been a thought
- 01:48experiment for me, and I've really
- 01:50been thinking about things from
- 01:51a very philosophical perspective.
- 01:53The second warning is that it
- 01:55might come off as a little bit
- 01:57provocative and critical of the field.
- 01:59I hope to ground that in examples of my
- 02:01own work, and I think as scientists
- 02:03we need to be critical of our own
- 02:05work so we will proceed with caution.
- 02:07The second note is I know that
- 02:08this is the Whistler Workshop on
- 02:10brain functional organization,
- 02:12connectivity and behavior.
- 02:13When I was thinking about all
- 02:15these ideas I was really thinking
- 02:16more from a machine learning,
- 02:18not necessarily the brain perspective.
- 02:20So when I use words like artificial
- 02:23intelligence, machine learning, accuracy,
- 02:25you can actually just replace them with
- 02:27brain behavior predictive modeling.
- 02:29So you can think about using all of these
- 02:31terms interchangeably.
- 02:32It's just a bit of a mouthful to
- 02:34keep saying brain behavior predictive modeling.
- 02:37OK.
- 02:37So to first talk a little bit about
- 02:39where we're at and the goals of our field.
- 02:42I think the goals are quite well defined.
- 02:45We have this you know maybe high dimensional,
- 02:47maybe low dimensional space,
- 02:48but we have axes representing
- 02:50things like biological measurements.
- 02:52This would include our neuroimaging data.
- 02:54We also have dimensions
- 02:56representing our behavioral tests,
- 02:58our self report measures,
- 03:00maybe environmental factors,
- 03:02lifestyle choices, and our goal is
- 03:04really to learn functions that map
- 03:07between these spaces, such that
- 03:09we hope to use one thing to predict another.
- 03:11Maybe we're using the brain to predict
- 03:13behavior, or vice versa.
- 03:14So I think our goals have been for
- 03:17a few years fairly well defined.
- 03:19However, if you move to our values,
- 03:22I think maybe you know,
- 03:24we might think that we know them.
- 03:25But I think especially when it comes
- 03:27to prioritizing them and ranking
- 03:28what values are most important to us.
- 03:30This is where it's a bit more complicated.
- 03:32So these include things like validity,
- 03:34reliability, explainability,
- 03:38fairness, accountability, usability,
- 03:41impact, and you might say, well,
- 03:44all of these things are important to me.
- 03:45But as you'll see throughout this talk,
- 03:47when you want to start optimizing
- 03:49for one of these,
- 03:50we really do need to rank them
- 03:51and have a priority and I think
- 03:53that ranking is a little bit less
- 03:54clear and it's really something
- 03:56we need to do to move us forward.
- 03:58So I hope to inspire you with
- 04:00that in this talk.
- 04:01I just want to cover a little
- 04:03bit of my journey in this field
- 04:04and how I got to this topic.
- 04:06I worked as a data scientist at the
- 04:08University of Michigan for five years.
- 04:09I was really working on a lot of the
- 04:11models Simon talked about like brain age,
- 04:13predicting cognition.
- 04:13I was here at Whistler in 2018 giving
- 04:16a talk called the developmental
- 04:17mega sample where my message was
- 04:20really we just need more big data.
- 04:22If we combine all of these samples
- 04:23and work with this low dimensional
- 04:25brain basis set, that will answer
- 04:26all of our questions, and I set out
- 04:28working on that.
- 04:30I then learned that our models
- 04:31look a lot like this,
- 04:32what I call the spaghetti plot of the brain,
- 04:34and we're not really learning anything.
- 04:36We're not even predicting things that well.
- 04:37So there's a problem there.
- 04:39I came back to Whistler in 2020,
- 04:41was on the deep learning
- 04:42bandwagon, thought, OK,
- 04:43maybe deep learning is the answer, you know?
- 04:46But it turned out to really just
- 04:47be a more complicated method
- 04:48to answer the same question.
- 04:50I then moved to the Netherlands
- 04:51before I started my PhD,
- 04:53and I've really been working on individual
- 04:55differences and training large models,
- 04:57transferring them to clinical data.
- 04:59So those models started to sort
- 05:01of be better than spaghetti plots.
- 05:04There was a bit more careful
- 05:05modeling of uncertainty and things.
- 05:07But here I am in 2023 talking
- 05:10about value based machine learning
- 05:11and that's because I still
- 05:12felt like there was something missing
- 05:14in the models that I was using.
- 05:16Maybe I'm learning a bit better of a model,
- 05:17but there's still something like
- 05:19missing that I care about in the
- 05:21work that is not being captured.
- 05:22And that's really what, over the last
- 05:24year, I've been thinking about:
- 05:26this value-based framework.
- 05:27And this is sort of represented
- 05:29in my journey.
- 05:29But just to make it a bit more general,
- 05:31I think if you embark on a brain
- 05:34behavior predictive modeling analysis,
- 05:35this is kind of the journey
- 05:37that you would take.
- 05:38You know, as a PhD student or data analyst,
- 05:41you begin by combining a bunch of
- 05:43different open datasets from all these
- 05:45amazing resources that we've shared.
- 05:47However, once you put them together,
- 05:49you realize there's not a
- 05:51lot of phenotypic overlap.
- 05:52Maybe if you're lucky you have age,
- 05:53sex, and cognition.
- 05:54You still go on to fit a bunch of different
- 05:57models, ranging from simple things
- 05:59like linear regression to more
- 06:01complex things like deep learning.
- 06:03You then realize that there's
- 06:04not a lot of signal in the data.
- 06:06You can barely predict age, you know,
- 06:08maybe within three to five years of error.
- 06:10You still are a scientist and have to
- 06:12go on to publish your results anyway,
- 06:14taking kind of two paths.
- 06:15One might be being slightly optimistic,
- 06:18you know, slightly overselling
- 06:20the interpretation or potential,
- 06:21or being a bit more honest,
- 06:23like Simon, sharing your honest viewpoint:
- 06:25maybe using MRI doesn't help that much.
- 06:28However,
- 06:28you have trouble finding a journal
- 06:30that will publish this perspective.
- 06:32You then repeat, or you leave,
- 06:33um, for the data science industry or a field
- 06:36where machine learning can have more impact.
- 06:39You know, psychiatry is our field,
- 06:40so I'm not giving up on it yet,
- 06:42but the cycle repeats again.
- 06:45If you're reading a brain behavior
- 06:46predictive modeling paper,
- 06:47you could probably, for any paper,
- 06:49fill out this template,
- 06:50and it would go something
- 06:53like this: fluid intelligence.
- 06:55Clinical potential one day.
- 06:57Reference to the Marek et al. Nature paper,
- 07:00even though they don't do predictions,
- 07:01just univariate associations.
- 07:06We need a bigger sample size.
- 07:09Using the HCP, ABCD, or UK Biobank data.
- 07:14Correlation of .28 between predicted and
- 07:17observed. For reliability, brain age.
- 07:20No confound correction, could be motion.
- 07:23These are, you know, Simon,
- 07:24you really illustrated this.
- 07:25And so over the last few slides I've
- 07:27just wanted to point out that it's
- 07:29not like we're doing things wrong.
- 07:30It's just we've sort of been stuck as a
- 07:32field, and we have this great question
- 07:34of wanting to relate brain to behavior,
- 07:36but we're sort of at a standstill.
- 07:38We're making these very tiny
- 07:39little improvements.
- 07:40And I think it's because we've
- 07:42really been focusing on the wrong
- 07:43thing and that's trying to get
- 07:44more and more accurate models.
- 07:46And then yes,
- 07:47of course that's important and useful,
- 07:48but it's not the only thing we
- 07:49care about, and we need to consider
- 07:51other values that we have in this work.
- 07:53So that really led me into thinking about
- 07:55what I call value based machine learning.
- 07:57I think most of us probably
- 07:58know what machine learning is.
- 07:59So I'll just explain the value
- 08:00based part a little bit more.
- 08:02I discovered this when looking at
- 08:04different models of healthcare.
- 08:05So in the US and I think in
- 08:07other parts of the world,
- 08:08we were kind of implementing this model
- 08:10of healthcare that was called fee for
- 08:13service and that model just optimized
- 08:15to lower cost and this led to things
- 08:17like patients receiving worse care,
- 08:19doctors were spending less
- 08:20and less time per patient, being forced to
- 08:21see more patients in one day.
- 08:23They then were like, this isn't what
- 08:25we want, and moved to this value-based
- 08:26healthcare, and that model,
- 08:28you know considered cost because
- 08:29that's obviously a factor but they
- 08:31also optimized to improve patient
- 08:33outcomes and that became just a
- 08:35much better model of healthcare.
- 08:37So I've really looked to other
- 08:39fields for inspiration, and I just
- 08:41want to go over a few other kinds of
- 08:42paradigm shifts that have been
- 08:44inspiring me. I just
- 08:46mentioned fee-for-service healthcare
- 08:48moving to value-based
- 08:50healthcare. Also, in psychiatry,
- 08:51it's not that we've completely
- 08:53abandoned the DSM,
- 08:54but we've moved from the DSM,
- 08:56which is categorical and binary,
- 08:58towards working within RDoC,
- 09:00at least in research,
- 09:01which is more dimensional and continuous.
- 09:03If you think about definitions of health,
- 09:05we've moved away from the
- 09:07biomedical definition of health,
- 09:08which really just focused on lack of illness.
- 09:11Or if we didn't have any data on you,
- 09:13meaning you didn't go to the hospital
- 09:14or you didn't go to the doctor's office,
- 09:16that meant that you were healthy,
- 09:17which wasn't a great measure
- 09:19because especially for different
- 09:20demographic groups that, you know,
- 09:21didn't represent health well.
- 09:23So we moved away from
- 09:24that towards the biopsychosocial definition,
- 09:25which again focused more on
- 09:27functioning and satisfaction,
- 09:29realizing that that was a
- 09:31better definition of health.
- 09:32Now in machine learning,
- 09:33we've really been stuck at this
- 09:35accuracy focus, which has led
- 09:36us to this tunnel vision focus.
- 09:38And I'm going to argue that we
- 09:39need to shift towards utility,
- 09:41which is really just a much bigger holistic
- 09:43perspective of why are we doing this,
- 09:45what do we need to be doing differently.
- 09:48So why do we need a paradigm shift?
- 09:51I think historically, like I
- 09:53mentioned in the current status slide,
- 09:55we've really overpromised solutions
- 09:57and underperformed in bringing
- 09:59these solutions into reality,
- 10:01even within the scientific field.
- 10:03And I think if we shift our priorities
- 10:05away from accuracy towards utility,
- 10:07it's going to allow us to make
- 10:09our goals more concrete and then
- 10:11you know we'll better communicate
- 10:13our results and where we're at.
- 10:15So I just want to move on
- 10:18to defining optimization.
- 10:19Most of you are probably familiar,
- 10:20but this is kind of the step when
- 10:22we're training machine learning models:
- 10:23we're going to have some type of
- 10:25function and we're going to minimize
- 10:26that or maximize that function.
- 10:28Maybe that's the mean squared error or
- 10:30accuracy and this is kind of just the
- 10:32step where we tell our models what is
- 10:34the right and wrong direction to be heading.
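
To make the optimization step concrete, here is a minimal sketch (my own illustration, not material from the talk) of minimizing a mean-squared-error loss by gradient descent; the data, feature count, and learning rate are all made up.

```python
import numpy as np

# Made-up data: e.g. brain-derived features predicting a behavioral score.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)

w = np.zeros(10)   # model parameters
lr = 0.01          # learning rate

for step in range(500):
    residual = X @ w - y
    loss = np.mean(residual ** 2)        # mean squared error: the quantity we minimize
    grad = 2 * X.T @ residual / len(y)   # gradient of the loss w.r.t. the parameters
    w -= lr * grad                       # step in the direction the loss calls "right"

print(f"final training MSE: {loss:.3f}")
```
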
- 10:37Within the machine learning lifecycle,
- 10:40optimization comes up at
- 10:41a few different points.
- 10:43And we kind of begin with defining
- 10:44our objectives.
- 10:45That's thinking about like what are our
- 10:46targets, what are we trying to predict.
- 10:48We then go through a phase of acquiring data,
- 10:50preparing it.
- 10:51We then move into training the model,
- 10:54testing the model.
- 10:55And maybe this is not so much in science,
- 10:57but if you're in industry,
- 10:58you move on to a phase of monitoring it,
- 11:01deploying it, making it accessible to people.
- 11:04Now, optimization comes up,
- 11:05mostly we think about it
- 11:07in this stage, you know,
- 11:08like when we have a loss function,
- 11:10we're training and testing the model
- 11:11and maybe when we're deploying
- 11:13it to make sure it's at least
- 11:15working properly over time.
- 11:16But I think the optimization
- 11:17actually comes up in most places
- 11:19throughout this life cycle,
- 11:20even though we don't think about it there.
- 11:22And I would argue that actually
- 11:24the most important place to think
- 11:25about it is that this initial phase
- 11:27when we're defining what we're
- 11:28doing and what our objectives are,
- 11:30and this is the place that we'll
- 11:32be able to input what our values
- 11:34are and what we care most about.
- 11:36So moving on to what accuracy is,
- 11:39it has a lot of, you know,
- 11:41obvious mathematical definitions
- 11:41but I really think about it as this
- 11:44quest for super high performance.
- 11:45And in this quest for super high performance,
- 11:47it has a very narrow objective of
- 11:49becoming more and more accurate
- 11:50and has a very immediate or
- 11:52short-term action plan for how
- 11:54to achieve this goal,
- 11:55and that's minimizing the loss
- 11:58function on a particular set of data.
- 12:00If you think about utility,
- 12:02however,
- 12:02utility I think is more closely
- 12:04aligned with the model's purpose
- 12:06and that's answering the research
- 12:08question and adding
- 12:10more real world value.
- 12:11It's going to look at the bigger picture
- 12:13and make very creative adjustments
- 12:15to align with the ultimate research
- 12:17goal and real world application,
- 12:19and I hope to provide some
- 12:21concrete examples of this.
- 12:22So bringing all of these different
- 12:24things together, I really feel it's
- 12:26an opportunity to sort of pause,
- 12:28reframe our research questions
- 12:29and ask why we're doing this,
- 12:31where we want to be and how we can
- 12:33get there, and sort of create
- 12:36some criteria for that journey.
- 12:38You know, in the process of setting
- 12:39up some of our optimization problems,
- 12:41I think we've all been so excited,
- 12:43like, this is so cool,
- 12:44we can predict behavior from the brain,
- 12:45but we've convinced ourselves
- 12:46that it really makes sense to
- 12:47just optimize for accuracy.
- 12:49But that's just because it's more
- 12:50easily mathematically formulated
- 12:52than utility.
- 12:53And if you consider the bigger picture again,
- 12:56you think about, you know, our
- 12:57goal isn't to become infinitely
- 12:59more accurate or intelligent.
- 13:01Really,
- 13:01our goal is to do useful things
- 13:03that make life easier for humans,
- 13:04and that would be closely
- 13:07aligned with utility.
- 13:08OK,
- 13:08so I just want to dive a little bit more
- 13:10into accuracy and some of the limitations.
- 13:13But first I have to
- 13:14mention how we measure it.
- 13:15I think it's become clear,
- 13:16but it's really just a single
- 13:18metric that represents the model's
- 13:19ability to predict some kind of
- 13:21observation in a test set of data.
- 13:23If you're working in a
- 13:24classification setting,
- 13:25this might actually be accuracy.
- 13:27If you're in a regression setting,
- 13:28this might be things like mean
- 13:30squared error or the correlation
- 13:32between predicted and observed.
- 13:34And this is again an extreme simplification
- 13:36of the model's performance.
- 13:38And it really doesn't capture
- 13:39any other values.
- 13:40Things like reliability, validity,
- 13:43complexity, fairness, usability.
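
As a rough illustration of the "single metric" evaluation described above, here is a short sketch (mine, not the speaker's) computing classification accuracy, mean squared error, and the predicted-observed correlation; the held-out values are invented, and nothing in the output reflects reliability, validity, fairness, or usability.

```python
import numpy as np
from scipy.stats import pearsonr

def evaluate_regression(y_true, y_pred):
    mse = np.mean((y_true - y_pred) ** 2)           # mean squared error
    r, _ = pearsonr(y_true, y_pred)                 # correlation of predicted vs. observed
    return {"mse": mse, "r": r}

def evaluate_classification(y_true, y_pred):
    return {"accuracy": np.mean(y_true == y_pred)}  # fraction correct

# Hypothetical held-out predictions, just to show the shape of the computation.
y_true = np.array([10.0, 12.0, 9.5, 11.0])
y_pred = np.array([10.5, 11.0, 9.0, 12.0])
print(evaluate_regression(y_true, y_pred))
# Note: no reliability, validity, complexity, fairness, or usability appears here.
```
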
- 13:47We can also think of it again
- 13:50in this loss function setting.
- 13:51It's worth pointing out that it's kind of the
- 13:53same one metric that we're using in
- 13:55our training and testing to say
- 13:57whether this is a good model or not,
- 13:58but then it's the same metric we
- 14:00use in this out-of-sample test set
- 14:03to determine the generalizability.
- 14:05We can also think of accuracy
- 14:07in terms of benchmarking.
- 14:09This is a little bit more abstract.
- 14:10It's when we go to compare one
- 14:12paper or one model, you know,
- 14:13my lab versus your lab,
- 14:15I have a better model than yours.
- 14:16And it's this, you know,
- 14:17comparison of 1 model to another.
- 14:19And we're searching for the very
- 14:21best model and that's determined by
- 14:23being the most accurate or having,
- 14:25you know,
- 14:26the best performance on this
- 14:28one single metric.
- 14:29And this kind of contributes to
- 14:31what I see as this very slow progress,
- 14:33just, you know,
- 14:34most models aren't even statistically
- 14:36significantly improving
- 14:37accuracy, it's like, you know,
- 14:39a 1% increase or something.
- 14:41So it's kind of led us to this:
- 14:42we feel like we're making progress,
- 14:44but it's really slow, and we're
- 14:45kind of actually not
- 14:46truly making any progress.
- 14:48We're just sort of inching along.
- 14:50So I think some of the limitations of
- 14:52accuracy have probably already become clear,
- 14:54but I'll just go over them a little bit.
- 14:57I think there is not a definition of success.
- 15:00And if we don't have a definition of success,
- 15:02you know our goal isn't to become
- 15:04infinitely more accurate or intelligent
- 15:05and without this clear definition
- 15:07of our goals and vision of success,
- 15:09we'll probably never know when we reach it.
- 15:11What does it even mean to become
- 15:14infinitely more intelligent or accurate?
- 15:15What purpose would it serve to
- 15:17be in a world full of agents,
- 15:19machines and humans that are super
- 15:21intelligent? And just to throw out
- 15:23Goodhart's law, which says when a
- 15:24measure becomes a target, it ceases
- 15:26to be a good measure.
- 15:30Sorry, for those of you who don't know
- 15:31Ted Lasso,
- 15:32it's one of my favorite TV
- 15:33shows, and I just wanted to,
- 15:34I think you can understand this example
- 15:37because it's really just about soccer,
- 15:39and this is just a very abstract
- 15:41illustration of the limitations of accuracy.
- 15:43If you consider the goals of the soccer
- 15:45team that's playing in the Premier League,
- 15:46their long term goal is to really,
- 15:48you know,
- 15:48keep winning and have healthy players that
- 15:50are getting along well with each other.
- 15:52And if you consider one player that's
- 15:54like a star player who's really just
- 15:56thinking about their own abilities,
- 15:57their own performance,
- 15:58that's Jamie Tartt from Ted Lasso.
- 16:02It might be good in the short term,
- 16:04like they're going to win the game,
- 16:05but every other player on the
- 16:06team is going to hate them.
- 16:07It's going to kind of contribute to this
- 16:10bit of a toxic culture, versus the team captain,
- 16:13right,
- 16:13Roy Kent,
- 16:14who's not just considering their own needs
- 16:16and is a bit more broadly focused;
- 16:18they're really better for the overall
- 16:20long-term goal of sustainability.
- 16:23And lastly, a few more
- 16:25concrete examples of why accuracy
- 16:27hasn't been working for us.
- 16:29We've obviously been chasing
- 16:31higher and higher accuracy,
- 16:32but having high accuracy does not
- 16:34imply that we have reproducibility,
- 16:37or that our models are meaningful,
- 16:38meaning that the features we're
- 16:40using are better than random.
- 16:41It doesn't come with any explainability,
- 16:44and having equal accuracy does not imply
- 16:46that two models learned in the same way.
- 16:52OK. So moving away from accuracy towards utility,
- 16:56which I think is where we need to be.
- 16:58Before I talk about measuring utility,
- 17:01it's really important to say that
- 17:03before you can measure utility,
- 17:05you really have to look at yourself
- 17:07and say what are our core values?
- 17:08And these are the values that I
- 17:10mentioned earlier and not only
- 17:12identify what the values are,
- 17:13but rank them in priority,
- 17:15what's the most important to you?
- 17:17And I also just want to point
- 17:18out that this is not going to be
- 17:20something that's consistent across
- 17:21all research questions, all models.
- 17:23It's very context-
- 17:24dependent, and even within your own
- 17:26work it might change depending on the day,
- 17:29week, or paper.
- 17:30So it's a very iterative process.
- 17:34For the purposes of measuring utility,
- 17:36we haven't really been doing
- 17:37it very much in our own field.
- 17:38So I sort of looked to a couple of other
- 17:41fields that have been doing this work more.
- 17:43And the first field that I found
- 17:45was actually the ethical AI field.
- 17:47And I found this really wonderful paper
- 17:49that started to sort of assess what the
- 17:51values were in the field.
- 17:53This was the paper that won best paper
- 17:55at the leading ethical AI conference,
- 17:58and it's called "The Values Encoded
- 18:00in Machine Learning Research."
- 18:01And what they did, this graph
- 18:02is a little bit hard to see,
- 18:04but the values at the front
- 18:06represent things like accuracy,
- 18:07even though the papers
- 18:08that were evaluated were ones associated
- 18:11with this ethical AI conference.
- 18:13So things like user rights and
- 18:15ethical principles,
- 18:15which are in red and purple,
- 18:17are much lower on the chart.
- 18:19And so this is not really solving the
- 18:21problem, but was sort of just saying,
- 18:23you know, what are our values?
- 18:24Are they represented in our work?
- 18:25So that's kind of the place
- 18:27that you have to start.
- 18:29At the same conference, kind of
- 18:31another group took it a step further
- 18:33and said we need to go a little bit
- 18:35beyond just accepting, like, what the
- 18:37existing values are in the literature.
- 18:39We need to design a framework for how we can,
- 18:42you know,
- 18:42assess them consistently going forward.
- 18:44So they introduced a paper called
- 18:46"Towards a Multi-stakeholder
- 18:48Value-based Assessment Framework
- 18:50for Algorithmic Systems," and I just
- 18:52want to briefly walk you through it
- 18:54because it has really inspired me.
- 18:56So you first have to start by kind
- 18:57of laying out what your values are
- 18:59and they decided to put this on the
- 19:02wheel because some of your values are
- 19:03going to conflict with each other.
- 19:05Now, all of these values
- 19:07might not work for our field,
- 19:08but you could think about updating
- 19:10them for our field.
- 19:11So here, the primary value that
- 19:13they cared about was privacy and
- 19:16privacy is kind of in conflict
- 19:18with transparency maybe.
- 19:18So values that are on opposite
- 19:20sides of the wheel,
- 19:22you know you can't prioritize
- 19:23them at the same time.
- 19:24So the first step is identifying:
- 19:26What's your top value?
- 19:27You then have to kind of go
- 19:29on to taking that value,
- 19:31identifying the criteria of
- 19:32that value and how it manifests.
- 19:34So in the example of privacy,
- 19:37some of the criteria might look like
- 19:39data protection or the right to erase the data,
- 19:41and some of the more specific manifestations
- 19:44might be things like purpose,
- 19:46statement of data collection,
- 19:47statement of how long the data is kept.
- 19:49So clearly this is a bit messier
- 19:51of a process
- 19:52than one simple formula for accuracy,
- 19:54but it's messy and necessary.
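
For readers who want a concrete handle on the value → criteria → manifestations breakdown just described, here is a minimal sketch of one way to write it down as a checklist structure; this is my own simplification, not the cited framework, and the privacy entries simply echo the examples above.

```python
from dataclasses import dataclass, field

@dataclass
class Value:
    name: str
    rank: int                                  # 1 = top priority for this project
    criteria: dict[str, list[str]] = field(default_factory=dict)  # criterion -> manifestations

# Privacy example from the talk, written as criteria and concrete manifestations.
privacy = Value(
    name="privacy",
    rank=1,
    criteria={
        "data protection": [
            "purpose statement for data collection",
            "statement of how long the data is kept",
        ],
        "right to erasure": [
            "documented procedure for deleting a participant's data",
        ],
    },
)

# A project would keep one ranked list like this and revisit it per question or paper.
project_values = [privacy]  # e.g. followed by transparency, fairness, ...
```
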
- 19:59After we kind of measured it and
- 20:00know what we're optimizing for,
- 20:02we can move on to actually optimizing it.
- 20:04I'm going to provide some concrete examples,
- 20:07again sort of from the ethical
- 20:09AI machine learning field,
- 20:11but I'll try to also bring it to our field.
- 20:13So in this example,
- 20:15fairness was a priority value.
- 20:16This is from a paper that was published,
- 20:18I think about two years ago in
- 20:21Nature Medicine, and they took
- 20:24knee MRIs and looked to predict
- 20:27the patient's pain score instead
- 20:29of the radiologist's diagnosis.
- 20:31And this is still a supervised
- 20:33learning problem,
- 20:34so they're predicting a score from an image,
- 20:37but the score that most people
- 20:39predict is this radiologist diagnosis,
- 20:41which is a standard measure of pain severity.
- 20:44But in the same data set they
- 20:46also had patient self reported
- 20:47pain, and what this simple,
- 20:49you know,
- 20:49switching up of the target led to was them
- 20:52discovering that, relative to the radiologist
- 20:54diagnosis,
- 20:54which only accounted for 9% of
- 20:57racial disparities in pain, using
- 20:59the self-reported pain accounted
- 21:01for 43%, or almost five times
- 21:03more, of the racial disparities in pain.
- 21:05So it's still a supervised learning problem,
- 21:07they're actually still optimizing for
- 21:08something like mean squared error,
- 21:10but just by switching this in some
- 21:12kind of a creative way,
- 21:13they were able to better align
- 21:15with what their value was,
- 21:16in this case fairness.
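
A minimal sketch, loosely inspired by the knee-MRI example above, of the point being made: the supervised pipeline and the error metric stay the same, and only the prediction target is swapped. The features, targets, and model choice (a ridge regression) are hypothetical placeholders, not the published study's method.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

def train_and_eval(X_train, y_train, X_test, y_test):
    """Same supervised pipeline regardless of which target we choose."""
    model = Ridge(alpha=1.0).fit(X_train, y_train)
    return mean_squared_error(y_test, model.predict(X_test))

# Hypothetical arrays: image-derived features plus two candidate targets.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 50))
radiologist_grade = rng.normal(size=300)     # the conventional target
self_reported_pain = rng.normal(size=300)    # the value-aligned target

split = 200
for name, y in [("radiologist grade", radiologist_grade),
                ("self-reported pain", self_reported_pain)]:
    mse = train_and_eval(X[:split], y[:split], X[split:], y[split:])
    print(f"target = {name}: test MSE = {mse:.2f}")
```
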
- 21:19This is another example of fairness.
- 21:20I'll go through it quickly.
- 21:23They looked at equal opportunity
- 21:26and multi objective optimization.
- 21:28This is in the same FAccT conference
- 21:30on ethical AI, and this was again
- 21:32a supervised learning framework,
- 21:33and they wanted to not necessarily
- 21:35pay a trade-off in fairness for
- 21:37accuracy; they realized that both
- 21:39were really important.
- 21:40So they set up a really nice joint
- 21:43optimization problem where they
- 21:44could find the optimal solution that
- 21:46represented both values. It's a bit complicated;
- 21:49I have a lot more examples about
- 21:51this if you want to talk more later.
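
As an illustration of what a joint accuracy-plus-fairness objective can look like, here is a small sketch combining a standard prediction loss with a penalty on the equal opportunity gap. This is my own toy version rather than the cited paper's method; a real approach would use a differentiable relaxation or search the Pareto front, and all data below is invented.

```python
import numpy as np

def equal_opportunity_gap(y_true, y_score, group, thresh=0.5):
    """|TPR(group 0) - TPR(group 1)| among truly positive cases."""
    y_pred = (y_score >= thresh).astype(int)
    tprs = []
    for g in (0, 1):
        mask = (group == g) & (y_true == 1)
        tprs.append(y_pred[mask].mean() if mask.any() else 0.0)
    return abs(tprs[0] - tprs[1])

def joint_objective(y_true, y_score, group, lam=1.0):
    eps = 1e-9
    bce = -np.mean(y_true * np.log(y_score + eps)
                   + (1 - y_true) * np.log(1 - y_score + eps))  # accuracy-related term
    return bce + lam * equal_opportunity_gap(y_true, y_score, group)  # fairness term

# Usage with made-up scores for 8 subjects in two groups; lam trades off the two values.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.1, 0.3])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(joint_objective(y_true, y_score, group, lam=1.0))
```
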
- 21:53The last example I'm going
- 21:55to talk about is,
- 21:56other than fairness being a value
- 21:57that we want to optimize for,
- 21:59maybe things like efficiency and
- 22:01usability might be our priority value.
- 22:03I've been inspired by the human-computer
- 22:07interaction literature specifically.
- 22:08There's been a lot of work on learning to
- 22:10optimize for teamwork, recognizing that,
- 22:13and this kind of goes with the medical example,
- 22:15it's never going to be just AI alone.
- 22:17There's always going to be a
- 22:19person there that it's communicating with.
- 22:20And how can we optimize for
- 22:22that relationship? How can we
- 22:24not only get people like
- 22:26doctors to use the model,
- 22:27but how can we introduce the
- 22:29information to them at the right time?
- 22:31So there was a couple of papers
- 22:33from Microsoft that really tackled
- 22:34this problem deeply and came up
- 22:36with some really nice solutions.
- 22:38And just to share one example from
- 22:40the brain predictive modeling field,
- 22:42this comes from my own work where I've
- 22:45really focused on not just like sharing code,
- 22:47which is a nice step,
- 22:48but how do I actually make these things
- 22:51accessible to people without the
- 22:53same computational skills or resources.
- 22:56So my first two papers focused on sharing
- 22:58pre-trained normative models.
- 23:01I then took a step further to kind of
- 23:04write a protocol of how to do it, and
- 23:07even a step further, designing a website
- 23:10where people can just upload a CSV
- 23:12and drag and drop things. So again,
- 23:14all of these examples are very creative.
- 23:16Not as straightforward solutions as accuracy,
- 23:19but they really get us closer
- 23:21to what we truly care about.
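
To give a sense of what applying a shared pre-trained normative model involves, here is a minimal sketch; `predict_reference` is a hypothetical stand-in for a downloaded model (this is not the actual tooling or website described), and the coefficients are made up.

```python
def predict_reference(age, sex):
    """Hypothetical pre-trained normative model: returns (predicted mean,
    predictive sd) for one brain measure given the covariates."""
    mean = 3.5 - 0.01 * age + 0.05 * sex   # made-up coefficients
    sd = 0.3                                # made-up predictive uncertainty
    return mean, sd

def deviation_score(observed, age, sex):
    """Express an individual's measure as a z-score relative to the reference."""
    mean, sd = predict_reference(age, sex)
    return (observed - mean) / sd

# Usage with a made-up new participant (e.g. cortical thickness of one region):
print(deviation_score(observed=2.9, age=30, sex=1))
```
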
- 23:23Hopefully some of the benefits
- 23:25have come across already.
- 23:26But just to say this concretely,
- 23:28I think if we think more
- 23:30about optimizing for utility,
- 23:31it becomes more collaborative,
- 23:33efficient, and there's just
- 23:35a more well-defined purpose.
- 23:37It's functional and has real-world
- 23:39meaning, rather than the attractive,
- 23:41shallow, surface-level
- 23:42appeal that intelligence has.
- 23:44It's really an opportunity to
- 23:46think deeply and align our models
- 23:48with what we really care about.
- 23:50And creative thinking and problem
- 23:52solving, as I mentioned, are required.
- 23:54It's going to be more of a challenge,
- 23:56but that also means I think our
- 23:57solutions will be much more satisfying.
- 23:59OK,
- 24:00there's obviously some roadblocks
- 24:01that are involved in this process.
- 24:03We have a lot of cognitive biases that
- 24:06might make us focus on much simpler
- 24:08problems as problem complexity increases,
- 24:10or that tend to shift our responsibility,
- 24:13thinking along the lines of,
- 24:14this is somebody else's problem.
- 24:16Maybe especially in academia,
- 24:17we think this is more of an industry problem.
- 24:19But actually, especially in mental health,
- 24:22there's not that many people
- 24:23in industry doing this.
- 24:24It's mostly within academia.
- 24:26So I think it really,
- 24:28it is our responsibility.
- 24:29Obviously making the utility
- 24:31explicit is more difficult,
- 24:33but we're scientists,
- 24:34we love a challenge,
- 24:35so I challenge us.
- 24:37The next roadblock is kind of where
- 24:39things happen in development versus
- 24:41where they're really deployed
- 24:42and when they become useful.
- 24:44So during development,
- 24:45we have often a single decision maker.
- 24:47This is often the person,
- 24:49the machine learning engineer or
- 24:51the PhD student training the model.
- 24:53We're often in a situation where
- 24:54the data is stationary because we're
- 24:56working with secondary data analysis.
- 24:58So it's kind of fixed.
- 24:59It is the data.
- 25:00We can't necessarily change it,
- 25:02but really in the real world,
- 25:04there's going to be many people involved.
- 25:05There's going to be,
- 25:07as Simon mentioned, maybe,
- 25:08you know, a medical doctor,
- 25:10a patient involved.
- 25:11All of these people will have different
- 25:12values and different value priorities.
- 25:14So there's no way around it.
- 25:16It's going to be complicated and
- 25:18requires a lot of iteration and open
- 25:21discussion, and valuing diverse, you
- 25:23know, opinions, and mapping them out.
- 25:26Also, our data of course in the real
- 25:29world is often incomplete
- 25:31and is always evolving.
- 25:33And again another thing,
- 25:35I've kind of brought this up
- 25:37when I mentioned the multi
- 25:38objective optimization.
- 25:39Sometimes we think about our values
- 25:41as in conflict with one another.
- 25:44Like if I focus on fairness,
- 25:45I'll pay a price in accuracy.
- 25:47And then if I go to having, you know,
- 25:50more than just two values that I care about,
- 25:53it becomes infinitely harder and you
- 25:55can't optimize for everything at once.
- 25:56And I think this can be fixed with just
- 25:58really sitting down and, for each problem
- 26:00we're trying to solve, focusing on
- 26:02what is most important to us.
- 26:03And again, this is not always
- 26:06going to be the same value.
- 26:08And done a few future directions
- 26:11in this work.
- 26:12I've been really inspired
- 26:13by a lot of other fields.
- 26:14I've been thinking about utility,
- 26:16defining it and optimizing for it already
- 26:18and we really need to learn from them.
- 26:20I showed you 2 examples
- 26:22from machine learning,
- 26:23the ethical AI papers and the
- 26:25human computer interaction papers.
- 26:27There's also behavioral economics, which
- 26:29has really tackled mapping utility
- 26:31functions and modeling them mathematically.
- 26:34So I think we can look to them for
- 26:36inspiration, and then value-based
- 26:37healthcare has really tackled
- 26:39how to measure very complex outcomes,
- 26:41so we can also be inspired by them.
- 26:45And again,
- 26:46everything that we're talking about
- 26:47with utility and our value priorities
- 26:50is going to depend on the context.
- 26:52And we really need to keep open
- 26:54communication and guidelines
- 26:55about making these decisions.
- 26:56I think the field has been
- 26:58really great about, you know,
- 26:59setting up like standards for
- 27:01reporting methods or, you know,
- 27:02recognizing these things.
- 27:03But we haven't really worked on it in
- 27:05this post analysis machine learning space.
- 27:07So I think we're ready for it.
- 27:10And just some take home messages,
- 27:12we've really had too much of a
- 27:14tunnel vision focus on the accuracy
- 27:16of our predictive models,
- 27:17which has made us lose track of why
- 27:19we're doing this and it's sort of
- 27:21created this lack of model utility.
- 27:23I think it needs to be a priority
- 27:25to define our values,
- 27:26which is going to build a better plan
- 27:28for moving towards these goals and values.
- 27:30And optimizing for utility
- 27:31is really an abstract,
- 27:32creative process that requires
- 27:34diverse perspectives and input,
- 27:36and it's going to be a very iterative,
- 27:37ongoing process.
- 27:39And with that,
- 27:40I just want to acknowledge my team and
- 27:42everyone who's been supportive.
- 27:44Also the team I work with in Ann Arbor,
- 27:46Michigan, and my dog Charlie Mott,
- 27:48who has really accompanied me on
- 27:49all of these thought experiments.
- 27:59Any questions for Saige? Right.
- 28:03I quite like the message. It gave me a lot to think about.
- 28:10One thing, maybe I missed something,
- 28:14but [partially inaudible] where does explainability
- 28:16fit in, is it part of
- 28:19the utility component?
- 28:24Yeah. So the question was, like...
- 28:29OK. Yeah, lots to think about with this very,
- 28:32yeah, as I mentioned, philosophical talk,
- 28:34but the question was: I'm saying
- 28:36we need to optimize for utility,
- 28:38but then you brought up, like, explainability
- 28:40and how that fits in.
- 28:41Utility I sort of see as
- 28:44a manifestation of what our values
- 28:46are, and that's context-dependent.
- 28:48Explainability might be that value, it might
- 28:51actually be utility or usefulness, but I'm
- 28:54sort of using utility as this, like, you know,
- 28:57you can sort of think of it as encompassing
- 28:58our values.
- 28:59So explainability falls into it,
- 29:01but it might not always be explainability.
- 29:05For example, a lot of times these models
- 29:08[partially inaudible] can provide us
- 29:11predictions, but that doesn't
- 29:14necessarily provide insight.
- 29:16It doesn't really tell us whether we
- 29:18actually
- 29:20learned about
- 29:22the system from the particular model,
- 29:24and sort of why
- 29:25that works.
- 29:27Yeah, I think explainability is
- 29:29really important, and also accuracy
- 29:31could fall within, you know,
- 29:34utility; like, having an accurate model
- 29:36is really useful, and if it's completely,
- 29:38you know, always worse than chance,
- 29:40it's not that useful a model.
- 29:41So I'm not saying that accuracy is wrong,
- 29:44it's just that we've only been focusing
- 29:45on that, and there's a lot of other
- 29:47values that we have that I think we
- 29:49need to consider, and maybe they need
- 29:51to come first; like explainability,
- 29:53especially within neuroscience, might
- 29:55be one that, you know, takes priority.
- 29:58I think, to comment on that,
- 30:00there's a lot of that in
- 30:01medicine now, like there's tons of,
- 30:03you know they use drugs all the
- 30:04time on off label things and they
- 30:06don't really know why they work,
- 30:07but they work and the key
- 30:09thing is that they work right.
- 30:10And then, in terms of if you put
- 30:13this in terms of radiology,
- 30:16I guess depending on the application,
- 30:19right, whether you have a screening
- 30:21application or a diagnostic application,
- 30:22you'll accept, you know,
- 30:23different rates of false positives
- 30:25or false negatives.
- 30:26So I think that it is essential
- 30:28to keep in mind what the task is,
- 30:31what the goals are,
- 30:32and there's levels of explainability,
- 30:33right, you know?
- 30:36Probably the mechanistic insight
- 30:40[remainder partially inaudible].
- 30:45Yeah.
- 30:50Yeah, there's countless
- 30:52examples of that not being true.
- 30:55Well, I mean, do you think that,
- 30:58you know, stimulating some
- 31:00white matter tract will alleviate depression?
- 31:02You know, maybe it works some of the time.
- 31:05So hey, it works. And if it doesn't
- 31:06work we'll try something else.
- 31:08They can stick in a vagal nerve
- 31:10stimulator and stop epilepsy.
- 31:11Do they know how that works? No,
- 31:12but it works.
- 31:15But it leads to a much better...
- 31:18Yeah, sure.
- 31:19That's why we do the mechanistic work,
- 31:25or whatever. I don't
- 31:26want to take over, Saige.
- 31:29I just want to say also, I think, like, I
- 31:31think fairness has become a value.
- 31:32There's been a lot of you know,
- 31:34Todd's Nature papers are a great example.
- 31:35Simon brought up a couple of ones.
- 31:37But, you know, is it generalizability?
- 31:40It, you know, doesn't hold across a
- 31:42lot of different subgroups and things.
- 31:43We're not even measuring a
- 31:44lot of these constructs.
- 31:45Well, so yeah, I don't have...
- 31:48I only have examples, not answers.
- 31:50But my purpose is really to say we
- 31:52need to be talking and thinking
- 31:54about this before moving forward.
- 31:55Simon? Oh, and yeah, I'll repeat Simon's
- 31:57question so that we have it.
- 32:03Are we actually at a stage where we
- 32:06can make compromises between accuracy
- 32:08and utility if our accuracy for most
- 32:12clinical diagnosis is less than 80% or
- 32:16for behavioral predictions like R = .28?
- 32:19I mean, if we give up anything
- 32:21from already close to nothing,
- 32:23we are left with nothing?
- 32:26So the crux of the question is that,
- 32:28you know, should we not focus on accuracy
- 32:30if we're already really bad at accuracy?
- 32:32Again, I sort of started thinking about
- 32:34this not only in the neuroscience, like
- 32:36brain behavior predictive modeling, field.
- 32:38I really was thinking about it as machine
- 32:40learning as a whole and
- 32:42our hyper-focus on becoming super
- 32:44intelligent, on general intelligence.
- 32:46So I was thinking a bit broader than that
- 32:48and that's really where this is coming from.
- 32:50But again I think accuracy can still be
- 32:53a value, it's just not our only value.
- 32:57Well, as Breakspear pointed out,
- 32:59the bar is pretty low, so
- 33:03we can predict pain quite well.
- 33:05Or, I mean, maybe arguably.
- 33:10[inaudible crosstalk]
- 33:21Umm, would you recommend
- 33:25altering how you do it on the input side,
- 33:27like, you know, what kind of features you use,
- 33:29how you segregate groups or whatever,
- 33:31or would you advocate doing something
- 33:34on the implementation side,
- 33:36changing the algorithm?
- 33:38Like, what do you think is the... I think
- 33:40there's not one answer to it.
- 33:42I think going back to that machine
- 33:44learning lifecycle, it really has to
- 33:46start with defining the objective.
- 33:48Are we predicting the right
- 33:50thing, with the right features,
- 33:51with the right outcome?
- 33:52You know, and that radiologist example,
- 33:54all they did, they were still using
- 33:56the same input of knee MRI and all
- 33:58they did was change the target.
- 34:00So instead of predicting the radiologist
- 34:02diagnosis, they predicted pain,
- 34:03you know, self reported pain.
- 34:05But that's not something
- 34:07I can give an answer to always.
- 34:09It's always going to be dependent
- 34:11on what's available.
- 34:12What are you prioritizing for?
- 34:14It just really requires
- 34:16careful thought about,
- 34:17you know what,
- 34:18what are you trying to change
- 34:19and what do you value most?