
Saige Rutherford “Value-based Machine Learning: Optimizing for Utility Over Intelligence”

March 07, 2023
ID: 9609

Transcript

  • 00:06 So, our next speaker is Saige Rutherford from the Donders Institute. She's going to talk about value-based machine learning, and she's got a little bit of negativity in her talk too.
  • 00:19 OK, can people hear me? OK, good morning. I'm excited to be here talking to you about a similar but slightly different topic; I'll take a little bit more of a philosophical, big-picture view.
  • 00:42 So today I'm going to introduce a couple of things. I hope to go over some of the goals and values of our field. I want to talk about the current state of the field, using my own journey in it as an example. I want to go over a few definitions, because I'm using words that have a lot of different definitions, things like optimization, accuracy, and utility, so I want to make sure we're on the same page. Then I want to dive into what I mean by intelligence and accuracy: how we're measuring it, how we optimize for it, and what some of the limitations of the framework we've been focusing on are. Then I'll dive into utility and talk about, again, how we measure it, how we might optimize for it, and what I see as the benefits of thinking from this perspective. And then I'll talk a little bit about what's next: what I see as some of the roadblocks, future directions, and take-home messages.
  • 01:31 So I'd like to begin with a warning. I already mentioned that this talk is going to be a little bit different from the others. If you were reading it as a paper, it would probably be an opinion piece or something. It's very philosophical, even though I come from statistics and computer science. Over the last year this has really been a thought experiment for me, and I've been thinking about things from a very philosophical perspective. The second warning is that it might come off as a little bit provocative and critical of the field. I hope to ground that in examples of my own work, and I think as scientists we need to be critical of our own work, so we will proceed with caution.
  • 02:07 The second note is that I know this is the Whistler Workshop on Brain Functional Organization, Connectivity and Behavior. When I was thinking about all of these ideas, I was really thinking more from a machine learning perspective, not necessarily a brain perspective. So when I use words like artificial intelligence, machine learning, or accuracy, you can just replace them with brain-behavior predictive modeling. You can think of all of these as interchangeable; it's just a bit of a mouthful to keep saying brain-behavior predictive modeling.
  • 02:37 OK. So, to first talk a little bit about where we're at and the goals of our field: I think the goals are quite well defined. We have this maybe high-dimensional, maybe low-dimensional space, with axes representing things like biological measurements; this would include our neuroimaging data. We also have dimensions representing our behavioral tests, our self-report measures, maybe environmental factors and lifestyle choices. Our goal is really to learn functions that map between these spaces, such that we can use one thing to predict another: maybe we're using the brain to predict behavior, or vice versa. So I think our goals have been fairly well defined for a few years.
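A minimal sketch of the kind of brain-to-behavior mapping described above (not from the talk): a cross-validated ridge regression from a brain-feature matrix `X` to a behavioral score `y`, with placeholder random data standing in for real measurements.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

# Placeholder data: rows are participants, columns are brain features
# (e.g., connectivity edges); y is a behavioral score such as cognition.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1000))   # biological measurements
y = rng.normal(size=500)           # behavioral measure

# Learn a function mapping the brain space to the behavioral space.
model = RidgeCV(alphas=np.logspace(-3, 3, 13))
y_pred = cross_val_predict(model, X, y, cv=5)

# The usual single-number summary: correlation of predicted vs. observed.
print(np.corrcoef(y, y_pred)[0, 1])
```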
  • 03:19 However, if you move to our values, I think maybe we might think that we know them. But especially when it comes to prioritizing them, to ranking which values are most important to us, this is where it's a bit more complicated. These include things like validity, reliability, explainability, fairness, accountability, usability, and impact. And you might say, well, all of these things are important to me. But as you'll see throughout this talk, when you want to start optimizing for one of these, we really do need to rank them and have a priority, and I think that ranking is a little bit less clear. It's really something we need to do to move us forward, so I hope to inspire you with that in this talk.
  • 04:01 I just want to cover a little bit of my journey in this field and how I got to this topic. I worked as a data scientist at the University of Michigan for five years, where I was working on a lot of the models Simon talked about, like brain age and predicting cognition. I was here at Whistler in 2018 giving a talk called the developmental mega-sample, where my message was really that we just need more big data: if we combine all of these samples and work with this low-dimensional brain basis set, that will answer all of our questions. I spent years working on that, and then learned that our models look a lot like this, what I call the spaghetti plot of the brain: we're not really learning anything, and we're not even predicting things that well. So there's a problem there.
  • 04:39 I came back to Whistler in 2020, was on the deep learning bandwagon, and thought, OK, maybe deep learning is the answer. But it turned out to really just be a more complicated method to answer the same question. I then moved to the Netherlands before I started my PhD, and I've been working on individual differences, training large models and transferring them to clinical data. These models started to be better than spaghetti plots; there was a bit more careful modeling of uncertainty and such things.
  • 05:07 But here I am in 2023 talking about value-based machine learning, and that's because I still felt like something was missing in the models that I was using. Maybe I'm learning a bit better of a model, but there's still something missing that I care about in the work that is not being captured. That's really what I've been thinking about over the last year: this value-based framework.
  • 05:27 That is sort of represented in my journey, but just to make it a bit more general: I think if you embark on a brain-behavior predictive modeling analysis, this is the kind of journey that you would take. As a PhD student or data analyst, you begin by combining a bunch of different open datasets from all of these amazing resources that have been shared. However, once you put them together, you realize there's not a lot of phenotypic overlap; maybe if you're lucky you have age, sex, and cognition. You still go on to fit a bunch of different models, ranging from simple things like linear regression to more complex things like deep learning. You then realize that there's not a lot of signal in the data: you can barely predict age, maybe within three to five years of error.
  • 06:10 You're still a scientist and have to go on to publish your results anyway, taking one of two paths. One is being slightly optimistic, you know, slightly overselling the interpretation or the potential; the other is being a bit more honest, like Simon sharing his honesty point, saying maybe using MRI doesn't help that much, but then you have trouble finding a journal that will publish this perspective. You then repeat, or you leave for data science industry or a field where machine learning can have more impact. You know, psychiatry is our field, so I'm not giving up on it yet, but the cycle repeats again.
  • 06:45 If you're reading a brain-behavior predictive modeling paper, you could probably take any paper and fill out this fill-in-the-blank template, and it would go something like this: fluid intelligence; clinical potential one day; reference to the Marek et al. Nature paper, even though they don't do prediction, just univariate association; we need a bigger sample size; using the HCP or UK Biobank data; correlation of 0.28 between predicted and observed; for reliability, brain age; no confound correction, could be motion. These are, you know, Simon, you really illustrated this.
  • 07:25 So over the last few slides I've just wanted to point out that it's not that we're doing things wrong; it's just that we've sort of been stuck as a field. We have this great question of wanting to relate brain to behavior, but we're sort of at a standstill, making these very tiny little improvements. And I think it's because we've really been focusing on the wrong thing, and that's trying to get more and more accurate models. And yes, of course that's important and useful, but it's not the only thing we care about, and we need to consider the other values that we have in this work.
  • 07:53 So that really led me into thinking about what I call value-based machine learning. I think most of us probably know what machine learning is, so I'll just explain the value-based part a little bit more. I discovered this when looking at different models of healthcare. In the US, and I think in other parts of the world, we had been implementing a model of healthcare called fee-for-service, and that model just optimized to lower cost. This led to things like patients receiving worse care, and doctors spending less and less time with each patient because they were forced to see more patients in one day. They then said, this isn't what we want, and moved to value-based healthcare. That model, you know, still considered cost, because that's obviously a factor, but it also optimized to improve patient outcomes, and that became just a much better model of healthcare.
  • 08:37 So I've really looked to other fields for inspiration, and I just want to go over a few other paradigm shifts that have been inspiring me. I just mentioned the shift from fee-for-service healthcare to value-based healthcare. Also, in psychiatry, not that we've completely abandoned the DSM, but we've been moving from the DSM, which is categorical and binary, towards working within RDoC, at least in research, which is more dimensional and continuous. If you think about definitions of health, we've moved away from the biomedical definition of health, which really just focused on lack of illness: if we didn't have any data on you, meaning you didn't go to the hospital or the doctor's office, that meant that you were healthy. That wasn't a great measure, because especially for different demographic groups it didn't represent health well. So we moved away from that towards the biopsychosocial definition, which focuses more on functioning, realizing that that was a better definition of health.
  • 09:32 Now, in machine learning, we've really been stuck on this accuracy focus, which has given us tunnel vision, and I'm going to argue that we need to shift towards utility, which is really just a much bigger, holistic perspective on why we are doing this and what we need to be doing differently.
  • 09:48 As for why we need a paradigm shift: I think historically, as I mentioned on the current-status slide, we've really overpromised solutions and underperformed in bringing those solutions into reality, even within the scientific field. And I think if we shift our priorities away from accuracy towards utility, it's going to allow us to make our goals more concrete, and then we'll better communicate our results and where we're at.
  • 10:15 So I just want to move on to defining optimization. Most of you are probably familiar with it, but this is the step when we're training a machine learning model where we have some type of function and we minimize or maximize that function; maybe that's the mean squared error, or accuracy. This is the step where we tell our models what the right and wrong directions to be heading in are.
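As a hedged, minimal sketch of that optimization step (placeholder data, plain gradient descent on a mean squared error objective; illustrative only, not code from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                                  # features
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)   # target

w = np.zeros(10)   # model parameters
lr = 0.01          # learning rate

for step in range(1000):
    residual = X @ w - y                 # predicted minus observed
    loss = np.mean(residual ** 2)        # the objective: mean squared error
    grad = 2 * X.T @ residual / len(y)   # gradient of the loss w.r.t. w
    w -= lr * grad                       # move in the "right direction"
```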
  • 10:37 Within the machine learning lifecycle, optimization comes up at a few different points. We begin by defining our objectives: thinking about what our targets are, what we are trying to predict. We then go through a phase of acquiring data and preparing it. We then move into training the model and testing the model. And, maybe not so much in science, but if you're in industry, you move on to a phase of deploying it, monitoring it, and making it accessible to people. Now, we mostly think about optimization in the training stage, you know, when we have a loss function and we're training and testing the model, and maybe when we're deploying it, to make sure it's at least working properly over time. But I think optimization actually comes up in most places throughout this lifecycle, even though we don't think about it there. And I would argue that the most important place to think about it is that initial phase, when we're defining what we're doing and what our objectives are, because this is where we'll be able to input what our values are and what we care most about.
  • 11:36 So, moving on to what accuracy is: it has a lot of obvious mathematical definitions, but I really think about it as this quest for super high performance. In this quest, it has a very narrow objective of becoming more and more accurate, and a very immediate, short-term action plan for how to achieve this goal, and that's minimizing the loss function on a particular set of data. If you think about utility, however, utility I think is more closely aligned with the model's purpose, and that's answering the research question and adding more real-world value. It looks at the bigger picture and makes very creative adjustments to align with the ultimate research goal and real-world application, and I hope to provide some concrete examples of this.
  • 12:22 So, bringing all of these different things together, I really feel it's an opportunity to sort of pause, reframe our research questions, ask why we're doing this, where we want to be, and how we can get there, and create some criteria for that journey. You know, in the process of setting up some of our optimization problems, I think we've all been so excited, like, this is so cool, we can predict behavior from the brain, but we've convinced ourselves that it really makes sense to just optimize for accuracy. And that's just because it's more easily mathematically formulated than utility. If you consider the bigger picture again, our goal isn't to become infinitely more accurate or intelligent. Really, our goal is to do useful things that make life easier for humans, and that would be closely aligned with utility.
  • 13:08 OK, so I just want to dive a little bit more into accuracy and some of its limitations. But first I have to mention how we measure it. I think it's become clear, but it's really just a single metric that represents the model's ability to predict some kind of observation in a test set of data. If you're working in a classification setting, this might literally be accuracy; if you're in a regression setting, this might be something like the mean squared error or the correlation between predicted and observed. And this is an extreme simplification of the model's performance and traits; it really doesn't capture any of our other values, things like reliability, validity, complexity, fairness, and usability.
  • 13:47 We can also think of it again in this loss-function setting. It's worth pointing out that it's essentially the same single metric that we use in training and testing to decide whether the model is any good, and then it's the same metric we use on the out-of-sample test set to determine generalizability.
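A hedged sketch of that single-number summary on a held-out test set (placeholder data; only the scikit-learn model and metric names are real, everything else is illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 50))               # brain features
y = 0.5 * X[:, 0] + rng.normal(size=400)     # behavioral target

# Same metric twice: on training data (does it fit?) and on the
# held-out test set (does it generalize?).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = Ridge().fit(X_tr, y_tr)

pred = model.predict(X_te)
print("test MSE:", mean_squared_error(y_te, pred))
print("r(predicted, observed):", np.corrcoef(y_te, pred)[0, 1])
# Neither number says anything about reliability, fairness, or usability.
```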
  • 14:05 We can also think of accuracy in terms of benchmarking. This is a little bit more abstract: it's when we go to compare one paper or one model to another, you know, my lab versus your lab, I have a better model than yours. It's this comparison of one model to another, where we're searching for the very best model, and that's determined by being the most accurate, by having the best performance on this one single metric. And this contributes to what I see as very slow progress: most models aren't even statistically significantly improving accuracy, it's like a 1% increase. So it's led us to feel like we're making progress, but it's really slow, and honestly we're not truly making much progress; we're just sort of inching along.
  • 14:50 So I think some of the limitations of accuracy have already become apparent, but I'll just go over them a little bit. I think there is no definition of success. Our goal isn't to become infinitely more accurate or intelligent, and without a clear definition of our goals and a vision of success, we'll probably never know when we reach it. What does it even mean to become infinitely more intelligent or accurate? What purpose would it serve to be in a world full of agents, machines, and humans that are superintelligent? And just to throw out Goodhart's law, which says that when a measure becomes a target, it ceases to be a good measure.
  • 15:30 Sorry, for those of you who don't know Ted Lasso, it's one of my favorite TV shows, and I wanted to use it because I think you can understand this example, since it's really just about soccer, and it gives a very abstract illustration of the limitations of accuracy. Consider the goals of a soccer team playing in the Premier League: their long-term goal is really to, you know, keep winning and have healthy players that are getting along well with each other. Now consider one player, the star player, who's really just thinking about their own abilities and their own performance; that's Jamie Tartt from Ted Lasso. It might be good in the short term, like they're going to win the game, but every other player on the team is going to hate them, and it's going to contribute to a bit of a toxic culture. Versus the team captain, Roy Kent, who's not just considering their own needs and is a bit more broadly focused: they're really better for the overall long-term goal of sustainability.
  • 16:23 And lastly, a few more concrete examples of why accuracy hasn't been working for us. We've obviously been chasing high accuracy, but having high accuracy does not imply that we have reproducibility, or that our models are meaningful, meaning that the features we're using are better than random. It doesn't come with any explainability. And having equal accuracy does not imply that two models learned in the same way.
  • 16:52 OK. So, away from accuracy and towards utility, which I think is where we need to be. Before I talk about measuring utility, it's really important to say that before you can measure it, you really have to look at yourself and ask: what are our core values? These are the values that I mentioned earlier, and we need to not only identify what the values are, but rank them in priority: what is most important to you? I also just want to point out that this is not going to be something that's consistent across all research questions and all models. It's very context-dependent, and even within your own work it might change depending on the day, week, or paper. So it's a very iterative process.
  • 17:34 For the purposes of measuring utility, we haven't really been doing this very much in our own field, so I looked to a couple of other fields that have been doing this work more. The first field that I found was actually the ethical AI field, and I found this really wonderful paper that started to assess what the values of the field were. It won best paper at the leading ethical AI conference, and it's called "The Values Encoded in Machine Learning Research." This graph is a little bit hard to see, but the values at the front of the chart are things like accuracy, even though this evaluation was done in connection with this ethical AI conference, and things like user rights and ethical principles, which are in red and purple, are much lower on the chart. So this doesn't really solve the problem, but it asks: what are our values, and are they represented in our work? That's the place where you have to start.
  • 18:29 At the same conference, another group took it a step further and said we need to go a little bit beyond just accepting what the existing values in the literature are; we need to design a framework for how we can assess them consistently going forward. So they introduced a paper called "Towards a Multi-Stakeholder Value-Based Assessment Framework for Algorithmic Systems," and I just want to briefly walk you through it because it has really inspired me. You first have to start by laying out what your values are, and they decided to put these on a wheel, because some of your values are going to conflict with each other. Now, all of these values might not work for our field, but you could think about updating them for our field. Here the primary value that they cared about was privacy, and privacy is maybe in conflict with transparency; for values that sit on opposite sides of the wheel, you can't prioritize both at the same time. So the first step is identifying your top value.
  • 19:27 You then go on to take that value and identify the criteria of that value and how it manifests. In the example of privacy, some of the criteria might be data protection and the right to erase the data, and some of the more specific manifestations might be things like a purpose statement for data collection or a statement of how long the data is kept. So clearly this is a messier process than one simple formula for accuracy, but it's messy and necessary.
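One way to picture that value, criteria, manifestations hierarchy is as a small ranked data structure. This is only a hedged toy sketch reusing the privacy example above; it is not an artifact from the paper, and the entries are illustrative.

```python
# Toy representation of a ranked value hierarchy, from a top value down to
# concrete, checkable manifestations that a project could audit itself on.
value_framework = {
    "privacy": {                       # top-ranked value for this project
        "rank": 1,
        "criteria": {
            "data protection": [
                "purpose statement for data collection",
                "statement of how long the data are kept",
            ],
            "right to erasure": [
                "documented procedure for removing a participant's data",
            ],
        },
    },
    "transparency": {"rank": 2, "criteria": {}},   # may conflict with privacy
}

# Walk the checklist at the objective-definition stage, before any loss
# function is ever chosen.
for value, spec in sorted(value_framework.items(), key=lambda kv: kv[1]["rank"]):
    print(f"{value} (rank {spec['rank']}):", list(spec["criteria"]))
```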
  • 19:59 After we've measured it and know what we're optimizing for, we can move on to actually optimizing it. I'm going to provide some concrete examples, again mostly from the ethical AI and machine learning fields, but I'll try to also bring it back to our field. In this first example, fairness was the priority value. This is from a paper that was published, I think, about two years ago in Nature Medicine. They took knee MRIs and looked to predict the patient's pain score instead of the radiologist's diagnosis. This is still a supervised learning problem: they're predicting a score from an image. But the score that most people predict is the radiologist's diagnosis, which is the standard measure of pain severity, and in the same dataset they also had the patients' self-reported pain.
  • 20:47 What did this simple switching of the target lead to? They discovered that, relative to the radiologist's diagnosis, which only accounted for 9% of the racial disparities in pain, using the self-reported pain accounted for 43%, or almost five times more of the racial disparities in pain. So it's still a supervised learning problem, and they're actually still optimizing something like mean squared error, but just by switching the target in a creative way, they were able to better align with what their value was, in this case fairness.
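A hedged sketch of what "switch the target" looks like in code: the image-derived features and the model stay identical, and only the column used as the prediction target changes. The dataframe, feature names, and numbers below are made-up placeholders, not data from that study.

```python
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Hypothetical table: one row per knee MRI, with image-derived features
# plus two candidate targets reflecting two different value choices.
df = pd.DataFrame({
    "feat_1": [0.1, 0.4, 0.3, 0.8, 0.2, 0.6],
    "feat_2": [1.2, 0.9, 1.1, 0.4, 1.0, 0.7],
    "radiologist_grade": [1, 2, 2, 4, 1, 3],    # external severity grading
    "self_reported_pain": [3, 6, 2, 9, 4, 7],   # the patient's own report
})
X = df[["feat_1", "feat_2"]]

# Same features, same model, same loss; only the target changes.
for target in ["radiologist_grade", "self_reported_pain"]:
    score = cross_val_score(Ridge(), X, df[target], cv=3).mean()
    print(target, round(score, 2))
```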
  • 21:19 This is another example where fairness is the value; I'll go through it quickly. They looked at equal opportunity and multi-objective optimization. This is from the same FAccT conference on ethical AI, and it was again a supervised learning framework. They didn't want to simply pay a trade-off of fairness against accuracy; they realized that both were really important. So they set up a really nice joint optimization problem where they could find the optimal solutions that represented both values. It's a bit complicated, and I have a lot more examples about this if you want to talk more later.
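A hedged, toy sketch of the general idea of jointly optimizing for accuracy and a fairness term. This is a generic scalarized objective (mean squared error plus a weighted group-gap penalty), not the specific equal-opportunity method from that paper; the data, groups, and penalty are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 5
X = rng.normal(size=(n, d))
group = rng.integers(0, 2, size=n)                  # a protected attribute
y = X @ rng.normal(size=d) + 0.5 * group + rng.normal(size=n)

lam = 1.0          # weight trading off accuracy against the fairness term
w = np.zeros(d)
lr = 0.01

for _ in range(2000):
    pred = X @ w
    err = pred - y
    # Accuracy term: mean squared error.
    grad_mse = 2 * X.T @ err / n
    # Fairness term: squared gap between the groups' mean predictions.
    gap = pred[group == 0].mean() - pred[group == 1].mean()
    grad_gap = 2 * gap * (X[group == 0].mean(axis=0) - X[group == 1].mean(axis=0))
    # Joint objective: MSE + lam * gap**2.
    w -= lr * (grad_mse + lam * grad_gap)
```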
  • 21:53 The last example I'm going to talk about is when, rather than fairness, things like efficiency and usability might be our priority value. Here I've been inspired by the human-computer interaction literature specifically. There's been a lot of work on learning to optimize for teamwork, recognizing (and this goes back to the medical example) that it's never going to be just AI alone; there's always going to be a person there that the model is communicating with. How can we optimize for that relationship? How can we not only get people, like doctors, to use the model, but also introduce the information to them at the right time? There were a couple of papers from Microsoft that tackled this problem deeply and came up with some really nice solutions.
  • 22:38 And just to share one example from the brain-behavior predictive modeling field: this comes from my own work, where I've focused not just on sharing code, which is a nice step, but on how to actually make these things accessible to people without the same computational skills or resources. My first two papers focused on sharing pre-trained normative models. I then took it a step further and wrote a protocol for how to do it, and went even a step further by designing a website where people can just upload a CSV and click and drop things. So again, all of these examples are creative; they're not as straightforward as solutions for accuracy, but they really get us closer to what we truly care about.
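A hedged sketch of the usability idea in that last example: someone with nothing but a CSV of their own data applies a shared, pre-trained model without retraining anything. The file names, the pickled model, and the covariate columns are hypothetical placeholders for illustration, not the actual normative-modeling toolkit interface or website.

```python
import pickle
import pandas as pd

# Hypothetical artifacts shipped by the model authors:
#   pretrained_model.pkl - a fitted, scikit-learn-style estimator
#   my_site_data.csv     - the user's own participants
with open("pretrained_model.pkl", "rb") as f:
    model = pickle.load(f)

df = pd.read_csv("my_site_data.csv")
covariates = df[["age", "sex", "site"]]        # placeholder covariate columns

# No training loop, no GPUs: just apply the shared model to new data.
df["prediction"] = model.predict(covariates)
df.to_csv("my_site_data_with_predictions.csv", index=False)
```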
  • 23:23 Hopefully some of the benefits have come across already, but just to say it concretely: I think if we think more about optimizing for utility, the work becomes more collaborative and efficient, and there's a more well-defined purpose. It's functional and has real-world meaning, rather than the attractive, shallow, surface-level appeal that intelligence has. It's really an opportunity to think deeply and align our models with what we really care about. Creative thinking and problem solving are required; it's going to be more of a challenge, but that also means, I think, that our solutions will be much more satisfying.
  • 23:59 OK, there are obviously some roadblocks involved in this process. We have a lot of cognitive biases that make us focus on much simpler problems as problem complexity increases, and that tend to let us shift our responsibility and think along the lines of "this is somebody else's problem." Especially in academia, we might think this is more of an industry problem. But actually, especially in mental health, there are not that many people in industry doing this; it's mostly within academia. So I think it really is our responsibility. Obviously, making the utility explicit is more difficult, but we're scientists, we love a challenge, so I challenge us.
  • 24:37 The next roadblock is the gap between where things happen in development and where they're really deployed and become useful. During development we often have a single decision maker; this is often the machine learning engineer or the PhD student training the model. We're often in a situation where the data is stationary, because we're working with secondary data analysis, so it's fixed: it is the data, and we can't necessarily change it. But in the real world there are going to be many people involved; as Simon mentioned, maybe a medical doctor and a patient. All of these people will have different values and different value priorities. So there's no way around it: it's going to be complicated, and it requires a lot of iteration, open discussion, valuing diverse opinions, and mapping them out. Also, our data in the real world is often incomplete and always evolving.
  • 25:26Also our data of course in the real
  • 25:29world doesn't often incomplete
  • 25:31and SCN is always evolving.
  • 25:33And again another thing,
  • 25:35I've kind of brought this up
  • 25:37when I mentioned the multi
  • 25:38objective optimization.
  • 25:39Sometimes we think about our values
  • 25:41as in conflict with another.
  • 25:44Like if I focus on fairness,
  • 25:45I'll pay off the price and accuracy.
  • 25:47And then if I go to having, you know,
  • 25:50more than just two values that I care about,
  • 25:53it becomes infinitely harder and you
  • 25:55can't optimize for everything at once.
  • 25:56And I think this can be fixed with just
  • 25:58really sitting down and for each problem
  • 26:00we're trying to solve focusing on.
  • 26:02What is most important to us?
  • 26:03And again, this is not always
  • 26:06going to be the same value.
  • 26:08 And then a few future directions in this work. I've been really inspired by a lot of other fields that have been thinking about utility, defining it, and optimizing for it already, and we really need to learn from them. I showed you two examples from machine learning, the ethical AI papers and the human-computer interaction papers. There's also behavioral economics, which has really tackled mapping utility functions and modeling them mathematically, so I think we can look to them for inspiration. And then value-based healthcare has really tackled how to measure very complex outcomes, so we can also be inspired by them.
  • 26:45 And again, everything that we're talking about with utility and our value priorities is going to depend on the context, and we really need to keep open communication and guidelines about making these decisions. I think the field has been really great about, you know, setting up standards for reporting methods and recognizing these things, but we haven't really worked on this post-analysis machine learning space. So I think we're ready for it.
  • 27:10 And just some take-home messages. We've really had too much of a tunnel-vision focus on the accuracy of our predictive models, which has made us lose track of why we're doing this and has created this lack of model utility. I think it needs to be a priority to define our values, which is going to build a better plan for moving towards these goals and values. And optimizing for utility is really an abstract, creative process that requires diverse perspectives and input, and it's going to be a very iterative, ongoing process. And with that, I just want to acknowledge my team and everyone that's been supportive, also the team I work with in Ann Arbor, Michigan, and my dog Charlie Mott, who has really accompanied me on all of these thought experiments.
  • 27:59 Any questions for Saige?
  • 28:03 [Audience question, partly inaudible] I quite like the message. Maybe I missed something, but where does explainability fit in? Is it a separate component of what you're proposing?
  • 28:24 Yeah, so, lots to think about with this, as I mentioned, very philosophical talk. The question was: I'm saying we need to optimize for utility, but then where does explainability fit in? I see utility as a manifestation of what our values are, and that's context dependent. Explainability might be that value, or the value might actually be utility or usefulness; I'm sort of using utility as an umbrella for our values. So explainability falls into it, but it might not always be explainability.
  • 29:05 [Audience follow-up, partly inaudible] For example, a lot of times these models can predict quite well but don't really provide insight; the question is whether we actually learned something about the system from the particular model, what works and why.
  • 29:27 Yeah, I think explainability is really important, and accuracy could also fall within utility: having an accurate model is really useful, and if it's always worse than chance, it's not that useful a model. So I'm not saying that accuracy is wrong; it's just that it's the only thing we've been focusing on, and there are a lot of other values that we have that I think we need to consider, and maybe they need to come first. Explainability, especially within neuroscience, might be one that takes priority.
  • 29:58 [Audience comment] I'll comment on that: there's a lot of that in medicine now. They use drugs off-label all the time, and they don't really know why they work, but they work, and the key thing is that they work, right? And then, if you put this in radiological terms, depending on the application, whether you have a screening application or a diagnostic application, you'll accept different rates of false positives or false negatives. So I think it is essential to keep in mind what the task is and what the goals are, and there are levels of explainability, right, you know?
  • 30:36 [Exchange between audience members, partly inaudible] ...probably the mechanistic insight... it's very important... Yeah, there are countless examples of that not being true. Well, I mean, do you think that, you know, some white matter tract can alleviate depression? Maybe it works some of the time; so, hey, it works, and if it doesn't work, we'll try something else. They can put in a vagal nerve stimulator and stop epilepsy; do they know how that works? No, but it works. ...which leads to a much better... Yeah, sure, that's why we do the mechanistic work. Anyway, I don't want to take over Saige's time.
  • 31:29 I just want to say also, I think fairness has become a value. There's been a lot of work there; you know, Todd's Nature paper is a great example, and Simon brought up a couple of others. But is it, you know, explainability? It doesn't hold across a lot of different subgroups and things, and we're not even measuring a lot of these constructs well. So yeah, I don't have answers; I only have examples, not answers. But my purpose is really to say we need to be talking and thinking about this before moving forward. Simon? Oh, and yeah, I'll repeat Simon's question so that we have it.
  • 32:03 Are we actually at a stage where we can make compromises between accuracy and utility, if our accuracy for most clinical diagnoses is less than 80%, or for behavioral predictions a correlation like r = 0.28? I mean, if we give up anything from already close to nothing, we are left with nothing.
  • 32:26 So the crux of the question is: should we not focus on accuracy if we're already really bad at accuracy? Again, I sort of started thinking about this not only in the neuroscience, brain-behavior predictive modeling field; I was really thinking about it for machine learning as a whole and our hyper-focus on becoming superintelligent and reaching a general intelligence. So I was thinking a bit more broadly than that, and that's really where this is coming from. But again, I think accuracy can still be a value; it's just not our only value.
  • 32:57 Well, as Breakspear pointed out, the bar is pretty low. So we can predict pain quite well, or, I mean, maybe, arguably.
  • 33:10 [Audience question, partly inaudible] Thank you. ... Would you recommend altering how you do it on the input side, like, you know, the kind of features you use, how you segregate groups, or whatever, or do you advocate doing something on the implementation side, changing the algorithm? What do you think is the...
  • 33:38 I think there's not one answer to that. Going back to that machine learning lifecycle, it really has to start with defining the objective: are we predicting the right thing, with the right features, with the right outcome? You know, in that radiologist example, they were still using the same input of knee MRIs, and all they did was change the target: instead of predicting the radiologist's diagnosis, they predicted self-reported pain. But that's not something I can give one answer to, always. It's always going to depend on what's available and what you are prioritizing. It just really requires careful thought about, you know, what you are trying to change and what you value.