
Pathology Grand Rounds, October 24, 2024

October 28, 2024

Pathology Grand Rounds from October 24, 2024, featuring Faisal Mahmood, PhD


Transcript

  • 00:00 Good afternoon, everyone. It's a great pleasure today to host Dr. Faisal Mahmood for the Department of Pathology Grand Rounds.
  • 00:13 He needs no introduction for most of us who work in anatomic pathology and who have watched all the high-impact and transformational work that he and his group have been engaged in over the last few years.
  • 00:26 Dr. Mahmood is an engineer-scientist. After obtaining a bachelor's in electronic engineering from the GIK Institute of Engineering Sciences and Technology in Pakistan, he went on to get a PhD in biomedical engineering from the Okinawa Institute of Science and Technology. He then did his postdoctoral work in biomedical engineering at Johns Hopkins, where he started using his considerable expertise to address problems associated with using deep learning for medical imaging, particularly for endoscopy and pathology applications.
  • 01:02 He was then recruited to the Department of Pathology at Harvard Medical School as an assistant professor in 2019, where he has risen quickly to the rank of associate professor, with appointments at Mass General, the Brigham, DFCI, and the Broad Institute. At Harvard, he now helms a large laboratory that is extensively funded by multiple agencies, including the NIH.
  • 01:28 The main focus of his lab is developing AI tools for pathology image analysis. Dr. Mahmood and his group have published extensively in this area, with multiple high-impact publications in top-tier journals. By an informal count, in this past calendar year alone he has one Nature, one Cell, and three Nature Medicine papers, and that's not mentioning some of the others.
  • 01:54 It is not surprising, then, that he has been the recipient of multiple awards, including the Outstanding Investigator Award from the NIH and NIGMS in 2020.
  • 02:06 He is on the editorial boards of journals including the AJP, he is the holder of several patents, and he has been a mentor to many graduate students, trainees, and postdocs. At dinner last night he told us that the accomplishment he is most proud of is that every postdoc he has trained has gone on to set up an independent lab: a testimony, I think, to his dedication to science and mentorship.
  • 02:32 We're very glad that he took some time from his busy schedule to make the trip down from Boston to visit with us today and share some of the work that he and his group have been engaged in. The title of the talk is Multimodal, Generative, and Agentic AI for Pathology. Dr. Mahmood.
  • 02:54 Okay. Thank you so much for the introduction and for inviting me to speak today. I'll be talking a little bit about some of the work that my group has been doing over the past six years or so in computational pathology.
  • 03:10 I'll talk about some of the older work we did and how it led into some of the more recent work that has happened over the past two years. Here is a quick outline for the talk: I'll talk a little bit about weakly supervised learning for pathology, multimodal data integration and foundation models, generative AI, transitioning from 2D to 3D pathology, and bias and fairness in machine learning algorithms for pathology.
  • 03:36 A quick note about the problem formulation. What are we trying to do? We're essentially trying to make sense of lots and lots of pathology data, images that look like this. I think everyone here is familiar with pathology images, but they are more like satellite images: they're hierarchical, they hold information at multiple levels, and they're very different from the conventional computer vision images that are used to develop machine learning algorithms. The hope we have is to go from large cohorts of these images to everything in the red box here: early diagnosis, prognosis, prediction of response to treatment, integrated biomarker discovery, and so forth.
  • 04:19 What we are essentially trying to do is train models using these images and the labels that reside in pathology reports, because those are the cheapest labels available; we don't have lots and lots of pixel-level annotations for these images.
  • 04:33 Some of the earlier work we did was around weakly supervised learning using whole-slide images and slide-level labels, trying to make a conventional multiple instance learning setup a little more data efficient. We used attention-based multiple instance learning and pretrained encoders to see if we could improve data efficiency, and this pipeline has been used extensively. It has been used in over six hundred studies, with every major organ system, all the way up to forensics, so it's very exciting to see how people are beginning to use it.
  • 05:09 A few applications we targeted when we first developed this: the first one was cancers of unknown primary, where we tried to show that conventional H&E images could be used to predict what the origin of the tumor may be, or give indications for what the top-three or top-five predictions would be, and the origin could then be confirmed with additional ancillary tests. That's what led to this study, and we did a lot of analysis on internal cohorts and external cohorts. This really confirmed that it was possible to predict origins directly from histology.
  • 05:46 Before this work was done, there were a number of different studies showing that you could predict origins from lots of molecular data, and studies showing that you could predict molecular alterations directly from histology. So if the two statements are true, it should be possible to predict origins directly from metastatic images, and that's what we showed in this study.
  • 06:07 More recently, what we have done is expand this into a multimodal integration-based analysis, where we're trying to integrate histology images with molecular data to see if we can improve origin prediction, and we showed that that was possible.
  • 06:24 And just speaking of multimodal, we're also interested in combining histology images with molecular data to see if we can combine the two to improve outcome prediction. It's been shown extensively that you can use histology whole-slide images to separate patients into distinct risk groups using weakly supervised algorithms, and you can do the same with molecular data, whether it's NGS or a combination of NGS and transcriptomic data. In this study, we wanted to see if you could do this exhaustively in a pan-cancer manner, for lots of different cancer types. We showed that you can separate patients into distinct risk groups and then improve patient stratification by integrating additional modalities. But you can also go back and look at what was important in the molecular profile and what was important in the histology, further quantify what the model essentially pays attention to, and use interpretability as a discovery mechanism. There are lots of interesting findings in this study.
  • 07:26 We also did some work on endomyocardial biopsy assessment, trying to see if we can improve standardization for cardiac biopsies after heart transplants. After patients get a heart transplant, they often get repeated endomyocardial biopsies to see if the donor heart is being rejected by the recipient, and it's a problem with large-scale inter- and intra-observer variability. In this study we focused on assessing whether human-in-the-loop AI can improve standardization. I'll refer you to the article for a more in-depth analysis of this.
  • 08:07 But the idea of showing all of these examples is that most studies in computational pathology follow some form of this framework: you have pathology slides that are digitized into whole-slide images, the whole-slide images get processed by segmentation and patching, and then you have some form of feature extraction, feature aggregation, and prediction. A majority of computational pathology studies have followed this pipeline. On the technical end, most improvements are made either in feature extraction or in feature aggregation. Over the past two years, we realized that feature extraction is far more important than feature aggregation: if you have richer features, you can get to a better prediction.
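To make the aggregation step concrete, here is a minimal gated attention-based multiple instance learning sketch in PyTorch, assuming patch features have already been extracted; the dimensions, gating design, and class count are illustrative placeholders rather than any specific released implementation.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Gated-attention MIL aggregator: patch features in, slide-level prediction out."""
    def __init__(self, in_dim=1024, hidden_dim=256, n_classes=2):
        super().__init__()
        self.attn_v = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())
        self.attn_u = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Sigmoid())
        self.attn_w = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(in_dim, n_classes)

    def forward(self, patch_feats):                 # patch_feats: (num_patches, in_dim)
        a = self.attn_w(self.attn_v(patch_feats) * self.attn_u(patch_feats))  # (N, 1)
        a = torch.softmax(a, dim=0)                 # attention weights over patches
        slide_feat = (a * patch_feats).sum(0)       # weighted sum -> one slide vector
        return self.classifier(slide_feat), a       # logits + per-patch attention (heatmaps)

feats = torch.randn(5000, 1024)                     # placeholder embeddings for one slide
logits, attention = AttentionMIL()(feats)
```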
  • 08:50 Feature extraction is sort of what has driven the foundation model evolution in pathology. Foundation models are sort of a buzzword, or hype, but all they are is large self-supervised models. They extract feature representations from these images, and if you have richer feature representations, you hope you can use fewer data points for downstream training, and these models can be applicable to rare diseases, clinical trials, or situations where you just have fewer data points available. Foundation models are not really meant to replace task-specific models; they aid with the development of better task-specific models.
  • 09:29 The two foundation models from the group that came out in March this year are called UNI and CONCH: an image-based model that uses lots of pathology images in a self-supervised manner, and an image-text model that contrasts images with text to improve, essentially, the image-based feature representation.
  • 09:48 For the image-based model, we used a hundred thousand whole-slide images and a hundred million patches from those hundred thousand whole-slide images. This is the distribution of where all the data came from. We tried to maximize diversity, deliberately collecting data that maximized the diversity we had in the overall dataset.
  • 10:10 And we did not use any public data in developing this model. That was on purpose, because we wanted to use lots of public data as independent evaluation cohorts. It was trained using the DINOv2 framework, and it was compared against a number of other foundation models and a number of other standard models that were commonly used, including a ResNet-50 trained on ImageNet, which was the most commonly used model until very recently.
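As a minimal sketch of the patch-level feature extraction step, the snippet below uses a generic timm vision transformer as a stand-in for a pathology-pretrained encoder such as UNI; the model name and feature dimension are placeholders, and in practice the actual pretrained weights would be loaded rather than an untrained backbone.

```python
import torch
import timm
from PIL import Image
from timm.data import resolve_data_config, create_transform

# Generic ViT backbone standing in for a pathology-pretrained encoder (weights not loaded here).
encoder = timm.create_model("vit_large_patch16_224", pretrained=False, num_classes=0)
encoder.eval()
transform = create_transform(**resolve_data_config({}, model=encoder))

@torch.no_grad()
def embed_patches(patch_images):
    """Map a list of H&E patch images (numpy RGB arrays) to one embedding per patch."""
    batch = torch.stack([transform(Image.fromarray(p).convert("RGB")) for p in patch_images])
    return encoder(batch)   # (num_patches, 1024) pooled features when num_classes=0
```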
  • 10:38 We applied this to thirty-three downstream tasks, including ROI-level classification, segmentation, and retrieval, at the level of the whole slide and at the level of regions of interest. This radar plot is a really bad way to present all of these datasets together, but it does let you see all the different tasks we applied this to in one go. The bar plots are a much better way to statistically observe the improvement in performance, and we showed that there was about a six to eight percent improvement in performance over some of the other models, and a substantial improvement over a ResNet-50-based model. Perhaps the more clinically useful aspect of this is around few-shot learning, where you can use very few data points to train models that can have downstream clinical applicability.
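One common way to realize this kind of few-shot use is a linear probe on frozen foundation-model features; the sketch below is illustrative only, with random placeholder embeddings and an arbitrary shot count standing in for real labeled examples.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def few_shot_probe(train_feats, train_labels, test_feats):
    """Fit a simple linear probe on a handful of labeled, frozen embeddings."""
    clf = LogisticRegression(max_iter=1000, C=1.0)
    clf.fit(train_feats, train_labels)               # e.g., 4 or 8 labeled examples per class
    return clf.predict_proba(test_feats)

rng = np.random.default_rng(0)
support = rng.normal(size=(8, 1024))                 # 8 labeled ROI embeddings (placeholders)
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])          # two classes, four shots each
query = rng.normal(size=(100, 1024))                 # unlabeled embeddings to classify
probs = few_shot_probe(support, labels, query)
```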
  • 11:28 In parallel, we also worked on an image-text model, trying to see if we can contrast with text to improve the image-based feature representations, and we showed that you could use this model in a variety of different ways. This is just a distribution of all the data that was used for it. Most of the data came from the PubMed open access database, where we used human pathology images and corresponding captions, often linking to where those images are referred to within the article in order to expand on those text captions.
  • 11:57 The richness of these features was assessed in a number of different ways, including with zero-shot classification, where the goal is not to have clinically useful models but to see just how rich the raw features are. The features could then be used in a supervised setting, within few-shot learning or other kinds of tasks. In this example, we're showing non-small cell lung cancer subtyping with very, very few samples: how well the model can perform using just four or eight samples for training, and whether we can get to clinically useful performance with fewer samples simply by virtue of the fact that the features are much richer at this point.
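For the zero-shot evaluation just mentioned, the usual CLIP-style recipe is to compare normalized image embeddings against embeddings of textual class prompts; the sketch below uses random tensors as stand-ins for the encoders' outputs, and the prompt wording is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_classify(image_feats, class_text_feats):
    """Cosine similarity between image and class-prompt embeddings -> class probabilities."""
    img = F.normalize(image_feats, dim=-1)            # (num_images, dim)
    txt = F.normalize(class_text_feats, dim=-1)       # (num_classes, dim)
    return (img @ txt.T).softmax(dim=-1)

class_prompts = ["an H&E image of lung adenocarcinoma",
                 "an H&E image of lung squamous cell carcinoma"]
image_feats = torch.randn(10, 512)                    # stand-ins for encoded ROIs
text_feats = torch.randn(len(class_prompts), 512)     # stand-ins for encoded prompts
probs = zero_shot_classify(image_feats, text_feats)   # (10, 2)
```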
  • 12:33 We made both of these models publicly available, and we've had around four hundred thousand downloads over Hugging Face. The models have been used around the world for a variety of different tasks, and they continue to be used for a number of different reasons. There are a few independent analyses. This analysis is from Jakob Kather's group in Dresden, where they showed that both UNI and CONCH performed quite well on a number of these tasks, both on publicly available data and on some of the internal data that they had. We believe that the diversity of data that was used for training, which included infectious, inflammatory, and neoplastic cases, is what's leading to the improved performance. There's some additional analysis from Mount Sinai, where they showed that the model did quite well on a number of different tasks; I'll refer you to these studies for a more in-depth analysis.
  • 13:31 Some analysis that I like to show is around how fast feature extraction can be with some of these models, and what the storage cost would be, because that's how we can practically use this. There are two possible ways we can use these features. We can use these features to train new models: for every slide that's scanned, if you extract features and keep them, they can be used to train new models. But they can also be used for model inference. So if you extract these rich feature representations from the whole-slide image and then store them, you can use them both for model development and for model inference.
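A minimal sketch of that extract-once, reuse-everywhere pattern is below: per-slide patch embeddings and their coordinates are cached to an HDF5 file so they can later feed both training and inference. The file layout, dtypes, and function names are assumptions for illustration.

```python
import h5py
import numpy as np

def save_slide_features(path, slide_id, feats, coords):
    """Cache one slide's patch embeddings and patch coordinates under its own group."""
    with h5py.File(path, "a") as f:
        grp = f.require_group(slide_id)
        grp.create_dataset("features", data=feats.astype(np.float16))  # halve storage cost
        grp.create_dataset("coords", data=coords.astype(np.int32))

def load_slide_features(path, slide_id):
    with h5py.File(path, "r") as f:
        return f[slide_id]["features"][:].astype(np.float32), f[slide_id]["coords"][:]

save_slide_features("features.h5", "slide_001",
                    np.random.rand(5000, 1024), np.random.randint(0, 40000, (5000, 2)))
```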
  • 14:11 And both UNI and CONCH do quite well in that regard. We already have a version of UNI 2 and CONCH 2 that we intend to make public in the coming days.
  • 14:22 In parallel, we have been working on a slide-level foundation model, where the goal is to extract a single feature vector corresponding to a whole-slide image. It could be used for a number of different tasks, including retrieval at the level of the whole slide and very simple classification problems. It could turn a lot of these complicated classification pipelines into a very simple classification pipeline if you have a single feature vector corresponding to the whole-slide image. We're using the whole-slide images for this, but we're also contrasting with text; in this case, we're contrasting with text that came from a generative AI model that we developed and with the pathology report.
  • 15:09 We've shown that this form of contrasting leads to substantial improvements over some of the other models for morphologic subtyping, IHC quantification, and biomarker prediction of all sorts, and, more importantly, for few-shot classification, which is where the real clinical utility lies, where you can use very, very few images for training some of these models.
  • 15:36 We've tested this on a number of different difficult cases, including classifying all the brain tumors in the EBRAINS dataset, as well as on a treatment response prediction task. We'll hopefully be putting this preprint out very soon. Because it was contrasted with text, we could also look at how well it does in terms of zero-shot classification and at generating the report directly from the pathology image, or a multitude of pathology images. We tested this on basically the entire TCGA, on an internal dataset that we call OT-108, which is a group of a hundred and eight difficult diagnoses, as well as on EBRAINS and other datasets. We plan to release this model in the coming weeks, so for people who are interested in this, stay tuned.
  • 16:31 So the story so far is that we've shown you can extract rich feature representations from pathology images, and there are a multitude of self-supervised models around that, and we can contrast with text and improve the feature representation. But there are other modalities that we have around these cases that we can contrast with to further improve the feature representation. In this particular case, we're contrasting with IHCs. This is an article that was just presented at ECCV, where we contrast pathology images with the IHCs and show that you can improve the representation and improve IHC quantification without requiring any pixel-level annotation, as well as other tasks like survival prediction and so forth. So if you can contrast with text and you can contrast with immunohistochemistry, you can also contrast with transcriptomics.
  • 17:22 That's what we showed in this particular study, published at CVPR earlier this year. This work was limited to the TCGA, and we showed that contrasting H&E images with the corresponding bulk transcriptomic profile can improve few-shot classification on lung cancers and breast cancers, as well as for toxicology.
  • 17:44 More recently, we've expanded this to use all of the molecular data that was available at the Brigham and MGH: combining all the transcriptomic data that we have at MGH and all the NGS data that we have at the Brigham and Dana-Farber, and contrasting with that to see if we can improve the feature representation. We then applied this to a number of different downstream tasks, from mutation prediction from H&E images to molecular subtyping and, more importantly, treatment response prediction. If you look at some of the results here for treatment response prediction, there's a substantial improvement over image-based models and image-text models, because transcriptomics represents a form of contrasting that's much richer in comparison to text. And we hope that combining transcriptomics, text, and images leads to an even richer feature representation.
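The contrastive objective behind this kind of image-expression alignment is typically a symmetric InfoNCE loss; a minimal sketch with placeholder embeddings is shown below, as a generic CLIP-style formulation rather than the exact recipe used in these studies.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, expr_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched (H&E embedding, expression embedding) pairs."""
    img = F.normalize(img_emb, dim=-1)                 # (batch, dim)
    expr = F.normalize(expr_emb, dim=-1)               # (batch, dim)
    logits = img @ expr.T / temperature                # pairwise similarities
    targets = torch.arange(img.size(0))                # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

loss = contrastive_loss(torch.randn(32, 512), torch.randn(32, 512))
```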
  • 18:38 We'll make this preprint publicly available very soon. So the story so far is that we can build large self-supervised models based on histology; we can contrast them with text and get richer feature representations; we can contrast with IHC to get richer representations; and we can contrast with transcriptomics to continue to improve the representation of the image in terms of its features. But what can we do with these features? There are a number of different things that can be done. Of course, you can use them to improve the...
  • 19:16 Still sharing on Zoom? Someone's saying that we can all see the part. So we're having technical issues with Zoom. Why don't you continue on? Okay.
  • 19:26 So we've shown that you can get to richer feature representations using all of these different contrastive methods. But what can you do with these richer feature representations? You can build better supervised models, targeted supervised models. But in parallel, there's been all this development in multimodal large language models, where you can have a single model that harbors a lot of knowledge. And that's what sort of led to this study, because our hypothesis was that if OpenAI is trying to build a single model that harbors the world's knowledge, we should be able to build a single model that harbors all of human pathology knowledge. And what do we need to do to essentially get there?
  • 20:09 In our assessment, it's a rich self-supervised model that can extract rich feature representations from these images, image-text models that can enhance those feature representations based on the text, and a large instruction dataset of questions, images, and responses. And eventually, of course, you need robust evaluation.
  • 20:30 Our philosophy around this is that pathology images are hierarchical and harbor information at all of these different scales, but we don't have text information lying around for each and every one of these scales. The only information we have is, essentially, how an understanding of pathology regions at a cellular level leads to slide-level and patient-level descriptions. So we needed to get annotations at the specific regions within a pathology image that lead to a diagnosis.
  • 21:17 That's what we did to collect this very large instruction dataset. The instruction dataset was based on some training material that we had available at the Brigham and MGH, but also on lots of manual data curation, and then manual curation of the guardrails that we needed to build the chatbot. We got to about nine hundred and ninety-nine thousand question-and-answer turns that were used to train the chatbot.
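To illustrate roughly what one such visual instruction sample might look like, here is a hypothetical multi-turn example; the field names, file reference, and wording are assumptions for illustration, not the actual dataset schema.

```python
# Hypothetical shape of a single visual instruction-tuning sample:
# one image reference plus multi-turn question/answer text grounded in it.
example = {
    "image": "roi_000123.png",
    "turns": [
        {"role": "user", "text": "This ROI is from a liver biopsy. What process do you see?"},
        {"role": "assistant", "text": "There is a dense portal lymphocytic infiltrate with interface activity."},
        {"role": "user", "text": "What additional stains would you order?"},
        {"role": "assistant", "text": "A trichrome to assess fibrosis and a PAS-D to evaluate for inclusions."},
    ],
}
```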
  • 21:42 Eventually, we had a chatbot that started to work quite well, where we could go into a pathology image, ask questions about particular regions within the image, and it would give a response. In this example, the user is basically asking, what type of tumor do you see? And as you can see, limited context was given for this; the more context you give, the better the response is likely to be. You can continue to ask additional questions and eventually ask it to write a pathology report. I think a lot of people have already seen this demo, so I will skip through it. But once it has seen enough, it can write up a pathology report. What we hope is that this would happen in the background, where the whole-slide-level analysis is already done and the report is already generated by the time a pathologist is looking at it.
  • 22:30 We're also interested in using this for lower-resource settings, where we have just a cell phone coupled directly to a microscope, taking multiple images from the microscope and then asking the chatbot questions about what is essentially in those images. Again, the more context we give it, the better the chatbot is likely to do. In this case, the user is saying that these images are from a patient with a left breast mass: what do you see? And the model would come up with a response. You can continue to chat with the model and ask about, you know, potential treatment guidelines, or what next steps of the diagnostic process this case would essentially go through.
  • 23:16 This example is quite interesting because we're using a cell phone coupled to a microscope, and the user is essentially saying that these images are from a patient with a left breast mass: what do you see? The chatbot responds that this is likely a melanoma. Then the user can ask what ancillary tests could be used to confirm this particular case: what IHCs should I order to confirm this? And it can give suggestions for the IHCs that can be used to confirm it. Once those ancillary tests are in, you can image those slides again with your microscope and keep them within the same context. This is where the innovation is: you can continue to ask questions within the same context.
  • 24:20 So, essentially, what this means is that at a whole-slide level, if you have an H&E, the model can predict which ancillary tests need to be ordered, order those tests on its own, and then ingest those images in the same context and have a pathology report ready by the time a pathologist essentially starts looking at it. That's what we are working towards. But with the model, obviously, we needed to do a lot of evaluation. This was done with sort of a two-pronged strategy: a lot of quantitative evaluation with multiple-choice questions, looking at how well the model does in comparison to some of the other models, including GPT-4o, generically trained models, and models that are trained specifically on medical data.
  • 25:02 But then also by comparing it with pathologists, asking pathologists to rank how good the response is, or how well the model does in terms of the response in comparison to some of the other models. Overall, we see that this model that's specifically trained on lots of pathology data does quite well on diagnosis and microscopy-based questions, but does not do very well on clinical questions. The reason, largely, is that we did not train with the lots and lots of medical text that GPT-4o is largely trained on.
  • 25:35 So we've seen that you can build lots of supervised and self-supervised models for feature extraction, contrast with other modalities to get richer features, and then use those features in a generative AI setting, where you have a singular model that can harbor information ideally, or eventually, around all of human pathology. But what can you do with the generative AI and all these features? The next obvious step is that you can build an agent on top of it.
  • 26:03 So you have a bunch of what-if questions and what-if statements. What if agents could do all the biomedical data analysis for you? What if AI agents could develop, assess, and explain AI models for pathology? What if an AI agent could write code, run experiments, and test hypotheses? What if an AI agent could continuously run in the background, attempting to find common morphologic features across patient cohorts and correlate them with outcome? These are all questions we have, because we're living in sort of the age of AI agents, where these agents would do things for us. So that's what we essentially tried to do: we built the agent on top of the generative AI model that we had. It uses all the code that we had developed over the past five years, but it's capable of writing additional code to patch together multiple parts of the library that we had.
  • 26:59 I'll show a few example use cases. In this particular use case, the user is saying, can you help me train a model to classify between responders and non-responders? Here's where my data is stored, and this is the magnification it was scanned at. The generative AI aspect of this is that it can generate a plan, and the plan can then retrieve and use code that's already written, as well as write some additional code that might be needed to patch together the code that is already available. In computational biology, and largely in bioinformatics, we're often using large libraries of existing code to analyze new kinds of data, leading to discovery.
  • 27:40 There's some code writing, but it's often patching together code that's already been written, and that's what's essentially happening on its own here. So we use PathChat as a morphologic descriptor, and existing code is retrieved. The model says, well, here's an ROC curve; it used about a hundred cases to train the model, and the AUROC is 0.855. The next question can be, can I look at a heat map showing what the high-attention regions are and what the model is using to make these classification determinations? You can do that. And then we can invoke PathChat, or the generative AI model, to write a report about what the model used in making these classification determinations.
  • 28:24 What were the high-attention regions? What were the low-attention regions? And it can do that. It basically says that the model used inflammatory regions, necrosis, and fibrosis to determine which patient is a responder versus a non-responder. Then you might want to do some more fine-grained analysis, like segmenting all the cells, classifying them, and then running handcrafted feature analysis on top, and it would do that for you as well.
  • 28:52 So what's essentially being done is that the generative AI can parse your text command, pose it as a machine learning problem, then recall all the code that was written, potentially for other purposes but now streamlined, and write additional code where it needs to be written, to patch everything together.
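A very reduced sketch of that parse-plan-dispatch loop is below: a plan (which the language model would produce in the real system) is executed by dispatching each step to a named tool from an existing analysis library. The tool names, arguments, and plan format are illustrative assumptions only.

```python
from typing import Callable

# Stubbed "library" of existing analysis functions the agent can call.
TOOLS: dict[str, Callable] = {
    "extract_features": lambda slide_dir: f"features extracted from {slide_dir}",
    "train_mil_classifier": lambda labels_csv: "model trained, AUROC computed",
    "plot_attention_heatmap": lambda slide_id: f"heatmap rendered for {slide_id}",
}

def run_plan(plan):
    """Execute each planned step by dispatching to a registered tool."""
    results = []
    for step in plan:
        tool = TOOLS[step["tool"]]
        results.append(tool(**step["args"]))
    return results

plan = [  # in the real system this plan would be generated by the language model
    {"tool": "extract_features", "args": {"slide_dir": "/data/responder_cohort"}},
    {"tool": "train_mil_classifier", "args": {"labels_csv": "/data/labels.csv"}},
    {"tool": "plot_attention_heatmap", "args": {"slide_id": "case_017"}},
]
print(run_plan(plan))
```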
  • 29:11 Another example of this is around case retrieval. The user is saying, can you help me build a database of slides that I can use to query my large database of whole-slide images? It uses the whole-slide-level foundation model that we built to extract a single feature representation corresponding to each image, and it builds the entire database. Once the database is built, the user can give a single slide and say, can you find the most similar cases to this particular image? So retrieve the top three most similar cases to this particular image. These images are from, I believe, leiomyosarcoma. It retrieves the three most similar images; this is based on the TCGA. And then you can just introspect them.
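A minimal sketch of that kind of whole-slide retrieval with one embedding per slide is shown below, using cosine similarity over a small random placeholder database; the embedding dimension and database size are arbitrary stand-ins.

```python
import numpy as np

def build_index(slide_embeddings):                       # (num_slides, dim)
    """L2-normalize slide embeddings so a dot product gives cosine similarity."""
    norms = np.linalg.norm(slide_embeddings, axis=1, keepdims=True)
    return slide_embeddings / np.clip(norms, 1e-8, None)

def retrieve(index, query_embedding, k=3):
    """Return the indices and similarities of the k most similar slides."""
    q = query_embedding / max(np.linalg.norm(query_embedding), 1e-8)
    sims = index @ q                                     # cosine similarity to every slide
    return np.argsort(-sims)[:k], np.sort(sims)[::-1][:k]

index = build_index(np.random.randn(1000, 768))          # placeholder database of 1,000 slides
top_ids, top_sims = retrieve(index, np.random.randn(768), k=3)
```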
  • 30:05 And the last example is around multimodal data integration. The user can say, well, train a multimodal data integration model using something very basic, like concatenation. Basically, the way this works, or how our training data essentially works, is that the more specific you make your prompt, the more it will use all the information from your prompt. If that information is not there, it will make some assumptions, and if it can't make assumptions, for example about where your data is located or how your data is organized, it will ask additional questions.
  • 30:40 So in this particular case, it's training a model to separate low-grade glioma cases by integrating histology, molecular, and radiology data. It extracts features, integrates them, and comes up with this Kaplan-Meier curve showing how separable the patients are. The next question is, can you help me introspect the genomics: what molecular features is the model using in making these determinations? So it plots that, and you can continue to ask more questions: can I look at the heat map, or can I look at the radiology image and what was most important in the radiology image? And it would essentially give you a heat map of the radiology image. It can do the same for pathology.
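For the concatenation-based integration referenced here, a minimal late-fusion sketch is shown below, joining histology, molecular, and radiology embeddings and mapping them to a single risk score; the embedding dimensions and the small MLP head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Late fusion by concatenation: three per-patient embeddings in, one risk score out."""
    def __init__(self, dims=(768, 256, 512)):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(sum(dims), 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, histology_emb, molecular_emb, radiology_emb):
        fused = torch.cat([histology_emb, molecular_emb, radiology_emb], dim=-1)
        return self.head(fused)

risk = ConcatFusion()(torch.randn(4, 768), torch.randn(4, 256), torch.randn(4, 512))
```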
  • 31:35 I'll skip through the rest in the interest of time, but there's more information available about this, and we will hopefully put a preprint about the agent out soon. It's been in the works for about a year and a half, but it's been a challenge to do all the evaluations around it.
  • 31:55 The next thing I want to talk about is transitioning from 2D to 3D pathology. I think everyone here would acknowledge that we need to have some form of 3D pathology, because the tissue we look at is a very small sample of the actual three-dimensional tissue, and there have been a multitude of studies showing that if you look at multiple sections, or the entire volume, the diagnosis can change, and so forth. There are a number of different technologies available for this now: we have OTLS (open-top light-sheet microscopy) and micro-CT, as well as newer techniques like CODA, where you can take lots of sections and use machine learning to reconstruct the tissue.
  • 32:33 A key issue in the adoption of these technologies is how a pathologist would look at such a large volume; it would take a substantially large amount of time to look at each one of these volumes, and how would that impact overall care? So we wanted to see whether it's possible for us to use machine learning to accelerate this a little bit, at least finding regions within the volume that a pathologist can then look at. We studied this based on two cohorts. One of them was collected at Harvard and was scanned using a micro-CT scanner, and that's this cohort. And then there's another cohort that came from Jonathan Liu's group at the University of Washington. Basically, we developed a multiple instance learning based framework that was adapted for 3D pathology: much more compute intensive, using 3D patches, or voxels, instead of two-dimensional patches, with feature extraction in 3D and eventually feature aggregation in 3D. We did quite a lot of analysis around this, figuring out what the best setup would be when this large amount of data becomes available. This article was published just a couple of months ago.
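The 3D analogue of the patching step might look roughly like the sketch below, which tiles a scanned volume into fixed-size voxel blocks (with a crude tissue filter) for a 3D encoder to embed; the block size, threshold, and array layout are illustrative assumptions.

```python
import numpy as np

def extract_voxel_patches(volume, block=64, tissue_threshold=0.05):
    """Tile a 3D volume into non-overlapping voxel blocks, skipping mostly empty ones."""
    d, h, w = (s - s % block for s in volume.shape)      # trim to a multiple of the block size
    patches, coords = [], []
    for z in range(0, d, block):
        for y in range(0, h, block):
            for x in range(0, w, block):
                vox = volume[z:z+block, y:y+block, x:x+block]
                if (vox > 0).mean() > tissue_threshold:  # crude stand-in for tissue detection
                    patches.append(vox)
                    coords.append((z, y, x))
    return np.stack(patches), np.array(coords)

vol = np.random.rand(256, 256, 256).astype(np.float32)   # placeholder micro-CT volume
patches, coords = extract_voxel_patches(vol)
```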
  • 33:42 And we showed that as you use an increased volume of tissue, the model did better in terms of separating patients into distinct risk groups. And, of course, you can then go in and look at what's most important within this entire 3D volume. We're continuing to investigate this for other diseases, and we have a recent large national effort to use this for precision surgical interventions.
  • 34:22 The next thing I'll quickly touch upon is some of the new work we are doing in trying to do AI-driven 3D spatial transcriptomics. There's a question mark there because this is relatively very new work, and we're still not quite sure what we're going to find, but I wanted to share some of these early results here today.
  • 34:40 With the micro-CT scanning or the open-top light-sheet microscopy scanning, we have very nice volumes of tissue that we can look at, and there have been a multitude of studies showing that you can predict spatial transcriptomics from histology alone, at least to a degree. So we wanted to see if we can leverage that to predict ST in 3D.
  • 35:04 But, of course, while we can build these models based on lots of historical data, we also wanted to see if it's possible for us to do some in-patient fine-tuning. For a new block, we image it with CT, do some spatial transcriptomics on the top and bottom, and fine-tune the existing network, which has probably been trained on lots and lots of data, using tissue from the same patient together with ST from other patients. This in-patient fine-tuning could then potentially lead to better interpolation in three dimensions. It's just a hypothesis, and we wanted to see if we could test it.
  • 35:39 Here are some initial results. We used a number of different mechanisms to see what the best approach would be to predict spatial transcriptomics from H&E. We tried a number of different techniques, but contrasting between ST and H&E seems to be the best approach, so we built a predictor on top of that. Also, incorporating some depth information leads to improved performance as well.
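As a rough illustration of the prediction step, the sketch below maps a frozen H&E patch embedding plus a normalized depth coordinate to expression values for a fixed gene panel; the dimensions, depth encoding, gene count, and MSE objective are assumptions, not the actual model.

```python
import torch
import torch.nn as nn

class ExpressionHead(nn.Module):
    """Predict per-spot expression for a gene panel from an H&E embedding plus depth."""
    def __init__(self, feat_dim=1024, n_genes=250):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim + 1, 512), nn.ReLU(), nn.Linear(512, n_genes))

    def forward(self, patch_feats, depth):                # depth: normalized z-position in the block
        return self.mlp(torch.cat([patch_feats, depth.unsqueeze(-1)], dim=-1))

head = ExpressionHead()
pred = head(torch.randn(16, 1024), torch.rand(16))        # (16 spots, 250 genes)
loss = nn.functional.mse_loss(pred, torch.randn(16, 250))
```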
  • 36:02 But eventually, we're able to get to a point where we can interpolate this in three dimensions and also confirm that it's correct, to a degree. This example is on the prostate. We've expanded this to three different disease models, as well as to a lot of additional data that was available with our collaborators at the Broad and other places, and we're continuing to work in this direction.
  • 36:31 The last thing I want to touch upon is bias and fairness in computational pathology datasets. A lot of computational pathology data comes from large academic medical centers, and the data is not very diverse. We have done some analysis where we train on large cohorts of data that are commonly used in model development, like the TCGA, as well as internal data, and look at what happens when you adapt to data that is independent and stratified by race or other protected subgroups. That's what led to this study, also published earlier this year, where we wanted to investigate demographic shifts and misdiagnosis by computational pathology models.
  • 37:08 Overall, the idea is that when you train a model on a specific cohort and you transfer it to new kinds of data, it often does not adapt. This issue around domain adaptation is an age-old problem, but it's sort of exaggerated in healthcare. There are large differences, for example, in how individual scanners behave: this particular slide was scanned on a Hamamatsu scanner, an Aperio scanner, and a 3DHISTECH scanner, leading to a very different color gamut. The same is true in radiology as well. The datasets are also very consistent; they don't have a lot of diversity.
  • 37:47 The point we really wanted to investigate was whether we can go in and look at each and every possible modeling choice that people make when designing their computational pathology setup, or their training architecture, and see how that impacts the outcome and the classification results when stratified by some of these protected subgroups. So we varied all the different preprocessing techniques, the choice of foundation model or self-supervised model that was used, as well as some tricks that are commonly used to improve fairness, like adversarial regularization, and other fairness strategies. Overall, we found that the most important component is how rich your features are, and by using UNI features in this case, we were able to improve performance for a number of the different classification examples that were used in the study.
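One way to run the kind of subgroup-stratified evaluation described here is to compute a metric per protected subgroup and report the largest gap; the sketch below does this with AUROC on synthetic placeholder data, and the choice of metric and grouping is illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def stratified_auroc(y_true, y_score, groups):
    """AUROC per subgroup plus the max-min gap, a simple fairness summary."""
    scores = {}
    for g in np.unique(groups):
        mask = groups == g
        scores[g] = roc_auc_score(y_true[mask], y_score[mask])
    gap = max(scores.values()) - min(scores.values())
    return scores, gap

y = np.random.randint(0, 2, 500)
p = np.clip(y * 0.6 + np.random.rand(500) * 0.5, 0, 1)    # placeholder predictions
g = np.random.choice(["group_a", "group_b", "group_c"], 500)
per_group, fairness_gap = stratified_auroc(y, p, g)
```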
  • 38:50 So I'll stop here in the interest of time, but I do want to read from this poem that Judith Prewitt wrote. She was one of the pioneers in analyzing microscopy images with computers, with some really pioneering work in the 1960s and 70s. She writes that optical illusions can deceive the subjective eye, but objective measurements and algorithms are assumed not to lie; it's often said that medicine could use such objectivity, and thought that this justifies machine intelligence activity; artificial intelligence is another craze that uses computers to cope with the diagnostic maze; though the criteria for intelligence have never been resolved, paper after paper claims the problem has already been solved. So we still have a way to go before we address some of these critical issues that exist with developing effective machine learning algorithms for pathology. And I'd like to thank all the funding that we received to do this work, as well as all the PhD students and postdocs who have worked in the lab.
  • 39:56 Yes, another question. So we know that these models can have hallucinations and will always come up with an answer. What if the same thing happens with PathChat?
  • 40:16 Yeah, well, we are trying to get it to stop. So for the multimodal large language model that we built, which is prone to hallucinations, we built in a lot of guardrails, so it would stop from making a diagnosis if it's not sure. Or, you know, if you give a model that's entirely trained on pathology images an image of a cat, it would still say maybe it's squamous cell carcinoma. You don't want it to do that, so we build in guardrails that prevent hallucinations. More data for training prevents hallucinations, and better pretraining data also prevents hallucinations. I've actually seen that.
  • 41:14 These models never refuse to give an answer; they always come back with something. Going back to the previous question, is there a safeguard, particularly with your PathChat? Have you ever seen PathChat say, "I don't know"?
  • 41:34 We are trying to build in those guardrails. There's this whole field of study in machine learning where you get the model to abstain from making predictions. It's ongoing research in that area: how do you abstain from making a prediction, and if the model is not sure, how does it abstain?
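A very simple version of that abstention idea is thresholding the model's confidence and returning "I don't know" below it, as sketched below; the threshold and the use of raw softmax confidence are illustrative, and calibrated or conformal approaches would usually be preferred.

```python
import torch

def predict_or_abstain(logits, threshold=0.8):
    """Return predictions, marking low-confidence cases with -1 ("I don't know")."""
    probs = torch.softmax(logits, dim=-1)
    conf, pred = probs.max(dim=-1)
    pred = pred.clone()
    pred[conf < threshold] = -1
    return pred, conf

preds, confidence = predict_or_abstain(torch.randn(5, 3))
```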
  • 41:50 So the latest question is about PathChat: how widely is it available? Is PathChat available within your department and beyond?
  • 42:00 We have about a hundred people using it. The issue with making it available too widely is that it's expensive; the deployment is expensive because it actively uses GPUs in the background. But our group has spun off a startup company that plans to make it more widely available, and they're expanding the amount of data that they're using for training as well as for evaluation. So hopefully, over time, I think it will become available.
  • 43:03Yeah, that's a great question. There are techniques you can use to minimize the batch effect. The spatial transcriptomic results that I showed are consistent: they come from within the same data collection pipeline used to build the 3D model.
  • 43:23But there's some other work from my group, the HEST benchmark and the corresponding library, which has tools you can use to reduce the batch effect.
  • 43:36That said, you probably cannot eliminate the batch effect completely. You can probably reduce the batch effect that exists in the image to a degree; if it's something that's consistent across all of your data, you could perhaps eliminate it. But site-specific batch effect, when you have lots of spatial transcriptomic data that you're bundling together to perhaps train a contrastive model, is very difficult to remove.
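As a rough illustration of one generic way to reduce site-specific batch effects before pooling spatial transcriptomics data, the sketch below applies a simple per-site standardization. This is not the HEST library's method; the matrix shapes and the normalization choice are assumptions for illustration.

```python
import numpy as np

def per_site_standardize(expr: np.ndarray, site_ids: np.ndarray) -> np.ndarray:
    """Center and scale gene expression within each collection site so that
    per-site shifts in mean and variance do not dominate a pooled analysis.
    expr: (n_spots, n_genes) matrix; site_ids: (n_spots,) site labels."""
    out = np.empty_like(expr, dtype=float)
    for site in np.unique(site_ids):
        mask = site_ids == site
        mu = expr[mask].mean(axis=0)
        sd = expr[mask].std(axis=0) + 1e-8  # avoid division by zero
        out[mask] = (expr[mask] - mu) / sd
    return out
```

Stronger corrections (ComBat-style empirical Bayes methods, for example) exist, but even those only remove effects that are estimable within each site, which echoes the point above that site-specific effects are hard to eliminate entirely.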
  • 45:03Yeah. So, your first question, about the patch size. There are a number of different ways to think about this. The first thing is that the majority of computational pathology studies work at a single resolution: they patch everything out into 256 by 256 images. And in the majority of cases, that works just fine for whole slide level classification, because the morphologic feature you're trying to identify is clearly evident in a 256 by 256 patch.
  • 45:35I've had pathologists tell me, "I can't identify this at a 256 by 256 patch, so why is this working so well even though each patch is mutually exclusive and the model has no context?" That's perhaps because features are being extracted and then aggregated, and that aggregation somehow accounts for the missing context.
  • 45:55That said, intuitively it doesn't quite make sense, because each patch is still mutually exclusive and the patches are not linked together. There have been a multitude of studies using graphs and other techniques to link the patches together to improve context, but they haven't shown a substantial improvement in performance.
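To make the "extract patch features, then aggregate" idea concrete, here is a minimal attention-based multiple-instance-learning sketch in PyTorch. The dimensions and the aggregation style are generic assumptions, not the speaker's specific architecture.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Aggregate per-patch feature vectors from one whole slide into a single
    slide-level prediction via learned attention weights."""
    def __init__(self, feat_dim: int = 768, hidden: int = 256, n_classes: int = 2):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, patch_feats: torch.Tensor):
        # patch_feats: (n_patches, feat_dim), e.g., one 256x256 patch per row
        weights = torch.softmax(self.attn(patch_feats), dim=0)  # (n_patches, 1)
        slide_feat = (weights * patch_feats).sum(dim=0)         # (feat_dim,)
        return self.classifier(slide_feat), weights

# Example: 1,000 patches from one slide, each embedded to 768 dimensions.
logits, attn = AttentionMIL()(torch.randn(1000, 768))
```

The attention weights are also what produce the slide-level heat maps discussed later in the Q&A: high-weight patches are the regions the model leaned on.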
  • 46:14That's one aspect. The other aspect is that the field in general is moving toward resolution-agnostic, whole slide level, fully context-aware models, where we would have a single feature representation corresponding to the whole slide that is still capable of doing whole slide level tasks.
  • 46:32Now, I think your question particularly refers to whether, if you have smaller patches, you are able to do something very fine grained. That's true: with smaller patches you're able to separate out cells and do more specific things. So what would happen in the long run? I think it would be a combination of the two. We'll have a whole slide level feature vector that can be used for downstream tasks, and smaller patches that can be used for more localized, region-specific tasks.
  • 46:59What was your second question? Special stains. Uh-huh. Special stains. Yes.
  • 47:05So, yeah, the majority of the work is based on H&E. I completely agree that as you have more of this complementary data, whether it's through special stains or immunohistochemistry, you can extract more information from these images. From the H&E alone, we can already begin to extract more information than what we can see, so by having special stains, you can extract even more.
  • 47:29So in the short term, our goal, for example with PathChat, is to see how we can go from H&E to predicting what special stains, immunohistochemistry, or other ancillary tests need to be ordered, ingest them into the same context, and see if the model can get to a pathology report.
  • 47:51It's a little bit difficult to go all the way to overall outcome using some of these special stains if there aren't large enough cohorts available. But if they're available, that's very much possible.
  • 48:00Just to give an example
  • 48:01at the bottom of the
  • 48:02special stain.
  • 48:03We use Brightcove for five
  • 48:05percent study. Just generally, for
  • 48:06us, it's five percent for
  • 48:07five percent and light and
  • 48:09dark end of the aesthetic.
  • 48:10But using, this algorithm,
  • 48:14one of the AI models
  • 48:16was working
  • 48:33Mhmm. So that's what I'm
  • 48:34talking about. We really don't
  • 48:35use it that way, but
  • 48:36Yeah. Maybe the AI can't
  • 48:38figure out those set up
  • 48:39on your side of us.
  • 48:40Yeah. They possibly used handcrafted features, like four hundred different handcrafted features, and then correlated them with outcome. You could potentially also do that directly, right, using deep features.
  • 48:52Yeah.
  • 48:56Question over here.
  • 48:58Yes.
  • 49:01Go ahead.
  • 49:03So pathologists have, like, basic textbooks that say this is this disease and this is not this disease. Have you gone back to the features that your networks are picking out and asked whether they're the same features that pathologists are using to characterize these diseases? You showed some heat maps that kind of show broad regions, but certainly some diseases are defined by particular cell types rather than broad regions.
  • 49:29Yeah. So for a lot of the disease-specific work that we have done, we did do that. We asked pathologists to look at the high attention regions and just narrate what they were seeing, and we have some of that analysis.
  • 49:45In a few studies, we also did quantitative analysis on top of it: take the high attention regions, quantify all the cells, classify them, extract handcrafted features, and correlate those handcrafted features with the regions the deep model was essentially using, to get a more quantitative assessment.
  • 50:05This was done for the cancer of unknown primary work and also for the cardiac allograft biopsy study. It was done for a number of studies. Yeah.
  • 50:15What's the scale of those regions? Is it the cell-level scale, or something larger, like tens or hundreds of cells?
  • 50:23Yeah, it depends on the study. For example, for the cancer of unknown primary study, we were able to do that at a cell level and get a quantitative assessment that the model is predominantly looking at tumor regions, and then look at what handcrafted features were being used. Yeah.
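A simplified sketch of the kind of quantitative check described here, correlating a handcrafted per-patch feature with the model's attention. The specific feature, the attention scores, and the use of Spearman correlation plus a top-region enrichment are illustrative assumptions, not the exact analysis from those studies.

```python
import numpy as np
from scipy.stats import spearmanr

def attention_feature_correlation(attention: np.ndarray,
                                  handcrafted: np.ndarray,
                                  top_frac: float = 0.1):
    """Correlate a handcrafted per-patch feature (e.g., tumor cell density)
    with the deep model's attention scores, and report how enriched the
    feature is within the highest-attention patches."""
    rho, p = spearmanr(attention, handcrafted)
    k = max(1, int(len(attention) * top_frac))
    top_idx = np.argsort(attention)[-k:]
    enrichment = handcrafted[top_idx].mean() / (handcrafted.mean() + 1e-8)
    return {"spearman_rho": rho, "p_value": p, "top_region_enrichment": enrichment}
```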
  • 50:41Yeah.
  • 50:47Yes.
  • 50:52Yes.
  • 50:56What about training on cancers that are very rare? What is the future of these models in terms of rare entities?
  • 51:07Yeah. So there are a number of different datasets used in this process. For the pretraining data, we're just using images, and there's a huge disparity. But for the instruction dataset, there isn't a huge disparity, because we try to maximize for diversity, and it's very difficult to collect that data.
  • 51:30But the performance is obviously much better on common disease entities and not so much on the rarer entities.
  • 51:41It's possible to use few-shot learning and everything that we're doing in terms of rich feature representations to improve performance on these entities. The issue is that it's very difficult to validate. Would you trust an algorithm that was trained on two images and evaluated on five?
  • 52:02So the issue is not the method. I trust few-shot learning because I can see it works well, and it's in line with what the rest of the machine learning community is seeing in every other field, so it should work here too. But the reason I would not trust it is that there isn't enough data to validate it. If a rare entity has ten or twelve cases and the model performs fine on all twelve of them, should we now trust it, or should we still get more data? So the issue is around validation for rare diseases.
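For context, a minimal few-shot (nearest-centroid, or prototype) classifier over pre-extracted foundation-model features might look like the sketch below. It is a generic illustration of why such a model can be built from a handful of examples, and equally why evaluating it on a handful of held-out cases says very little; the feature dimension and example counts are assumptions.

```python
import numpy as np

def fit_prototypes(features: np.ndarray, labels: np.ndarray) -> dict:
    """Few-shot 'training': one prototype (mean embedding) per class,
    built from only a handful of labeled examples per class."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(prototypes: dict, feature: np.ndarray):
    """Assign the class whose prototype is nearest in embedding space."""
    return min(prototypes, key=lambda c: np.linalg.norm(feature - prototypes[c]))

# Two support images per class, evaluated on a single query: easy to build,
# nearly impossible to validate convincingly at this sample size.
support = np.random.randn(4, 768)
protos = fit_prototypes(support, np.array([0, 0, 1, 1]))
print(predict(protos, np.random.randn(768)))
```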
  • 52:33One more question. You showed us in this last part of the talk...
  • 52:51Yes. Of the cells. Yeah.
  • 52:54Yeah. So we have tried a number of different approaches, including cyclic approaches. A cyclic approach is very common in computer science: you go from one modality to the other and then from the other modality back, and hope that this cycle consistency improves the prediction.
  • 53:09So it is possible to make that prediction, in particular if you use a self-supervised model that is used to predicting pathology images, like just predicting missing patches or something. So it's possible to go the other way as well. Yeah.
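A minimal sketch of the cycle-consistency idea mentioned here, written in PyTorch with two toy mapping networks between modality embedding spaces. The architecture, dimensions, and loss weighting are placeholders, not the lab's actual models.

```python
import torch
import torch.nn as nn

# Toy mappers between two modality embedding spaces, e.g.
# image features (dim 768) <-> expression features (dim 512).
img_to_expr = nn.Linear(768, 512)
expr_to_img = nn.Linear(512, 768)
params = list(img_to_expr.parameters()) + list(expr_to_img.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

def cycle_loss(img_feat: torch.Tensor, expr_feat: torch.Tensor) -> torch.Tensor:
    """Supervised mapping loss in both directions plus a cycle-consistency
    term: a feature mapped to the other modality and back should return
    close to where it started."""
    forward = nn.functional.mse_loss(img_to_expr(img_feat), expr_feat)
    backward = nn.functional.mse_loss(expr_to_img(expr_feat), img_feat)
    cycle = nn.functional.mse_loss(expr_to_img(img_to_expr(img_feat)), img_feat)
    return forward + backward + 0.1 * cycle

# One toy optimization step on a random paired batch of 32 samples.
loss = cycle_loss(torch.randn(32, 768), torch.randn(32, 512))
loss.backward()
opt.step()
```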
  • 53:26Yes.
  • 53:27A two-part question. First: these large image databases are based on years, even decades, worth of data with associated diagnoses. But anyone who has been at this for more than five years realizes that our diagnostic categories change over time. Entities that used to be called one thing may not even exist anymore.
  • 53:58So how do you expunge those sorts of things from the trained models?
  • 54:07I'll come to the second part in a second. That's a really good question, and we had a huge problem with this when we were constructing the instruction dataset.
  • 54:19The solution we came up with was, obviously, to just manually evaluate each and every training data point. But since then we have come up with an equivalency map for what a certain entity used to be called and how it merged, and that has helped us clean up a lot of the old data.
  • 54:37And we're also actively thinking about what would happen in the future, because when you have a multimodal large language model on which we spent a significant amount of computational resources, we don't want to retrain it every time there's a new blue book. So we're looking into retrieval augmented generation and other techniques that can be leveraged to update some of those diagnoses.
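As a hedged illustration of how retrieval-augmented generation can inject up-to-date terminology without retraining, the sketch below retrieves the most relevant current term from a small store and prepends it to the prompt. The embedding function, the stored terms, and the prompt format are all hypothetical stand-ins for a real encoder and a curated knowledge base.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for a real text encoder; deterministic toy vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(16)

# Hypothetical knowledge store of current preferred terms; in practice these
# would be curated entries with embeddings from the same encoder.
current_terms = ["invasive lobular carcinoma",
                 "diffuse midline glioma, H3 K27-altered"]
term_vecs = np.stack([embed(t) for t in current_terms])

def retrieve(query: str) -> str:
    """Return the stored current term most similar (cosine) to the query."""
    q = embed(query)
    sims = term_vecs @ q / (np.linalg.norm(term_vecs, axis=1) * np.linalg.norm(q) + 1e-8)
    return current_terms[int(np.argmax(sims))]

def build_prompt(question: str) -> str:
    """Prepend retrieved, up-to-date terminology so a frozen model can answer
    with current names without being retrained after each classification update."""
    return f"Current preferred terminology: {retrieve(question)}\n\nQuestion: {question}"
```

The appeal of this pattern is exactly what the answer describes: the knowledge store can be updated when a new classification is published, while the expensive model stays frozen.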
  • 55:01Okay. And then the other part: you talked briefly about the biases in the model against certain patient populations, you know, ethnicity, gender, those sorts of things. And I know there's a lot of work where, once you recognize that there's a bias in the model, you try to tweak the model a little bit to remove some of that bias.
  • 55:25What I've always been surprised about is why people don't simply pursue a different course of action and say, for example, that we need a different model for men than we do for women, rather than trying to come up with one model that works for both. I wonder what the basis for that is.
  • 55:41Right, that's a great question. I think a lot of it has to do with data: how much data is available for this. If there's a disparity between men and women for a model, would training separate models help? It could.
  • 56:05But if there are no known morphologic differences, there's a high chance that there could be some other reason, and let me give you an example.
  • 56:14We looked at a lot of these models on very fundamental tasks: can you subtype breast carcinoma, and can you subtype non-small cell lung cancer? These are tasks that have been essentially solved by machine learning, where you can get models with a 0.99 AUC, essentially perfect models.
  • 56:34Then we applied them to data from MGH and the Brigham, stratified by protected subgroups, and you find that there are large disparities across some of these groups.
  • 56:51Then you start looking deeper: why is this happening? Is there some other confounding variable contributing to this? And you find out that, well, patients who don't have insurance tend to get diagnosed late, and there are not enough advanced cases in your training set, because the training data came from a medical center in a region where most patients had insurance and were diagnosed early. There are lots of early cases in that cohort, but not enough advanced cases.
  • 57:25This is just one example. Often the confounding reason for why these disparities exist is completely different.
  • 57:32It might be interesting to do it the other way around and see if you could show the model a cancer and have it predict which subgroup it came from, to see if there are effects.
  • 57:44Right. There are a number of studies that have shown this, where they've shown that you can predict whether it's a man or a woman, or predict the race or other protected subgroups, directly from the image. Yeah.
  • 58:01Can I ask a question?
  • 58:02I don't know if you
  • 58:02can hear me.
  • 58:04Yes. Yes. It's David Klimstra.
  • 58:08Great talk, by the way.
  • 58:10You know, one of the
  • 58:11big,
  • 58:12opportunities, I guess, in diagnostic
  • 58:14AI
  • 58:15is to objectify
  • 58:17subjective
  • 58:18diagnoses like grading
  • 58:20tumors,
  • 58:21grading dysplasia, etcetera, etcetera. But
  • 58:23of course, because they're subjective,
  • 58:25the ground truth
  • 58:27in your dataset is also
  • 58:28going to be subjective. Do
  • 58:29you have any experience trying
  • 58:31to resolve that
  • 58:33problem?
  • 58:34Yeah, we have looked into that quite a lot.
  • 58:41The most obvious answer is that these are continuous biological processes that we have discretized into these diagnostic bins. If we want to stick to those diagnostic bins, and there's subjectivity in the diagnosis and there might be some erroneous diagnoses in there, deep learning is a great solution for this because it's massively robust to label noise.
  • 59:16There are studies in machine learning showing that even if you have twenty percent of your labels corrupted, you still get a classifier that's almost perfect, because it can figure out on its own what the most common features are across these images. If there are outliers, it doesn't adhere to those outliers; it fits to the data that is most common.
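The label-noise robustness claim is easy to probe on synthetic data. The toy scikit-learn sketch below (an illustrative experiment, not one from the talk) flips 20% of training labels and compares held-out accuracy with and without the corruption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class problem standing in for a well-separated diagnostic task.
X, y = make_classification(n_samples=5000, n_features=50, n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Corrupt 20% of the training labels to mimic subjective or erroneous ground truth.
rng = np.random.default_rng(0)
noisy = y_tr.copy()
flip = rng.choice(len(noisy), size=int(0.2 * len(noisy)), replace=False)
noisy[flip] = 1 - noisy[flip]

clean_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
noisy_acc = LogisticRegression(max_iter=1000).fit(X_tr, noisy).score(X_te, y_te)
print(f"clean labels: {clean_acc:.3f}, 20% corrupted labels: {noisy_acc:.3f}")
```

On a reasonably separable problem the gap between the two scores tends to be small, which is the behavior being described; how far this holds for harder, real pathology tasks is an empirical question.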
  • 59:42But now, more recently, with all these foundation models and rich feature extraction, there's an argument to be made that if we can get a whole slide level feature representation directly from the slide, we can begin to flag some of these outliers directly and immediately, without any supervised training, and have them reevaluated before a supervised model is trained. But we haven't done that yet. That's an idea that's in the works.
  • 01:00:18Very interesting. Thanks.
  • 01:00:31Yeah. So we did some work on transplants.
  • 01:00:55Yeah. Absolutely. We're very interested. We have some ongoing projects, including for... We're definitely interested. Yeah.
  • 01:01:32And a further part to that question: have you looked at using any of the AI as a follow-up quality check, correlating with the gross description to make sure that everything we see in the description is actually represented in your slides?
  • 01:01:58Right, that's kind of a long one. Okay. For your second question: no, we have not done that in terms of quality improvement. That is a great idea, though; I think machine learning can be used to do that.
  • 01:02:20For your first question, we are actively working on improving report synthesis. So far, what we have is that it's able to generate morphologic descriptions from an image and come up with a diagnosis.
  • 01:02:36More recently, with the context expansion and some work that we're doing with others, we're trying to see if, within the same context, it can ingest the initial image as well as any ancillary tests that might need to be ordered leading up to a pathology report, with the ability for the pathologist to then go back and correct anything that needs to be corrected and for the report to update based on that.
  • 01:03:02But report synthesis in the true sense, where it can actually be used, is a very hard problem. It's hard both in terms of having enough data, because you can't just use existing pathology images and reports to do this; you need fine grained morphologic descriptions at the region level, with evaluation. So it's very difficult to evaluate as well. Yeah.
  • 01:03:31I think we're five minutes over. Thank you for a great talk. Yeah.