
YSPH Biostatistics Seminar: "Three Challenges Confronting Spatiotemporal Hawkes Models"

September 28, 2021
  • 00:00<v Man>Good afternoon, everybody.</v>
  • 00:02Good morning, Professor Holbrook.
  • 00:05Today I'm honored to introduce Professor Andrew Holbrook.
  • 00:08So professor Holbrook earned his bachelor's from UC Berkeley
  • 00:11and a statistics masters and PhD from UC Irvine.
  • 00:15His research touches a number of areas
  • 00:17of biomedical interests,
  • 00:18including Alzheimer's and epidemiology.
  • 00:22He's currently an assistant professor
  • 00:24of biostatistics at UCLA, where he teaches their advanced
  • 00:27Bayesian computing course.
  • 00:29And he's the author of several pieces
  • 00:30of scientific software.
  • 00:32In all of it, I think, he's very fond of parallelization,
  • 00:37and he also has packages, including one for studying
  • 00:40Hawkes processes, which he's going to tell us...
  • 00:44Well, he's gonna tell us about the biological phenomenon
  • 00:46and what's going on today.
  • 00:48So Professor Holbrook, thank you so much.
  • 00:52<v ->Okay, great.</v>
  • 00:53Thank you so much for the kind invitation,
  • 00:57and thanks for having me this morning slash afternoon.
  • 01:02So today I'm actually gonna be kind of trying to present
  • 01:06more of a high level talk that's gonna just focus on
  • 01:10a couple of different problems that have
  • 01:14come up when modeling Hawkes processes
  • 01:18for public health data, and in particular
  • 01:21for large scale public health data.
  • 01:24So, today I'm interested in spatiotemporal data
  • 01:28in public health, and this can take a number
  • 01:30of different forms.
  • 01:33So a great example of this is in Washington D.C.
  • 01:39Here, I've got about 4,000 gunshots.
  • 01:42You'll see this figure again,
  • 01:44and I'll explain the colors to you
  • 01:46and everything like that.
  • 01:49But I just want you to see that in the year 2018 alone,
  • 01:53there were 4,000 gunshots recorded in Washington DC.
  • 01:57And this is just one example of really a gun violence
  • 02:01problem in the U.S. of epidemic proportions.
  • 02:07But spatiotemporal public health data
  • 02:10can take on many forms.
  • 02:11So here, for example, I have almost 3,000 wildfires
  • 02:18in Alaska between the years 2015 and 2019.
  • 02:24And this is actually just one piece of a larger
  • 02:30trend that's going on in the American west.
  • 02:35And then finally, another example of spatiotemporal public
  • 02:39health data is, and I believe that we don't need to spend
  • 02:44too much time on this motivation,
  • 02:46but it's the global spread of viruses.
  • 02:48So for example, here, I've got 5,000 influenza cases
  • 02:52recorded from 2000 through 2012.
  • 02:58So if I want to model this data,
  • 03:00what I'm doing is I'm modeling event data.
  • 03:02And one of the classic models for doing so,
  • 03:06really the canonical stochastic process
  • 03:12in this context, is the Poisson process.
  • 03:14And I hope that you'll bear with me if we do just a little
  • 03:18bit of review for our probability 101.
  • 03:21But we say that a counting process
  • 03:24is a homogeneous Poisson point process
  • 03:28with rate parameter lambda,
  • 03:32which is greater than zero,
  • 03:34if this process is equal to zero at time zero,
  • 03:39if its increments, excuse me,
  • 03:43if its increments over non-overlapping intervals
  • 03:48are independent random variables,
  • 03:50and then finally, if its increments
  • 03:52are Poisson distributed with mean given
  • 03:57by that rate parameter lambda
  • 04:00times the difference in the times.
  • 04:04So we can make this model
  • 04:07just a very little bit more complex.
  • 04:09We can create an inhomogeneous Poisson point process,
  • 04:13simply by saying that that rate parameter
  • 04:16is no longer fixed, but itself is a function
  • 04:20over the positive real line.
  • 04:22And here everything is the exact same,
  • 04:24except now we're saying that its increments,
  • 04:28its differences over two different time points,
  • 04:30are Poisson distributed, where now the mean is simply given
  • 04:35by the definite integral over that interval.
  • 04:40So we just integrate that rate function.
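To collect the two definitions just described in one place, here is a compact sketch reconstructed from the spoken description, with N(t) denoting the counting process:

```latex
% Homogeneous Poisson process with rate \lambda > 0
% (increments over non-overlapping intervals are independent):
N(0) = 0, \qquad N(t) - N(s) \sim \operatorname{Poisson}\big(\lambda\,(t - s)\big), \quad 0 \le s < t.

% Inhomogeneous Poisson process with rate function \lambda(\cdot) \ge 0:
N(t) - N(s) \sim \operatorname{Poisson}\!\Big(\textstyle\int_s^t \lambda(u)\,du\Big), \quad 0 \le s < t.
```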
  • 04:44Okay.
  • 04:45So then how do we choose our rate function for the problems
  • 04:48that we're interested in?
  • 04:49Well, if we return to say the gun violence example,
  • 04:53then it is plausible that at least sometimes some gun
  • 04:58violence might precipitate more gun violence.
  • 05:03So here we would say that having observed an event,
  • 05:09having observed gunshots at a certain location
  • 05:12at a certain time, we might expect that the probability
  • 05:15of observing gunshots nearby and soon after is elevated,
  • 05:23and the same could plausibly go for wildfires as well.
  • 05:28It's that having observed a wildfire in a certain location,
  • 05:33this could directly contribute to the existence
  • 05:39or to the observation of other wildfires.
  • 05:42So for example, this could happen by natural means.
  • 05:45So we could have embers that are blown by the wind,
  • 05:51or there could be a human that is in fact
  • 05:54causing these wildfires, which is also quite common.
  • 06:01And then it's not a stretch at all
  • 06:03to believe that viral observation,
  • 06:08so a child sick with influenza could precipitate
  • 06:12another child that becomes sick with influenza
  • 06:16in the same classroom and perhaps on the next day.
  • 06:23So then, the solution to building this kind of dynamic into
  • 06:27an inhomogeneous Poisson process is simply to craft
  • 06:33the rate function in a way that is asymmetric in time.
  • 06:37So here is just a regular temporal Hawkes process.
  • 06:43And what we do is we divide this rate function, lambda T,
  • 06:48which I'm showing you in the bottom of the equation,
  • 06:51into a background portion which is here.
  • 06:55I denote nu, and this nu can be a function itself.
  • 07:00And then we also have this self excitatory component C of T.
  • 07:04And this self excitatory component for time T,
  • 07:08it depends exclusively on observations
  • 07:13that occur before time T.
  • 07:17So each tn, where tn is less than T,
  • 07:22is able to contribute information
  • 07:25in some way to this process.
  • 07:29And typically G is our triggering function.
  • 07:32G is non-increasing.
  • 07:37And then the only other thing that we ask
  • 07:40is that the different events contribute
  • 07:42in an additive manner to the rate.
  • 07:45So here, we've got the background rate in this picture,
  • 07:49We have observation T1.
  • 07:50The rate increases.
  • 07:53It slowly decreases.
  • 07:55We have another observation, the rate increases.
  • 07:57And what you see is actually that after T1,
  • 08:00we have a nice little bit of self excitation as it's termed,
  • 08:04where we observe more observations.
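To make the temporal rate concrete, here is a minimal sketch of evaluating it, assuming a constant background nu and the exponential triggering function introduced later in the talk; the function and variable names are illustrative and not taken from the actual software.

```python
import numpy as np

def hawkes_rate(t, event_times, nu=0.5, theta=0.8, omega=1.0):
    """lambda(t) = nu + theta * omega * sum_{t_n < t} exp(-omega * (t - t_n)).

    nu      : background rate
    theta   : weight on the self-excitatory component
    1/omega : temporal length scale of the exponential triggering kernel
    """
    past = event_times[event_times < t]          # only earlier events contribute
    return nu + theta * omega * np.sum(np.exp(-omega * (t - past)))

# Example: the rate jumps right after the events at times 1.0 and 1.2
events = np.array([1.0, 1.2])
print(hawkes_rate(0.5, events), hawkes_rate(1.25, events))
```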
  • 08:09This model itself can be made just a little bit more complex
  • 08:13if we add a spatial component.
  • 08:14So here now, is the spatiotemporal Hawkes process
  • 08:18where I'm simply showing you the background process,
  • 08:22which now I'm allowing to be described
  • 08:26by a rate function over space.
  • 08:29And then, we also have the self excitatory component,
  • 08:32which again, although it also involves
  • 08:35a spatial component in it,
  • 08:37it still has this asymmetry in time.
  • 08:40So in this picture, we have these,
  • 08:42what are often called immigrant events
  • 08:44or parent events in black.
  • 08:49And then we have the child events,
  • 08:50the offspring from these events described in blue.
  • 08:53So this appears to be a pretty good stochastic process model,
  • 08:58which is not overly complex, but is simply complex enough
  • 09:02to capture contagion dynamics.
  • 09:08So for this talk, I'm gonna be talking about some major
  • 09:11challenges that are confronting real data analysis
  • 09:17using the Hawkes process.
  • 09:19So very applied in nature, and these challenges persist
  • 09:23despite the use of a very simple model.
  • 09:26So basically, all the models that I'm showing you today
  • 09:29are variations on this extremely simple model,
  • 09:33as far as the Hawkes process literature goes.
  • 09:35So we assume an exponential decay triggering function.
  • 09:40So here in this self excitatory component,
  • 09:42what this looks like is the triggering function
  • 09:47is simply the exponentiation of negative omega,
  • 09:52where one over omega is some sort of length scale.
  • 09:56And then we've got T minus tn.
  • 09:58Again, that difference between a time T
  • 10:01and preceding event times.
  • 10:05And then we're also assuming Gaussian kernel
  • 10:07spatial smoothers, very simple.
  • 10:09And then finally, another simplifying assumption
  • 10:12that we're making is separability.
  • 10:14So, in these individual components of the rate function,
  • 10:20we always have separation between the temporal component.
  • 10:25So here on the left, and then the spatial component
  • 10:28on the right, and this is a simplifying assumption.
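Putting those three assumptions together, the rate function has roughly the following shape (a hedged reconstruction from the description; the exact normalization on the slides may differ):

```latex
\lambda(\mathbf{x}, t)
  = \nu(\mathbf{x}, t)
  + \theta \sum_{t_n < t}
      \underbrace{\omega\, e^{-\omega (t - t_n)}}_{\text{temporal}}
      \;
      \underbrace{\phi\big(\mathbf{x} - \mathbf{x}_n;\ \sigma^2 I\big)}_{\text{spatial Gaussian kernel}}
```

Here separability means each summand factors into a temporal piece times a spatial piece, and phi denotes a bivariate Gaussian density.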
  • 10:34So what are the challenges that I'm gonna present today?
  • 10:37The first challenge is big data, because when we are modeling
  • 10:42many events, what we see is that the computational complexity
  • 10:46of actually carrying out inference,
  • 10:51whether using maximum likelihood or using say,
  • 10:54Markov chain Monte Carlo,
  • 10:56well, that's actually gonna explode quickly,
  • 10:57the computational complexity.
  • 10:59Something else is the spatial data precision.
  • 11:02And this is actually related to big data.
  • 11:06As we accrue more data,
  • 11:08it's harder to guarantee data quality,
  • 11:11but then also the tools that I'm gonna offer up to actually
  • 11:15deal with poor spatial data precision are actually
  • 11:18gonna also suffer under a big data setting.
  • 11:21And then finally, big models.
  • 11:24So, you know, when we're trying to draw very specific
  • 11:27scientific conclusions from our model, then what happens?
  • 11:31And all these data, excuse me,
  • 11:33all these challenges are intertwined,
  • 11:34and I'll try to express that.
  • 11:39Finally today, I am interested in scientifically
  • 11:43interpretable inference.
  • 11:46So, I'm not gonna talk about prediction,
  • 11:48but if you have questions about prediction,
  • 11:51then we can talk about that afterward.
  • 11:53I'm happy to.
  • 11:57Okay.
  • 11:58So I've shown you this figure before,
  • 12:00and it's not the last time that you'll see it.
  • 12:02But again, this is 4,000 gunshots in 2018.
  • 12:05This is part of a larger dataset that's made available
  • 12:07by the Washington DC Police Department.
  • 12:12And in fact, from 2006 to 2018,
  • 12:15we have over 85,000 potential gunshots recorded.
  • 12:20How are they recorded?
  • 12:21They're recorded using the help of an acoustic gunshot
  • 12:24locator system that uses the actual acoustics
  • 12:28to triangulate the time and the location
  • 12:32of the individual gunshots.
  • 12:35So in a 2018 paper, Charles Loeffler and Seth Flaxman,
  • 12:40they used a subset of this data in a paper entitled
  • 12:44"Is Gun Violence Contagious?"
  • 12:46And they in fact apply to Hawkes process model
  • 12:49to try to determine their question,
  • 12:51the answer to their question.
  • 12:53But in order to do, though,
  • 12:55they had to significantly subset.
  • 12:57They took roughly 10% of the data.
  • 13:00So the question is whether their conclusions,
  • 13:02which were in fact yes, to the affirmative,
  • 13:06they were able to detect this kind of contagion dynamics.
  • 13:11But the question is, do their results hold
  • 13:14when we analyze the complete data set?
  • 13:18So for likelihood based inference,
  • 13:20which we're going to need to use in order to learn,
  • 13:25in order to apply the Hawkes process to real-world data,
  • 13:30the first thing to see is that the likelihood
  • 13:34takes on the form of an integral term on the left.
  • 13:39And then we have a simple product of the rate function
  • 13:43evaluated at our individual events, observed events.
  • 13:50And when we consider the log likelihood,
  • 13:53then it in fact will involve this term that I'm showing you
  • 13:58on the bottom line, where it's the sum
  • 14:00of the log of the, again, the rate function evaluated
  • 14:04at the individual events. (background ringing)
  • 14:07I'm sorry.
  • 14:08You might be hearing a little bit of the sounds
  • 14:10of Los Angeles in the background, and there's very little
  • 14:14that I can do about Los Angeles.
  • 14:16So moving on.
  • 14:19So this summation in the log likelihood occurs.
  • 14:25It actually involves a double summation.
  • 14:28So it is the sum over all of our observations,
  • 14:32of the log of the rate function.
  • 14:34And then, again, the rate function because of the very
  • 14:37specific form taken by the self excitatory component
  • 14:41is also gonna involve this summation.
  • 14:45So the upshot is that,
  • 14:49every time we evaluate the log likelihood,
  • 14:53we're going to need to evaluate N choose two terms,
  • 14:59where N is the number of data points,
  • 15:01in this summation right here,
  • 15:06and then we're gonna need to sum them together.
  • 15:09And then the gradient also features this,
  • 15:16quadratic computational complexity.
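To make that scaling concrete, here is a deliberately naive sketch of the double summation for a purely temporal model (illustrative only; the real implementations live in compiled C++ and CUDA):

```python
import numpy as np

def log_rate_sum_naive(times, nu=0.5, theta=0.8, omega=1.0):
    """sum_n log lambda(t_n) for a temporal Hawkes process, the slow O(N^2) way."""
    total = 0.0
    for n, t_n in enumerate(times):        # outer loop over events
        rate = nu
        for t_m in times[:n]:              # inner loop over all earlier events
            rate += theta * omega * np.exp(-omega * (t_n - t_m))
        total += np.log(rate)
    return total
```

Every evaluation of this sum, and likewise of its gradient, touches on the order of N choose 2 pairs, which is exactly what explodes as the number of events grows.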
  • 15:21So the solution, the first solution that I'm gonna offer up
  • 15:23is not a statistical solution.
  • 15:25It's a parallel computing solution.
  • 15:27And the basic idea is, well, all of these terms that we need
  • 15:31to sum over, evaluate and sum over, let's do it all at once
  • 15:36and thereby speed up our inference.
  • 15:41I do so using multiple computational tools.
  • 15:44So the first one is I use CPUs, just multi-core CPUs.
  • 15:50These can have anywhere from two to 100 cores.
  • 15:54And then I combine this with something called SIMD,
  • 15:58single instruction multiple data, which is vectorization.
  • 16:02So the idea, the basic idea is that I can apply a function,
  • 16:09the same function, the same instruction set to an extended
  • 16:13register or vector of input data, and thereby speed up
  • 16:20my computing by a factor that is proportional
  • 16:24to the size of the vector that I'm evaluating
  • 16:27my function over.
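The flavor of the idea, loosely: rather than handling one pair of events at a time, the same instruction is applied to a whole vector of time differences at once. A rough numpy analogy of that inner computation (numpy dispatches to vectorized kernels under the hood; this is not the actual C++ code):

```python
import numpy as np

def inner_sum_vectorized(t_n, earlier_times, theta=0.8, omega=1.0):
    """Self-excitatory contribution at t_n, computed over all earlier events at once."""
    diffs = t_n - earlier_times                            # one vector of time differences
    return theta * omega * np.exp(-omega * diffs).sum()    # same instruction, many data
```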
  • 16:29And then, I actually can do something better than this.
  • 16:33I can use a graphics processing unit,
  • 16:35which instead of hundreds of cores, has thousands of cores.
  • 16:39And instead of SIMD, or it can be interpreted as SIMD,
  • 16:42but Nvidia likes to call it a single instruction
  • 16:45multiple threads or SIMT.
  • 16:47And here, the major difference
  • 16:50is the scale at which it's occurring.
  • 16:54And then, the other difference is that actually
  • 16:58individual threads or small working groups of threads
  • 17:01on my GPU can work together.
  • 17:03So actually the tools that I have available are very complex
  • 17:07and require a lot of care.
  • 17:10There's a lot of need to carefully code this up.
  • 17:13The solution is not statistical, but it's very much
  • 17:18an engineering solution.
  • 17:19But the results are really, really impressive
  • 17:24from my standpoint, because I can compare.
  • 17:27So on the left, I'm comparing relative speed-ups against
  • 17:32a very fast single-core SIMD implementation.
  • 17:40So my baseline right here is the bottom of this blue curve.
  • 17:44The X axis is giving me the number of CPU threads
  • 17:48that I'm using, between one and 18.
  • 17:52And then, the top line is not using CPU threads.
  • 17:55So I just create a top-line that's flat.
  • 17:58These are the GPU results.
  • 18:01If I don't use SIMD, if I use non-vectorized
  • 18:04single-core computing, of course, this is still
  • 18:06a pre-compiled C++ implementation.
  • 18:08So it's fast or at least faster than R,
  • 18:11and I'll show you that on the next slide.
  • 18:13If I do that, then AVX is twice as fast.
  • 18:17As I increased the number of cores,
  • 18:21my relative speed up increases,
  • 18:24but I also suffer diminishing returns.
  • 18:28And then that is actually all these simulations
  • 18:31on the left-hand plot.
  • 18:33That's for a fixed amount of data.
  • 18:34That's 75,000 randomly generated data points
  • 18:38at each iteration of my simulation.
  • 18:42But I can also just look at the seconds per evaluation.
  • 18:45So that's my Y axis on the right-hand side.
  • 18:49So ideally I want this to be as low as possible.
  • 18:53And then I'm increasing the number of data points
  • 18:56on the Y axis, on the X axis, excuse me.
  • 19:00And then as the number of threads that I use,
  • 19:03as I increased the number of threads,
  • 19:05then my implementation is much faster.
  • 19:08But again, you're seeing this quadratic computational
  • 19:12complexity at play, right.
  • 19:14All of these lines are looking rather parabolic.
  • 19:18Finally, I go down all the way to the bottom,
  • 19:21where I've got my GPU curve,
  • 19:22again suffering the
  • 19:25quadratic computational complexity,
  • 19:27which we can't get past, but doing a much better job
  • 19:31than the CPU computing.
  • 19:32Now you might ask, well, you might say,
  • 19:35well, a 100 fold speed up is not that great.
  • 19:38So I'd put this in perspective and say, well,
  • 19:41what does this mean for R, which I use every day?
  • 19:45Well, what it amounts to,
  • 19:49and here, I'll just focus on the relative speed up
  • 19:51over our implementation on the right.
  • 19:55The GPU is reliably over 1000 times faster.
  • 20:04So the way that Charles Loeffler and Seth Flaxman
  • 20:12obtained a subset of their data was actually
  • 20:16by thinning the data.
  • 20:21They needed to do so because of the sheer computational
  • 20:24complexity of using the Hawkes model.
  • 20:27So, I'm not criticizing this in any way,
  • 20:30but I'm simply pointing out why our results
  • 20:34using the full data set, differ.
  • 20:36So on the left, on the top left,
  • 20:40we have the posterior density for the spatial length scale
  • 20:44of the self excitatory component.
  • 20:46And when we use the full data set,
  • 20:48then we believe that we're operating more at around 70
  • 20:51meters instead of the 126 inferred in the original paper.
  • 20:56So one thing that you might notice is our posterior
  • 21:01densities, in blue, are much more concentrated
  • 21:08than the original analysis, in salmon.
  • 21:12And this of course makes sense.
  • 21:14We're using 10 times the amount of the data.
  • 21:18Our temporal length scale
  • 21:20is also, we believe, much smaller, in fact.
  • 21:24So now it's down to one minute instead of 10 minutes.
  • 21:28Again, this could be interpreted
  • 21:29as the simple result of thinning.
  • 21:32And then finally, I just want to focus on this on
  • 21:35the green posterior density.
  • 21:41This is the proportion of events that we're interpreting
  • 21:45that arise from self excitation or contagion dynamics.
  • 21:50Experts believe that anywhere between 10 and 18% of gun
  • 21:56violence events are retaliatory in nature.
  • 21:59So actually our inference is kind of agreeing with that;
  • 22:07it's safely within the band suggested by the experts.
  • 22:15Actually, another thing that we can do,
  • 22:18and that is also pretty computationally demanding,
  • 22:22so this is also quadratic computational complexity,
  • 22:27again, is post-processing.
  • 22:30So if, for example, for individual events,
  • 22:32we want to know the probability that the event arose
  • 22:38from retaliatory gun violence,
  • 22:41then we could look at the self excitatory component
  • 22:46of the rate function divided by the total rate function.
  • 22:49And then we can just look at the posterior
  • 22:51distribution of this statistic.
  • 22:55And this will give us our posterior probability
  • 22:58that the event arose from contagion dynamics at least.
  • 23:04And you can see that we can actually observe
  • 23:06a very wide variety of values.
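A minimal sketch of that post-processing statistic, assuming we already have posterior draws of the temporal-model parameters (the names are illustrative, and the real computation of course includes the spatial component as well):

```python
import numpy as np

def excitation_probability(event_idx, times, posterior_draws):
    """Posterior samples of xi(t_n) / lambda(t_n) for one event.

    posterior_draws: iterable of (nu, theta, omega) tuples from MCMC.
    """
    t_n = times[event_idx]
    earlier = times[times < t_n]
    samples = []
    for nu, theta, omega in posterior_draws:
        xi = theta * omega * np.exp(-omega * (t_n - earlier)).sum()  # self-excitatory part
        samples.append(xi / (nu + xi))                               # fraction of the total rate
    return np.array(samples)
```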
  • 23:23So the issue of big data is actually not gonna go away,
  • 23:28as we move on to discussing spatial data precision.
  • 23:33Now, I'll tell you a little bit more about this data.
  • 23:38All the data that we access, which is freely accessible online,
  • 23:42is rounded to the nearest 100 meters
  • 23:48by the DC Police Department.
  • 23:51And they do this for reasons of privacy.
  • 23:58So one immediate question that we can ask is, well,
  • 24:01how does this rounding actually affect our inference?
  • 24:10Now we actually observed wildfires
  • 24:13of wildly different sizes.
  • 24:16And the question is, well, how does...
  • 24:23If we want to model the spread of wildfires,
  • 24:28then it would be useful to know
  • 24:30where the actual ignition site,
  • 24:33the site of ignition was.
  • 24:37Where did the fire occur originally?
  • 24:41And many of these fires are actually discovered
  • 24:44out in the wild, far away from humans.
  • 24:48And there's a lot of uncertainty.
  • 24:50There's actually a large swaths of land that are involved.
  • 24:57Finally, this, this global influenza data
  • 25:00is very nice for certain reasons.
  • 25:03For example, it features all of the observations,
  • 25:07actually provide a viral genome data.
  • 25:10So we can perform other more complex
  • 25:12analyses on the data.
  • 25:14And in fact, I'll do that in the third section
  • 25:17for related data.
  • 25:21But the actual spatial precision for this data is very poor.
  • 25:25So, for some of these viral cases,
  • 25:29we know the city in which it occurred.
  • 25:32For some of them, we know the region
  • 25:34or the state in which it occurred.
  • 25:35And for some of them, we know the country
  • 25:37in which it occurred.
  • 25:40So I'm gonna start with the easy problem,
  • 25:42which is analyzing the DC gun violence, the DC gunshot data.
  • 25:48And here again, the police department rounds the data
  • 25:50to the nearest hundred meters.
  • 25:52So what do we do?
  • 25:53We take that at face value and we simply
  • 25:57place a uniform prior over the 10,000-square-meter box
  • 26:04that is centered at each one of our observations.
  • 26:06So here I'm denoting our actual data,
  • 26:10our observed data with this kind of Gothic X,
  • 26:15and then I'm placing a prior over the location
  • 26:17at which the gunshot actually occurred.
  • 26:19And this is a uniform prior over a box centered at my data.
  • 26:23And using this prior actually has another interpretation
  • 26:28similar to some other concepts
  • 26:33from the missing data literature.
  • 26:36And use of this prior actually corresponds to using
  • 26:40something called the grouped data likelihood.
  • 26:43And it's akin to the expected complete-data likelihood
  • 26:48if you're familiar with the missing data literature.
  • 26:53So what we do, and I'm not gonna get too much into
  • 26:57the inference at this point, but we actually use MCMC
  • 27:00to simultaneously infer the locations,
  • 27:04and the Hawkes model parameters,
  • 27:08the rate function parameters at the same time.
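A cartoon of one ingredient of that MCMC, the location update, assuming the uniform-box prior just described (the actual sampler updates the locations and the Hawkes parameters jointly and is far more carefully engineered; log_lik here is a stand-in for the Hawkes log likelihood as a function of the latent locations):

```python
import numpy as np

def update_location(i, latent_xy, observed_xy, log_lik, half_width=50.0, rng=np.random):
    """Metropolis update for the latent location of event i.

    Proposes uniformly over the 100 m x 100 m box centered at the observed
    (rounded) location; with a uniform prior and a uniform proposal over the
    same box, the acceptance ratio reduces to a likelihood ratio.
    """
    proposal = latent_xy.copy()
    proposal[i] = observed_xy[i] + rng.uniform(-half_width, half_width, size=2)
    if np.log(rng.uniform()) < log_lik(proposal) - log_lik(latent_xy):
        return proposal       # accept the proposed location
    return latent_xy          # otherwise keep the current one
```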
  • 27:12So here, I'm just showing you a couple of examples
  • 27:15of what this looks like.
  • 27:16For each one of our observations colored yellow,
  • 27:20we then have 100 posterior samples.
  • 27:25So these dynamics can take on different forms
  • 27:28and they take on different forms in very complex ways,
  • 27:32simply because what we're essentially doing when we're...
  • 27:38I'm going to loosely use the word impute.
  • 27:41When we're imputing this data, when we're actually inferring
  • 27:44these locations, we're basically simulating
  • 27:47from a very complex n-body problem.
  • 27:53So on the left, how can we interpret this?
  • 27:57Well, we've got these four points and the model believes
  • 28:01that actually they are farther away
  • 28:02from each other than observed.
  • 28:04Why is that?
  • 28:05Well, right in the middle here, we have a shopping center,
  • 28:09where there's actually many less gunshots.
  • 28:13And then we've got residential areas
  • 28:15where there are many more gunshots on the outside.
  • 28:18And the bottom right, we actually have all of these,
  • 28:26we believe that the actual locations of these gunshots
  • 28:30collect closer together, kind of toward a very high
  • 28:34intensity region in Washington, DC.
  • 28:39And then we can just think about
  • 28:41the general posterior displacement.
  • 28:44So the mean posterior displacement.
  • 28:46So in general, are there certain points that,
  • 28:50where the model believes that the gunshots occurred
  • 28:53further away from the observed events?
  • 28:58And in general, there's not really.
  • 29:01It's hard to come up with any steadfast rules.
  • 29:04For example, in the bottom right, we have some shots,
  • 29:08some gunshots that show a very large posterior displacement,
  • 29:13and they're in a very high density region.
  • 29:15Whereas on the top, we also get large displacement
  • 29:19and we're not surrounded by very many gunshots at all.
  • 29:21So it is a very complex n-body problem
  • 29:24that we're solving.
  • 29:27And the good news is, for this problem,
  • 29:30it doesn't matter much anyway.
  • 29:32The results that we get are pretty much the same.
  • 29:37I mean, so from the standpoint of statistical significance,
  • 29:42we do get some statistically significant results.
  • 29:45So in this figure, on the top,
  • 29:47I'm showing you 95% credible intervals,
  • 29:51and this is the self excitatory spatial length scale.
  • 29:56We believe that it's smaller,
  • 29:57but from a practical standpoint, it's not much smaller.
  • 30:01It's a difference between 60 meters
  • 30:03and maybe it's at 73 meters, 72 meters.
  • 30:13But we shouldn't take too much comfort
  • 30:16because actually as we increase the spatial prec-
  • 30:19excuse me, as we decrease the spatial precision,
  • 30:22we find that the model that does not take account
  • 30:26of the rounding, performs much worse.
  • 30:29So for example, if you look in the table,
  • 30:33then we have the fixed locations model,
  • 30:37where I'm not actually inferring the locations.
  • 30:40And I just want to see, what's the empirical coverage
  • 30:45of the 95% credible intervals?
  • 30:48And let's just focus on the 95%
  • 30:53credible intervals, specifically,
  • 30:55simply because actually the other intervals,
  • 30:59the 50% credible interval, the 80% credible interval,
  • 31:03they showed a similar dynamic, which is that, as we,
  • 31:10so if we start on the right-hand side,
  • 31:13we have precision down to 0.1.
  • 31:16This is a unitless example.
  • 31:19So we have higher precision, actually.
  • 31:22Then we see that we have very good coverage,
  • 31:24even if we don't take this locational
  • 31:31coarsening into account.
  • 31:33But as we increase the size of our error box,
  • 31:38then we actually lose coverage,
  • 31:41and we deviate from that 95% coverage.
  • 31:44And then finally, if we increase too much,
  • 31:46then we're never actually going to be
  • 31:51capturing the true spatial length scale,
  • 31:57whereas if we actually do sample the locations,
  • 31:59we perform surprisingly well,
  • 32:01even when we have a very high amount of spatial coarsening.
  • 32:08Well, how else can we break the model?
  • 32:11Another way that we can break this model,
  • 32:13and by break the model, I mean, my naive model
  • 32:16where I'm not inferring the locations.
  • 32:18Another way that we can break this model
  • 32:22is simply by considering data
  • 32:24where we have variable spatial coarsening.
  • 32:28That is where different data points
  • 32:32are coarsened different amounts,
  • 32:34so we have a variable precision.
  • 32:40So considering the wildfire data,
  • 32:43we actually see something with the naive approach
  • 32:48where we're not inferring the locations.
  • 32:51We actually see something that is actually recorded
  • 32:56elsewhere in the Hawkes process literature.
  • 33:00And that is that when we try to use a flexible
  • 33:05background function, as we are trying to do,
  • 33:07then we get this multimodal posterior distribution.
  • 33:13And that's fine.
  • 33:14We can also talk about it in a frequentist,
  • 33:17from the frequency standpoint,
  • 33:19because it's observed there as well
  • 33:21in the maximum likelihood context, which is,
  • 33:25we still see this multimodality.
  • 33:29What specific form does this multimodality take?
  • 33:34So what we see is that we get modes around the places
  • 33:40where the background rate parameters,
  • 33:47the background length scale parameters are equal
  • 33:50to the temporal, excuse me, the self excitatory
  • 33:54length scale parameters.
  • 33:56So for the naive model, in mode A,
  • 34:00it believes that the spatial length scale
  • 34:03is about 24 kilometers, and that the spatial length scale
  • 34:07of the self excitatory dynamics
  • 34:09is also roughly 24 kilometers.
  • 34:14And then for the other mode,
  • 34:16we get equal temporal length scales.
  • 34:20So here, it believes 10 days, and 10 days
  • 34:24for the self excitatory and the background components.
  • 34:27And this can be very bad indeed.
  • 34:29So for example, for mode A,
  • 34:31it completely, the Hawkes model completely fails
  • 34:36to capture seasonal dynamics, which is the first thing
  • 34:40that you would want it to pick up on.
  • 34:43The first thing that you would want it to understand
  • 34:47is that wildfires...
  • 34:49Okay, I need to be careful here
  • 34:51because I'm not an expert on wildfires.
  • 34:55I'll go out on a limb and say,
  • 34:56wildfires don't happen in Alaska during the winter.
  • 35:03On the other hand, when we use the full model
  • 35:05and we're actually simultaneously inferring the locations,
  • 35:08then we get this kind of Goldilocks effect,
  • 35:11where here, the spatial length scale
  • 35:14is somewhere around 35 kilometers,
  • 35:17which is between the 23 kilometers and 63 kilometers
  • 35:21for modes A and B, and we see that reliably.
  • 35:33I can stop for some questions because I'm making good time.
  • 35:44<v Man>Does anybody have any questions, if you want to ask?</v>
  • 35:52<v Student>What's the interpretation</v>
  • 35:53of the spatial length scale and the temporal length scale?
  • 35:56What do those numbers actually mean?
  • 35:59<v ->Yeah, thank you.</v>
  • 36:02So, the interpretation of the...
  • 36:06I think that the most useful interpretation,
  • 36:11so just to give you an idea of how they can be interpreted.
  • 36:15So for example, for the self excitatory component, right,
  • 36:20that's describing the contagion dynamics.
  • 36:23What this is saying is that if we see a wildfire,
  • 36:29then we expect to observe another wildfire
  • 36:34with mean distribution of one day.
  • 36:41So the temporal length scale is in units days.
  • 36:46So in the full model, after observing the wildfire,
  • 36:50we expect to see another wildfire with mean, you know,
  • 36:54on average, the next day.
  • 36:56And this of course, you know, we have this model
  • 37:02that's taking space and time into account.
  • 37:05So the idea though, is that because of the separability
  • 37:10in our model, we're basically simply
  • 37:12expecting to see it somewhere.
  • 37:19<v Student>Thank you.</v>
  • 37:24<v Man>Any other questions?</v>
  • 37:26(man speaking indistinctly)
  • 37:31<v Student>Hi, can I have one question?</v>
  • 37:35<v ->Go ahead.</v>
  • 37:36<v Student>Okay.</v>
  • 37:38I'm curious.
  • 37:38What is a main difference between
  • 37:39the naive model A and the naive model B?
  • 37:43<v ->Okay.</v>
  • 37:44So, sorry.
  • 37:45This is...
  • 37:47I think I could have presented
  • 37:49this aspect better within the table itself.
  • 37:52So this is the same exact model.
  • 37:58But all that I'm doing is I'm applying
  • 38:01the model multiple times.
  • 38:03So in this case, I'm using Markov chain Monte Carlo.
  • 38:07So one question that you might ask is,
  • 38:10well, what happens when I run MCMC multiple times?
  • 38:16Sometimes I get trapped in one mode.
  • 38:20Sometimes I get trapped in another mode.
  • 38:22You can just for, you know, a mental cartoon,
  • 38:25we can think of like a (indistinct)
  • 38:27a mixture of Gaussian distribution, right.
  • 38:30Sometimes I can get trapped in this Gaussian component.
  • 38:34Sometimes I could get trapped in this Gaussian component.
  • 38:38So there's nothing intrinsically wrong with multimodality.
  • 38:44We prefer to avoid it as best we can simply because it makes
  • 38:47interpretation much more difficult.
  • 38:52In this case, if I only perform inference
  • 38:56and only see mode A, then I'm never actually gonna be
  • 39:00picking up on seasonal dynamics.
  • 39:07Does that (indistinct)?
  • 39:10<v Woman>Yeah, it's clear.</v>
  • 39:12<v Instructor>Okay.</v>
  • 39:13<v Woman>Okay, and I also (indistinct).</v>
  • 39:16So for the full model, you can capture
  • 39:18the spatial dynamic property.
  • 39:21So how to do that?
  • 39:23So I know you need the Hawkes process that sees,
  • 39:25clarifies the baseline.
  • 39:28So how do you estimate a baseline part?
  • 39:32<v ->Oh, okay, great.</v>
  • 39:35In the exact same way.
  • 39:37<v Student>Okay, I see.</v>
  • 39:39<v ->So I'm jointly, simultaneously performing inference</v>
  • 39:45over all of the model parameters.
  • 39:47And I can go all the way back.
  • 39:53Right.
  • 39:54'Cause it's actually a very similar model.
  • 39:58Yes.
  • 39:59So this is my baseline.
  • 40:02And so, for example, when we're talking about that temporal
  • 40:06smooth that you saw on that last figure,
  • 40:09where I'm supposed to be capturing seasonal dynamics.
  • 40:13Well, if tau T, which I'm just calling
  • 40:18my temporal length scale, if that is too large,
  • 40:22then I'm never going to be capturing
  • 40:24those seasonal dynamics, which I would be hoping to capture
  • 40:28precisely using this background smoother.
  • 40:33<v Student>Okay, I see.</v>
  • 40:34So it looks like they assume the formula for the baseline,
  • 40:38and then you estimate some parameters in these formulas.
  • 40:42<v ->Yes.</v>
  • 40:43<v Student>In my understanding,</v>
  • 40:44in the current Hawkes literature,
  • 40:47somebody uses (indistinct) function
  • 40:49to approximate baseline also.
  • 40:52<v ->Yes.</v>
  • 40:52<v Student>This is also interesting.</v>
  • 40:54Thank you. <v ->Yes.</v>
  • 40:55Okay, okay, great.
  • 40:56I'm happy to show another, you know.
  • 40:59And of course I did not invent this.
  • 41:00This is just another tack that you can take.
  • 41:03<v Student>Yeah, yeah, yeah, yeah.</v>
  • 41:04That's interesting.
  • 41:05Thanks
  • 41:06<v ->Yup.</v>
  • 41:10<v Student>As just a quick follow up on</v>
  • 41:13when you were showing the naive model,
  • 41:16and this may be a naive question on my part.
  • 41:20Did you choose naive model A to be the one
  • 41:24that does the type seasonality or is that approach
  • 41:27just not (indistinct) seasonality?
  • 41:33<v ->So I think that the point</v>
  • 41:38is that sometimes based on, you know,
  • 41:42I'm doing MCMC.
  • 41:44It's random in nature, right.
  • 41:46So just sometimes when I do that,
  • 41:49I get trapped in that mode A,
  • 41:53and sometimes I get trapped in that mode B.
  • 42:00The label that I apply to it is just arbitrary,
  • 42:04but maybe I'm not getting your question.
  • 42:11<v Student>No, I think you did.</v>
  • 42:14So, it's possible that we detect it.
  • 42:17It's possible that we don't.
  • 42:20<v ->Exactly.</v>
  • 42:21And that's, you know,
  • 42:22<v Student>That's what it is.</v>
  • 42:23<v ->multimodality.</v>
  • 42:25So this is kind of nice though,
  • 42:27that this can actually give you,
  • 42:30that actually inferring the locations can somehow,
  • 42:35at least in this case, right,
  • 42:37I mean, this is a case study, really,
  • 42:40that this can help resolve that multimodality.
  • 42:47<v Student>Thank you.</v>
  • 42:48Yeah.
  • 42:49<v Student>So back to the comparison between CPU and GPU.</v>
  • 42:55Let's say, if we increase the thread of CPU,
  • 43:00say like to infinity, will it be possible that the speed
  • 43:06of CPU match the speed up of GPU?
  • 43:12<v ->So.</v>
  • 43:15You're saying if we increase.
  • 43:17So, can I ask you one more time?
  • 43:19Can I just ask for clarification?
  • 43:21You're saying if we increase what to infinity?
  • 43:25<v Student>The thread of CPU.</v>
  • 43:28I think in the graph you're increasing the threads
  • 43:32of CPU from like one to 80.
  • 43:35And the speed up increase as the number
  • 43:39of threads increasing.
  • 43:42So just say like, let's say the threads of CPU
  • 43:45increase to infinity, will the speed up match,
  • 43:51because GPU with like (indistinct).
  • 43:54Very high, right. <v ->Yeah, yeah.</v>
  • 43:57Let me show you another figure,
  • 44:00and then we can return to that.
  • 44:03I think it's a good segue into the next section.
  • 44:07So, let me answer that in a couple slides.
  • 44:10<v Student>Okay, sounds good.</v>
  • 44:12<v ->Okay.</v>
  • 44:13So, questions about.
  • 44:15I've gotten some good questions about how do we interpret
  • 44:18the length scales and then this makes me think about,
  • 44:23well, if all that we're doing is interpreting
  • 44:26the length scales, how much is that telling us about
  • 44:29the phenomenon that we're interested in?
  • 44:32And can we actually craft more complex hierarchical models
  • 44:37so that we can actually learn something perhaps
  • 44:41even biologically interpretable?
  • 44:43So here, I'm looking at 2014 to 2016
  • 44:47Ebola virus outbreak data.
  • 44:50This is over almost 22,000 cases.
  • 44:54And of these cases, we have about 1600
  • 45:00that are providing us genome data.
  • 45:08And then of those 1600, we have a smaller subset
  • 45:12that provide us genome data, as well as spatiotemporal data.
  • 45:20So often people use genome data, say RNA sequences in order
  • 45:27to try to infer the way that different viral cases
  • 45:29are related to each other.
  • 45:31And the question is, can we pull together sequenced
  • 45:34and unsequenced data at the same time?
  • 45:39So what I'm doing here is, again,
  • 45:42I'm not inventing this.
  • 45:44This is something that already exists.
  • 45:47So all that I'm doing is modifying my triggering function G,
  • 45:52and giving it this little N,
  • 45:54this little subscript right there,
  • 45:57which is denoting the fact that I'm allowing different viral
  • 46:01observations to contribute to the rate function
  • 46:05in different manners.
  • 46:07And the exact form that that's gonna take on
  • 46:09for my specific simple model that I'm using,
  • 46:12is I'm going to give this this data N.
  • 46:17And I'm gonna include this data N parameter
  • 46:20in my self excitatory component.
  • 46:22And this data N is restricted to be greater than zero.
  • 46:28So if it is greater than one,
  • 46:30I'm gonna assume that actually, this self excite,
  • 46:34excuse me, that this particular observation,
  • 46:37little N is somehow more contagious.
  • 46:41And if data is less than one,
  • 46:43then I'm going to assume that it's less contagious.
  • 46:48And this is an entirely unsatisfactory part of my talk,
  • 46:52where I'm gonna gloss over a massive part of my model.
  • 46:58And all that I'm gonna say is that
  • 47:02this Phylogenetic Hawkes process, which I'm gonna be telling
  • 47:05you about in the context of big modeling,
  • 47:09and that challenge is that we start
  • 47:13with the phylogenetic tree, which is simply the family tree
  • 47:16that is uniting my 1600 sequenced cases.
  • 47:22And then based on that, actually conditioned on that tree,
  • 47:25we're gonna allow that tree to inform the larger
  • 47:28covariance of my model parameters, which are then going to
  • 47:33contribute to the overall Hawkes rate function
  • 47:37in a differential manner, although it's still additive.
  • 47:45Now, let's see.
  • 47:49Do I get to go till 10 or 9:50?
  • 47:57<v Man>So you can go till 10.</v>
  • 47:59<v ->Okay, great.</v>
  • 48:00So then, I'll quickly say that if I'm inferring
  • 48:06all of these rates, then I'm inferring over 1300 rates.
  • 48:13So that is actually the dimensionality
  • 48:15of my posterior distribution.
  • 48:21So a tool that I can use,
  • 48:23a classic tool over 50 years old at this point,
  • 48:26that I can use, is I can use the random walk metropolis
  • 48:29algorithm, which is actually going to sample
  • 48:32from the posterior distribution of these rates.
  • 48:36And it's gonna do so in a manner that is effective
  • 48:40in low dimensions, but not effective in high dimensions.
  • 48:46And the way that it works is say,
  • 48:47we start at negative three, negative three.
  • 48:49What we want to do is we want to explore this high density
  • 48:52region of this bi-variate Gaussian,
  • 48:55and we slowly amble forward, and eventually we get there.
  • 49:03But this algorithm breaks down in moderate dimensions.
  • 49:07So.
  • 49:11An algorithm that I think many of us are aware of
  • 49:14at this point, that is kind of a workhorse
  • 49:16in high dimensional Bayesian inference
  • 49:18is Hamiltonian Monte Carlo.
  • 49:20And this works by using actual gradient information about
  • 49:24our log posterior in order to intelligently guide
  • 49:28the MCMC proposals that we're making.
  • 49:32So, again, let's just pretend that we start
  • 49:34at negative three, negative three,
  • 49:36but within a small number of steps,
  • 49:38we're actually effectively exploring
  • 49:40that high density region, and we're doing so
  • 49:45because we're using that gradient information
  • 49:47of the log posterior.
  • 49:51I'm not going to go too deep right now into the formulation
  • 49:56of Hamiltonian Monte Carlo, for the sake of time.
  • 50:00But what I would like to point out,
  • 50:04is that after constructing this kind of physical system
  • 50:13that is based on our target distribution
  • 50:20on the posterior distribution, in some manner,
  • 50:24we actually obtain our proposals within the MCMC.
  • 50:30We obtain the proposals by simulating, by forward simulating
  • 50:35the physical system, according to Hamilton's equations.
  • 50:40Now,
  • 50:43what this simulation involves is a massive number
  • 50:48of repeated gradient evaluations.
  • 50:53Moreover, if the posterior distribution is an ugly one,
  • 51:00that is, if it is ill-conditioned, which we interpret as,
  • 51:06the log posterior Hessian has eigenvalues
  • 51:09that are all over the place.
  • 51:12Then we can also use a mass matrix, M, which is gonna allow
  • 51:17us to condition our dynamics, and make sure that we are
  • 51:24exploring all the dimensions of our model in an even manner.
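For orientation, here is a minimal leapfrog sketch of how one HMC proposal is forward-simulated, with a diagonal mass matrix standing in for the preconditioner; the accept/reject step comparing Hamiltonians is omitted, and none of this is the production implementation.

```python
import numpy as np

def leapfrog_proposal(theta, grad_log_post, step_size=0.01, n_steps=20,
                      mass_diag=None, rng=np.random):
    """Forward-simulate Hamilton's equations to generate an HMC proposal.

    grad_log_post(theta) returns the gradient of the log posterior;
    mass_diag plays the role of the diagonal mass matrix M.
    """
    m = np.ones_like(theta) if mass_diag is None else mass_diag
    theta = theta.copy()
    momentum = rng.normal(scale=np.sqrt(m))                          # resample momentum
    momentum = momentum + 0.5 * step_size * grad_log_post(theta)     # initial half step
    for step in range(n_steps):
        theta = theta + step_size * momentum / m                     # full position step
        if step < n_steps - 1:
            momentum = momentum + step_size * grad_log_post(theta)   # full momentum step
    momentum = momentum + 0.5 * step_size * grad_log_post(theta)     # final half step
    return theta, momentum
```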
  • 51:29So the benefit of Hamiltonian Monte-Carlo is that it scales
  • 51:32to tens of thousands of parameters.
  • 51:34But the challenge is that HMC necessitates repeated
  • 51:38computation of the log likelihood,
  • 51:42its gradient, and then preconditioning.
  • 51:46And the best way that I know to precondition actually
  • 51:49involves evaluating the log likelihood Hessian as well.
  • 51:55And I told you that the challenges that I'm talking about
  • 51:57today are intertwined.
  • 51:58So what does this look like in a big data setting?
  • 52:02Well, we've already managed to speed up the log likelihood
  • 52:06computations that are quadratic in computational complexity.
  • 52:11Well, it turns out that the log likelihood gradient
  • 52:14and the log likelihood Hessian
  • 52:17are also quadratic in computational complexity.
  • 52:21So this means that as the size of our data set grows,
  • 52:24we're going to...
  • 52:27HMC, which is good at scaling to high dimensional models
  • 52:31is going to break down because it's just gonna take too long
  • 52:35to evaluate the quantities that we need to evaluate.
  • 52:43To show you exactly how these parallel
  • 52:45gradient calculations can work.
  • 52:51So, what am I gonna do?
  • 52:53I'm gonna parallelize again on a GPU
  • 52:55or a multi-core CPU implementation,
  • 53:00and I'm interested in evaluating or obtaining
  • 53:04the quantities in the red box.
  • 53:06These are simply the gradient of the log likelihood
  • 53:09with respect to the individual rate parameters.
  • 53:13And because of the summation that it involves,
  • 53:17we actually obtain in the left, top left,
  • 53:21we have the contribution of the first observation
  • 53:25to that gradient term.
  • 53:28Then we have the contribution of the second observation
  • 53:31all the way up to the big Nth observation,
  • 53:35that contribution to the gradient term.
  • 53:37And these all need to be evaluated and summed over.
  • 53:41So what do we do?
  • 53:42We just do a running total, very simple.
  • 53:45We start by getting the first contribution.
  • 53:49We keep that stored in place.
  • 53:53We evaluate the second contribution,
  • 53:56all at the same time in parallel,
  • 53:57and we simply increment our total observat-
  • 54:01excuse me, our total gradient by that value.
  • 54:05Very simple.
  • 54:06We do this again and again.
  • 54:08Kind of complicated to program, to be honest.
  • 54:11But it's simple.
  • 54:16It's simple when you think about it from the high level.
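Conceptually, the running total looks like the sketch below, with each per-observation contribution computed "all at once" and then folded into the accumulator; in the real code the work happens inside GPU kernels and the loop becomes a parallel reduction (names are illustrative):

```python
import numpy as np

def accumulate_gradient(per_obs_contribution, n_events, dim):
    """Running total of per-observation contributions to the gradient.

    per_obs_contribution(i) returns observation i's length-`dim` piece of the
    gradient, which is itself computed in parallel across threads in practice.
    """
    grad = np.zeros(dim)
    for i in range(n_events):
        grad += per_obs_contribution(i)   # increment the running total in place
    return grad
```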
  • 54:19So I showed you this figure before.
  • 54:21And well, a similar figure before,
  • 54:24and the interpretations are the same,
  • 54:26but here I'll just focus on the question that I received.
  • 54:30In the top left, we have the gradient.
  • 54:32In the bottom left, excuse me,
  • 54:34top row, we have the gradient.
  • 54:35Bottom row, we have the Hessian,
  • 54:37and here I'm increasing to 104 cores.
  • 54:42So this is not infinite cores, right.
  • 54:46It's 104.
  • 54:47But I do want you to see that there's diminishing returns.
  • 54:54And to give a little bit more technical
  • 54:57response to that question,
  • 55:02the thing to bear in mind is that
  • 55:04it's not just about the number of threads that we use.
  • 55:08It's having a lot of RAM very close
  • 55:12to where the computing is being done.
  • 55:15And that is something that GPUs,
  • 55:18modern gigantic GPS do very well.
  • 55:26So why is it important to do all this parallelization?
  • 55:28Well, this is really, I want to kind of communicate
  • 55:32this fact because it is so important.
  • 55:36This slide underlines almost the entire challenge
  • 55:40of big modeling using the spatiotemporal Hawkes process.
  • 55:44The computing to apply this model to the 20,000 plus
  • 55:49data points took about a month
  • 55:54using a very large Nvidia GV100 GPU.
  • 56:00Why?
  • 56:01Because we had to generate 100 million Markov chain states
  • 56:04at a rate of roughly three and a half million each day.
  • 56:11After 100 million Markov chain states,
  • 56:15after generating 100 million Markov chain states,
  • 56:20this is the empirical distribution on the left
  • 56:23of the effective sample sizes across,
  • 56:28across all of the individual rates that we're inferring,
  • 56:31actually all the model parameters.
  • 56:34The minimum is 222, and that's right above my typical
  • 56:39threshold of 200, because in general, we want the effective
  • 56:43sample size to be as large as possible.
  • 56:48Well, why was it so difficult?
  • 56:50Well, a lot of the posterior,
  • 56:53a lot of the marginal posteriors
  • 56:55for our different parameters were very complex.
  • 57:01So for example, here, I just have one individual rate,
  • 57:05and this is the posterior that we learned from it.
  • 57:08It's bi-modal.
  • 57:10And not only is it bi-modal,
  • 57:11but the modes exist on very different scales.
  • 57:16Well, why else is it a difficult posterior to sample from?
  • 57:19Well, because actually, as you might imagine,
  • 57:22these rates have a very complex correlation in structure.
  • 57:28This is kind of repeating something that I said earlier
  • 57:30when we were actually inferring locations,
  • 57:33which is that what this amounts to is really simulating
  • 57:36a very large n-body problem.
  • 57:44But what's the upshot?
  • 57:45Well, we can actually capture these individual rates,
  • 57:51which could give us hints at where to look for certain
  • 57:55mutations that are allowing, say in this example,
  • 58:01the Ebola virus to spread more effectively.
  • 58:05And here, red is generally the highest,
  • 58:09whereas blue is the lowest.
  • 58:13We can get credible intervals,
  • 58:15which can give us another way of thinking about, you know,
  • 58:18where should I be looking
  • 58:22in this collection of viral samples, for the next big one?
  • 58:29And then I can also ask, well, how do these rates actually
  • 58:32distribute along the phylogenetic tree?
  • 58:37So I can look for clades or groups of branches
  • 58:41that are in general, more red in this case than others.
  • 58:53So, something that I...
  • 58:55Okay, so it's 10 o'clock, and I will finish in one slide.
  • 59:03The challenges that I'm talking about today,
  • 59:05they're complex and they're intertwined,
  • 59:08but they're not the only challenges.
  • 59:10There are many challenges in the application
  • 59:14of spatiotemporal Hawkes models,
  • 59:16and there's actually a very large literature.
  • 59:21So some other challenges that we might consider,
  • 59:25and that will also be extremely challenging to overcome
  • 59:31in a big data setting.
  • 59:32So, kind of the first challenge is flexible modeling.
  • 59:38So here, we want to use as flexible
  • 59:41of a Hawkes model as possible.
  • 59:44And this challenge kind of encapsulates one of the great
  • 59:49ironies of model-based nonparametrics, which is that,
  • 59:55the precise time that we actually want to use
  • 59:58a flexible model, is the big data setting.
  • 01:00:03I mean, I don't know if you recall my earlier slide
  • 01:00:07where I was showing the posterior distribution
  • 01:00:10of some of the length scales associated with
  • 01:00:13the Washington DC data, and they're extremely tight.
  • 01:00:19But this is actually exactly where we'd want to be able
  • 01:00:24to use a flexible model, because no matter what,
  • 01:00:28if I apply my model to 85,000 data points,
  • 01:00:32I'm going to be very certain in my conclusion,
  • 01:00:36conditioned on the specific model that I'm using.
  • 01:00:41There's also boundary issues, right.
  • 01:00:43This is a huge, a huge thing.
  • 01:00:45So for those of you that are aware
  • 01:00:47of the survival literature, which I'm sure many of you are,
  • 01:00:52you know, there's censoring.
  • 01:00:54So what about gunshots that occurred right outside
  • 01:00:57of the border of Washington DC, and that occurred as a result
  • 01:01:01of gunshots that occurred within the border?
  • 01:01:03And then we can flip that on its head.
  • 01:01:05What about parent events outside of Washington DC
  • 01:01:10that precipitated gun violence within Washington DC.
  • 01:01:13And then finally, sticking with the same example,
  • 01:01:16differential sampling.
  • 01:01:20You can be certain that those acoustic gunshot
  • 01:01:27locator system sensors are not planted
  • 01:01:30all over Washington DC.
  • 01:01:34And how does their distribution affect things?
  • 01:01:41Okay.
  • 01:01:42This is joint work with Mark Suchard, also at UCLA.
  • 01:01:45And then my very good friend,
  • 01:01:47my very dear friend, Xiang Ji at Tulane.
  • 01:01:50It's funded by the K-Award, Big Data Predictive Phylogenetics
  • 01:01:54with Bayesian Learning, from the NIH.
  • 01:01:58And that's it.
  • 01:01:59Thank you.
  • 01:02:06<v Man>All right.</v>
  • 01:02:07Thank you so much, Professor Holbrook.
  • 01:02:08Does anybody have any other questions?
  • 01:02:11(people speaking indistinctly)
  • 01:02:18Yeah.
  • 01:02:21Any other questions from the room here, or from Zoom?
  • 01:02:25(people speaking indistinctly)