
is the number of people that each infected person goes on to infect. We allow it to be time-varying because we know that it varies across time, and oftentimes it is also weighted by the amount of time between being infected and infecting another person. Then we also have to model death, and deaths are often modeled as a second step. Usually we use something like a negative binomial that predicts the number of daily deaths conditional on the recent number of infections. It’s also important to remember that a lot of the probabilities we’re modeling are conditional, not marginal, probabilities.

Then we have statistical models. Here we can actually have a lot of flexibility. What you want to do is fit the data that you observed as a function of time and other covariates. A lot of the popular ones are log-linear, ARIMA (which is the AutoRegressive Integrated Moving Average family), exponential, and logistic. Generally these can accommodate the sigmoidal shape of cumulative counts. Sigmoidal is that S-shape you see, which has an exponential part and then sort of a logarithmic part. These types of models can account for this pretty easily, and they can also incorporate time-varying covariates like mobility data and social media information. I know Google and Apple have been tracking mobility, and we can put these into our models. But the caveat is that it’s hard to forecast these trends forward, because they’re unknown in the future: we sort of know what they are for the short term, but we don’t know how they may change later. It can also be harder to incorporate information on spread if we’re only modeling deaths, because we know that deaths lag infections, but deaths give us more reliable data. So a lot of the models that you see, and IHME is one of them, are modeling deaths because they’re more reliably reported.
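As a minimal sketch of the sigmoidal shape described above, the logistic curve below rises roughly exponentially at first and then levels off toward a plateau. The function names and the parameter values (`plateau`, `growth_rate`, `midpoint`) are illustrative assumptions, not values from the talk or from any fitted model.

```python
import math


def logistic_cumulative(t, plateau, growth_rate, midpoint):
    """Cumulative count at day t under a logistic (S-shaped) curve."""
    return plateau / (1.0 + math.exp(-growth_rate * (t - midpoint)))


def daily_increments(days, plateau, growth_rate, midpoint):
    """New counts per day, taken as differences of the cumulative curve."""
    cumulative = [logistic_cumulative(t, plateau, growth_rate, midpoint) for t in days]
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the shape.
    days = list(range(0, 101))
    new_per_day = daily_increments(days, plateau=10_000, growth_rate=0.15, midpoint=50)
    for day in (10, 30, 50, 70, 90):
        print(day, round(logistic_cumulative(day, 10_000, 0.15, 50)), round(new_per_day[day], 1))
```

Differencing the cumulative curve gives daily counts that rise, peak near the midpoint, and then fall, which is the pattern a logistic fit imposes on the data.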
Every day, I track down news stories of cases and testing and problems with the data, and death tends to be much more reliable in terms of statistical precision. We can incorporate all of this information, but deaths do lag infections. Whatever is going on in terms of interventions, we’ll see it first in infections, then in hospitalizations, and then finally in deaths. Another caveat is that we often have to model one outcome at a time. We are modeling infections, or we are modeling deaths, and then we need to take additional steps to predict other quantities.

Machine learning. Neural networks are often used in conjunction with compartmental models. This beta is a parameter that often relates S to I, and here are two good references where they’ve done a really nice job incorporating machine learning into these models. It can also be used for curve fitting. To our knowledge, we’re the only ones that combine all three.

Some other models to be aware of: there are agent-based models, and what they do is simulate individuals of a population and how they interact. This gives us a really nice mechanism for modeling the effect of interventions, so you don’t have that homogeneity assumption where everybody in a compartment is the same. But it does require assumptions about human behavior and interactions within the population, as well as the infectivity of SARS-CoV-2. This is not a trivial matter. Feynman, most famously, was commenting on quantum mechanics, and he was saying how much harder his job of modeling the interactions of electrons would be if they had feelings. Modeling human behavior is not for the faint of heart.

Okay, so let’s talk about our model. What we wanted to do is combine all three of these modeling techniques to get a very

because we don’t have community spread. But we can borrow strength across locations and pool those to be able to get estimates for this. You can see that there’s a lot of structure to these. So we fit this model, this nonlinear Bayesian model, and we get the posterior estimates for these locations. We can model whether or not they have interventions and what interventions they have, and incorporate uncertainty. What I was saying before is that we run the model separately for each of these half a million posterior samples for this ξ(t), so we can get interval estimates quantifying uncertainty.

Okay, so now we need to transition out of I. We’ve gone from S to I, and now we want to figure out how we get out of I. We assume that the rate is the inverse of the expected number of infectious days. From the literature it’s about 14 days, which is why we have the quarantine for 14 days. But we’re statisticians, and we never like to say we’re certain. We like to incorporate uncertainty, so we sample from a Gaussian with a mean of 14 and a variance of 1 to incorporate uncertainty into our estimates. Then the transition to death is determined by Random Forest, a machine learning algorithm that I find very useful in many applications.
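The step out of I can be sketched as follows: draw the number of infectious days from a Gaussian with mean 14 and variance 1, as described above, and take its inverse as the rate of leaving I. This is an illustration only. The function name, the seed, and the guard against non-positive draws are assumptions, and the actual model code is not shown in the talk.

```python
import random
import statistics


def sample_recovery_rates(n_samples, mean_days=14.0, variance_days=1.0, seed=0):
    """Draw infectious durations and return the implied rates of leaving I."""
    rng = random.Random(seed)
    sd_days = variance_days ** 0.5  # random.gauss expects a standard deviation
    rates = []
    while len(rates) < n_samples:
        days = rng.gauss(mean_days, sd_days)
        if days <= 0:
            continue  # assumption: discard non-physical durations
        rates.append(1.0 / days)
    return rates


if __name__ == "__main__":
    rates = sample_recovery_rates(10_000)
    print("mean rate of leaving I per day:", statistics.mean(rates))
    print("standard deviation of that rate:", statistics.stdev(rates))
```

Running the downstream model once per sampled rate, instead of once with a fixed 1/14, is what lets the uncertainty in the infectious period carry through to the interval estimates.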
Random Forest is very fast and, of course, we’re running half a million posterior samples, so we do need to make sure that our stuff scales. What is really nice about it is that it allows for easy incorporation of covariates: age, sex, race, comorbidities, density, anything that we think is useful. And unlike traditional linear models, Random Forest can actually handle more covariates than observations, and it ignores ones that aren’t useful for prediction. So we fuse all of these together so we can incorporate the strengths of three different modeling approaches.

Like we were saying at the beginning, we relied heavily on the CHIS data, especially for California. Because we live here, we were really interested in modeling California at first, and CHIS gets really nice samples of things like comorbidities, race, and gender that are representative of California as a whole. So we could get really good estimates, especially for death, because we know that outcomes in terms of morbidity and mortality differ by age, comorbidities, sex, and race, and we wanted to be able to incorporate this into our models.

Another really nice thing about Random Forest is that it allows you to look at covariate importance, and it really does find what we expected: that age and comorbidities, but also density, really matter. People living in dense households have higher risk than those who are alone or, you know, maybe in a single-family home. So we can look at these and know that our model really makes sense given the data we’re seeing worldwide. This is just the mean squared error with the permutation importance and the variable’s value.
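Permutation importance with mean squared error, as mentioned above, can be sketched in a few lines: shuffle one covariate at a time and measure how much the prediction error grows. The sketch below is not the talk's Random Forest. `toy_predict` is a hypothetical stand-in for a fitted model, and the data are synthetic, with a target that depends on an `age` column and not on a `noise` column.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MSE after shuffling each column in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column replaced.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            mse = mean_squared_error(targets, [predict(r) for r in permuted])
            increases.append(mse - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def make_toy_data(n_rows=200, seed=1):
    """Synthetic rows [age, noise]; the target depends on age only."""
    rng = random.Random(seed)
    rows = [[rng.uniform(20, 90), rng.uniform(0, 1)] for _ in range(n_rows)]
    targets = [0.01 * row[0] for row in rows]
    return rows, targets


def toy_predict(row):
    """Hypothetical stand-in for a fitted model: uses age, ignores noise."""
    return 0.01 * row[0]


if __name__ == "__main__":
    rows, targets = make_toy_data()
    age_importance, noise_importance = permutation_importance(toy_predict, rows, targets)
    print("age:", age_importance, "noise:", noise_importance)
```

A covariate the model relies on shows a clear increase in error when shuffled, while one the model ignores shows none, which is the sense in which unhelpful covariates drop out of the importance ranking.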
If you have questions, feel free to email me; this is not meant to be a technical talk. But just know that our death model really does make sense with what we’re finding across the world.

Okay, so now let’s talk about predictive accuracy. I’m going to take a minute to explain what the MASE is: it’s just the mean absolute scaled error. I know it sounds like a lot. We compute it for the posterior median number of cases and of deaths, and we do this separately. What we’re looking at is a ratio: our model versus a model where we’re just saying that cases or deaths are a random walk of the training data. I see this a lot. Everybody has a model these days, and I see it on the web where they’re just doing predictions based on where you were, maybe the week before. This is our baseline, the accuracy of a random walk forecast of the training data. So it gets to see all the data and just predicts one, two, three, four weeks forward. A MASE of one means that we’re not doing any better than a random walk. Less than one means we’re doing better. Here you can sort of see this one; when we did this, it was in early April. So here it is across time, and we’re doing much better than a random walk. For cases we’re at 0.4 and for deaths at 0.32, which is well below 1, suggesting that we have very reliable predictions.

So let’s talk about some of the results, and here is California. This is data from yesterday. This is a really fast-moving epidemic, as you know. Let me explain what this output is. This green line is the cumulative cases, not on the log scale but on the native scale. The green line with this little shaded region is like our 95 percent credible interval. For the death model it’s not quite a credible interval, because Random Forest is not a probability model, but here it is. So this line is our model, and each of these points is actual data, which we plot after we fit our models. Our model is fitting pretty well. This is the cumulative number of cases in California over time, and we’ve projected out through August.

This one is active infections, which is different from new infections and different from cumulative infections, because people stay infectious for 14 days on average and can go on to infect other people, while people who were infected earlier are now no longer infectious. That is what is shown in pink, with our 95 percent credible interval of where we expect the number of active infections to be. We don’t have any dots to plot here, because nobody knows the number of active infections, so these are our estimates.
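The MASE figures quoted earlier (0.4 for cases, 0.32 for deaths) come from a simple ratio, which can be sketched as below. This follows the standard definition of the mean absolute scaled error, where the forecast's mean absolute error is divided by the mean absolute error of a one-step random walk on the training data. Whether the talk's computation matches this in every detail is an assumption, and the numbers in the example are made up.

```python
def mase(training, actual, forecast):
    """Mean absolute scaled error of a forecast against a random walk baseline."""
    # A one-step random walk predicts each training value from the previous one.
    naive_errors = [abs(later - earlier) for earlier, later in zip(training, training[1:])]
    scale = sum(naive_errors) / len(naive_errors)
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / scale


if __name__ == "__main__":
    # Hypothetical weekly counts, for illustration only.
    training = [100, 110, 125, 145, 170]
    actual = [200, 235]
    forecast = [195, 242]
    print("MASE:", mase(training, actual, forecast))
```

A value below one means the forecast's typical error is smaller than the typical step of the random walk baseline, and a value of one means it is no better than that baseline.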
The orange is our deaths, from our death model. Because it is Random Forest, it’s very jagged: even though it looks sort of smooth, it is a discrete step function. Each of these dots is a data point, and there is a temporal trend in when cases and deaths are reported, with fewer deaths being reported on weekends. Then after weekends and holidays we see an increase in deaths, which is why it sort of goes like this. So a little happy note, and fingers crossed, I really hope that it stays true for our projections: each day, our number of deaths projected forward is starting to slightly decrease. So hopefully in California we won’t see high levels of deaths like we saw in New York. But deaths do lag infections, and we are waiting to see what happened after the Fourth of July.

This last one is R(t). This is the R value, and we allow it to move across time, and this was early in effect. So let me explain a little bit about R. R(t) changes over time. When R equals one, that means that the epidemic is stable: for every person that gets infected, they go on to infect one more person. At 1.2, for every 5 people that get infected, they go on to infect 6, and this is where we’re really starting to worry. At two is where we start seeing really explosive exponential spread. We’re hoping that we’re going to come down, and we’re starting to see signs that R(t) is decreasing in California, but it’s not quite going below one, so we’re obviously watching this every day.

Okay, Arizona. This is where I’m from and where my family’s from, so I worry about it. There’s actually good news coming from Arizona: it looks like their cases, hopefully it stays this way, are peaking and starting to decrease. We’re also seeing this in deaths. But it’s a really wide interval, so it really depends on people coming together, or coming together by staying apart, because you can see that the predicted R(t) is still well above one. I’m really hoping that the death rate stays low. In terms of new cases, it actually looks like they may be peaking, but that assumes that what’s happening continues to stay that way. Still, it’s an actual bit of good news that has been developing this week.

New York is looking really good. They were hit really, really hard, and it looks like their death rate is not increasing. They did have a little bump up in transmission here, probably from Memorial Day and all the activities afterwards, but it looks like they are well under control. Georgia, not as much: we are expecting an increase in deaths, but not as big an increase as we were seeing maybe a week ago. I tend to be an optimist and try to look at the happier side. Florida: we can see that we’re expecting a massive increase in deaths there and really high levels of transmission. Michigan is looking like it was above, but maybe we’re not going to see as large an increase in deaths. Minnesota as well.

So another caveat: a lot of people use these online tools, which I think are great, and I’m really happy people are doing that. But, well, I’m a statistician, so I always like to look at the data. At the beginning of June, what one of these tools showed for North Carolina was really different from what we were finding there. So what I wanted to do is plot what they were predicting, which is theirs here in red, versus what our model was predicting, and overlay the actual new cases. They were predicting that North Carolina was okay, that spread was not increasing, and that was the opposite of what our model was finding. Always look at the data. I don’t trust anybody else’s analysis. I always, always, always want the data, and to do it myself.

So we combined three models so we can get more accurate predictions of cases and deaths. It’s very flexible, and we hope to be able to help policymakers make informed decisions with this.
Here is the paper, and I’m happy to send it to anyone who wants more information. These are the two papers that I referenced in the talk, and I’m happy to take any questions.

Good afternoon everyone, my name is Tiffany Lopes and I’m the Director of Communications here at the UCLA Center for Health Policy Research. We’re going to be going through questions from the audience today. I encourage you to continue to use the Q&A function in Zoom to ask questions. We’ve got a lot of great questions to get through, so let’s begin with the first question. How do we know that recovery or death are the only outcomes? Is it possible to continue having this virus in some form, and if so, should that be in a model?

Yeah, so our models are very flexible, and we can model many, many different outcomes for which we have data. Where we actually have data is cases and deaths, so this is what is on the state dashboards. Hospitalization data is coming. It has not been as reliable as we would hope; we can easily model hospitalizations. With any infection, you either recover or you die, so we just put it in these sort of dichotomous terms, but there is a whole host of things that can be modeled in between. We just need the data to be able to get good models of this. I’m hoping that the hospitalization data will become really good. We do model it now, but just separately, in the states for which we have good hospitalization data. It is an excellent question, and yes, we are really limited by the data, but our models do allow for hospitalization data.

Thank you. What data sources are best for this type of modeling, and do you believe that the shifting of COVID data from the CDC to a database more overseen by the presidential administration will affect our ability to model COVID accurately?

For most people that I know that are modeling, we’re not using the CDC data. Most people that I see are using COVID tracking data, which is compiled by The Atlantic, by The New York Times, or by the states themselves. The CDC data tends to lag, oftentimes by at least a week, and for this we need daily data. So I’m literally waiting for each state, each day, to upload their new data, and then we take it. So actually, this doesn’t affect our model at all, because we don’t use it. I do find the COVID Tracking Project, and also 1Point3Acres, useful; one of our students was a part of this, and they do a very good job. What a lot of people are doing is just scraping each of these state sites for the data, because we are finding that data reliability is a massive issue and it’s hard to reconcile the data from different sources. But as for us, we do not use CDC data, so it won’t change anything for us.

Got it. How well did your model predict the increased number of cases expected when California and Los Angeles rolled back the stay-at-home orders?

I can show you. Yeah, because we do have a Bayesian model. So you can see, maybe I could have put little arrows for each of these things, and we can actually project out pretty far. But the further you project, the less reliable it is. But yeah, we actually sort of see things before they’re reported. So I think our model is really good and, especially in this R(t), really gives us an idea of where we’re going. The deaths do lag, but it’s actually quite accurate, and you can see with the deaths that it does track really well.

Thank you. Someone is interested in knowing more about how to interpret R values such as 1.2 or 1.5. Aren’t those values very concerning, giving rise to a growing number of cases, and shouldn’t those estimates be relevant to decisions such as whether to open schools?
Yes, and countries like Germany actually report the R values regularly. But, you know, you need to take the totality of evidence. This epidemic is really complex, and it’s really hard to summarize everything in one number. The easy way to understand R is when it’s one: every one person infects one more person, so you see this plateau. At 1.2, for every five people, six people are being infected. At two is where the exponential growth really takes off: two people infect four, four people infect eight, eight people infect 16, and so that’s where you really see these uncontrolled dynamics. We really do think policymakers need to look at everything, including the death curve, because what we really need to do is not overwhelm our hospital systems. If our hospitals fall, then people who are sick can’t get treatment, and perhaps these people will die when they could have been saved had there been a bed. Fortunately, in New York that did not happen. Nobody who needed a bed was denied one, and nobody was denied a ventilator, thankfully. And our ability to treat COVID has actually gotten better. You know, the treatment is better, and we’re seeing this in the death rate. In New York, the death rate was incredibly high, and even though we’re seeing cases go high, at least for right now we’re not seeing the death rates get high. I really think that our wonderful doctors and nurses and scientists have done a good job and been able to prevent deaths from COVID. So that really is good.
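The R values discussed in this answer can be made concrete with a short sketch that multiplies the number of newly infected people by R at each generation of infection. It treats R as constant and ignores the time between generations, which is a simplifying assumption; the function name is hypothetical.

```python
def new_infections_by_generation(r, initial_infected, generations):
    """Newly infected people in each generation when everyone infects r others."""
    counts = [float(initial_infected)]
    for _ in range(generations):
        counts.append(counts[-1] * r)
    return counts


if __name__ == "__main__":
    # Starting sizes follow the examples in the answer above.
    for r, start in ((1.0, 1), (1.2, 5), (2.0, 2)):
        counts = new_infections_by_generation(r, start, generations=4)
        print("R =", r, [round(c, 1) for c in counts])
```

With R at one the counts stay flat, at 1.2 they creep upward from five to six and beyond, and at two they double every generation.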

in the U.S.A.?

Oh, an excellent question. I actually have not done that, but that would be really interesting. Like I said, as our technology improves, our ability to treat improves. So I really have faith in our medical community and our scientific community that even if our cases keep going up and everybody who is susceptible gets infected, our medical ability will be able to save a lot of the people who might have died had they been infected earlier in the epidemic. But I also hope everybody doesn’t get infected. So no, I have not done that.

This is a really interesting question. Is it possible to include housing density in your models to account for vulnerable populations increasingly having to move in together, due to loss of employment in the service sectors, as an attempted mitigation of homelessness?

Absolutely, and that is one of the things that we incorporated, this population density. For states we just did something simple, you know, just the average density, knowing that New York City is much more dense than Wichita Falls. We wanted to be able to incorporate that, but it’s an absolutely brilliant question. We see this when we are looking at different counties, and actually different locations around Los Angeles County: the places that are hit harder are the ones that are much more dense or have a higher proportion of multi-family living arrangements, because you just have that much more contact between people. And places that have mass use of subways. Anywhere you see a lot of people congregated together gives this virus chances to infect other people. So we put it in our death model, but we could actually incorporate it in our mixed model as well.

Are there any policymakers that you’re working with? Are there any policymakers that are using your models, and which policies have been informed by your models?
So early on we were talking with Mayor Garcetti’s office about using masks, and with the Mayor of London, and also some people in South Africa. Mainly right now, we are working with businesses to help them open safely and protect their workers. But we’re happy to help anybody that wants our help.

Thank you. Unfortunately, those are all the questions we have time for this afternoon. If we didn’t get to your question, please email us at healthpolicy.ucla.edu and we will get back to you. I just wanted to thank you all for attending this month’s webinar on “Combining Traditional Modeling with Machine Learning for Predicting COVID-19,” and a big thank you to Dr. Christina Ramirez for presenting this very timely study. We will be posting the recording of this webinar online with closed captions within the next two weeks, so visit healthpolicy.ucla.edu. As a reminder, if you’d like a copy of today’s presentation, you can also email us at healthpolicy.ucla.edu. And stay tuned for details on our next webinar in August, featuring the students and activities of the Native Hawaiian Pacific Islander COVID-19 race tracker lab happening right here at the UCLA Center for Health Policy Research. Have a wonderful rest of the day. Thank you.
