- [Narrator] The goal of this video
is to help us all estimate
the actual new COVID-19 cases per day in your area.
And it's based on analysis by Thomas Pueyo,
who wrote an incredible blog post on Medium.
This is the link and I'll also include it
in the description below.
This is the data that he uses to do some of his analysis.
Now, some of you might be thinking,
I know the number of COVID cases in my area,
they're reporting it on the news every day.
But that's the reported number of cases
and that's based on the people
that happened to get the test.
There are a lot of people who might not have symptoms yet
or their symptoms are not severe enough to get the test yet.
So the actual cases are likely far larger
than the number of confirmed cases.
And we can see that in graphical form.
Once again, this is a diagram put together by Thomas Pueyo.
It's a screenshot from his blog post
which once again could be found here.
This is all his analysis, or based off of his analysis,
but this shows you what was happening in Hubei Province,
which is the province where Wuhan is.
And there are several interesting things here.
The vertical axis is the number of cases
and the horizontal axis is the date, with one bar per day.
And so for example, we could pick January 23.
The yellow bar tells us the number of confirmed new cases
that day.
So these are people who would have been tested
and then they tested positive,
and it looks like that number is about 300.
But then we have this gray bar.
This gray bar is the actual number of new cases that day,
which is close to 2,500.
So roughly eight times as high.
Now you might be saying,
how did they know the actual number of cases
if they didn't test everyone?
Well, the way they did that is
when someone tested positive, they asked them,
when did you first get the symptoms?
And if they said, Hey, I first got the symptoms 10 days ago,
they would be included as a true new case,
an actual new case, 10 days before that, on January 13.
So Chinese officials were able
to actually make these gray bars in hindsight,
based on when people said they first got the symptoms.
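
The hindsight counting described here can be sketched in a few lines of Python. This is only an illustration of the idea, not the method the Chinese officials actually used: the function name `new_cases_by_onset` and the three records are hypothetical, made up to show how a positive test gets moved back to the day symptoms started.

```python
from collections import Counter
from datetime import date, timedelta


def new_cases_by_onset(records):
    """Count cases by the day symptoms began instead of the day of the test.

    Each record is (date of positive test, days since first symptoms).
    """
    counts = Counter()
    for test_date, days_since_symptoms in records:
        onset = test_date - timedelta(days=days_since_symptoms)
        counts[onset] += 1
    return counts


# Hypothetical records, NOT real data from Hubei.
records = [
    (date(2020, 1, 23), 10),  # tested Jan 23, symptoms began Jan 13
    (date(2020, 1, 23), 4),   # tested Jan 23, symptoms began Jan 19
    (date(2020, 1, 25), 12),  # tested Jan 25, symptoms began Jan 13
]

onset_counts = new_cases_by_onset(records)
for day in sorted(onset_counts):
    print(day.isoformat(), onset_counts[day])
```

Each gray bar corresponds to one of these onset-date counts, which is why it can only be filled in after the tests come back.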
And there's a lot of really interesting information here.
First of all, we can see that Wuhan
was shut down on January 23.
So let's draw a line between the pre shut down
and post shut down.
And you can see just as the city officials
were starting to see confirmed cases,
the actual cases were far higher,
but then they shut down the city
essentially significantly slowing down the spread rate.
And a few days later, the actual cases
which they were able to calculate in hindsight,
start to flatten out and then go down.
But even though they were going down,
the confirmed new cases continued to go up
because there is a delay.
You can even see the delay right over here.
And that is roughly the amount of time
between when people show symptoms
and they are actually tested.
Now you might be saying, all right, this isn't too bad.
It looks like things eventually became okay for Wuhan.
But this is because they did a very serious shut down.
If they did not do this shut down
and slow the spread of the virus,
you would have seen this exponential growth continue.
It's also worth remembering what I just drew this curve on.
This isn't the total number of cases.
This is the number of new cases per day.
If you want the total number of cases
at a given point in time,
you would have to sum up the gray or the yellow bars
depending on whether you want to look
at actual or confirmed cases.
So as of January 22, if you total up
all of these gray bars over here, as of January 22,
you get approximately 12,000 cases,
while if you add up all of the yellow bars,
that is roughly only 444 confirmed cases.
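
Summing up the bars is a running total. A minimal sketch in Python, assuming two short lists of daily counts: the numbers below are placeholders made up for illustration and are not the Hubei figures from the chart, and `cumulative_totals` is a hypothetical helper name.

```python
from itertools import accumulate


def cumulative_totals(daily_new_cases):
    """Turn new cases per day into total cases as of each day."""
    return list(accumulate(daily_new_cases))


# Placeholder daily counts (illustrative only, not the chart's data).
daily_confirmed = [1, 0, 3, 5, 9]       # the "yellow bars"
daily_actual = [10, 25, 60, 140, 300]   # the "gray bars"

total_confirmed = cumulative_totals(daily_confirmed)
total_actual = cumulative_totals(daily_actual)

print("confirmed so far:", total_confirmed[-1])
print("actual so far:   ", total_actual[-1])
```

The last entry of each running total is the "as of that day" number, which is how the roughly 12,000 and 444 figures would be read off the real chart.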
So before the city even went into shutdown,
and this is with the Chinese doing reasonably good testing,
you had a far higher number of cases
than the confirmed cases would make you believe.
And as large as the ratio is on a given day
before the city shut down,
between the number of actual new cases per day
and the number of confirmed new cases per day,
it's probably higher
in a lot of the geographies where we live,
because we're not testing as well as the Chinese did.
For example,
this is data once again compiled by Thomas Pueyo
on his blog post.
This is just a screen capture of it
and I'm really just giving his analysis.
This shows the total tests performed,
and the tests performed per million citizens as of March 3,
and you can see for example,
the United States, where I live, is not doing so well.
And so the number of reported cases
in places like the United States
where we are really just starting to ramp up testing
is far understating the number of actual cases out there.
So how do we go about estimating
the actual number of cases in our area?
Well, once again, I'm going to use Thomas's analysis,
we're gonna be looking at the number of deaths
and estimations of mortality rate,
time from infection to death,
and how fast the virus actually spreads.
So in other videos, I'll talk more about
some of Thomas's analysis.
But for mortality rate, to make the math simple,
and this actually does seem to be a pretty good estimate,
we can assume that there's a 1% mortality rate.
The reports are as low as 0.6% in South Korea,
and then as high as roughly 5% in places like Iran.
But it looks like the higher numbers
are where the hospital system is being overwhelmed.
And then the lower numbers at the 0.6%,
might not be fully accounting for all of the mortality
that will happen due to the cases
that are actually out there.
So we'll assume a mortality rate of 1%.
The other thing we need to think about
is the time from infection to death in those 1% of cases
where someone does die.
And to figure that out,
I will look at this data right over here.
This top chart, and it comes from this link,
which Thomas cites.
And I'll give the link in the description below.
This is the incubation period.
This is an estimate of the time
from when someone gets infected
to when they start to show symptoms.
And this estimate is roughly five days.
And then once you see symptoms,
how long does it take until death in those 1% of cases,
or whatever the percentage is?
Well, there are varying estimates,
but it looks like to make the numbers easy,
we can estimate roughly 15 days.
So one way to think about it
is five days from infection to showing the symptoms,
and then another 15 days from showing the symptoms to death
for a total of 20 days from infection to death,
in what we're assuming is the 1% of cases.
So I'll write 20 days.
And now the other thing we're gonna estimate
is the days to doubling,
days to double.
This is how long does it take for the infection
to double in the population.
And this is gonna be heavily dependent
on what the population is doing,
how dense they are, how much they're interacting.
But we'll look at some of these estimates.
And they're in very different contexts.
And the lower the days to double,
the faster the virus is spreading.
While if you have a population
that's doing all the right things,
they're taking all the precautions,
the days to double will be higher.
So we could look at a conservative estimate
and take a longer doubling time than all of these estimates,
it'll make our math a little bit easier.
Let's just assume a doubling time of five days
and I'm using slightly different numbers than Thomas used,
but it will be indicative
and you can do the same analysis
with whatever estimates that you choose to do.
So let's assume five days to double,
which might be conservative,
especially for places like the United States
where we have not taken anywhere near the action
of a place like China or South Korea, or Japan.
So now let's use these numbers
to figure out what might actually be happening in our areas
based on the data that we are presented with.
So let's say that we unfortunately hear on some day,
that there is one death in our region or in our city.
Now, based on our estimates,
we're saying that the average time from infection to death
is about 20 days.
That means that that person
would have likely contracted the virus roughly 20 days ago,
20 days ago.
And so I'm gonna make a timeline.
This is 20 days ago, this would be 10 days ago,
10 days ago, this would be 15 days ago,
and then this would be five days ago.
Now it's possible that they were the only person
who contracted the virus on that day,
and then they happened to unfortunately get very sick
and then pass away 20 days later.
But if we assume that the mortality rate is roughly correct,
it's quite possible that 100 people were infected that day.
The person that we know about is that one in 100
who actually gets sick enough to pass away.
And so let's assume that 20 days ago
it was not one person, but 100 people.
So the actual number of people who were infected that day
is 100.
Once again, because it's a 1% mortality rate.
If we assumed a 0.5% mortality rate,
then we would say, all right,
there might have been 200 people infected that day,
0.5% of whom get all the way to death 20 days later.
If you assume a 5% mortality rate,
which would be a very unfortunate situation,
but that is a mortality rate that we are seeing
in different parts of the world,
then you would say,
well, maybe there were 20 people infected that day.
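
The step from deaths back to infections is a single division. Here is a minimal sketch, assuming the mortality rates mentioned in the video; the function name `infected_on_day` is a hypothetical label for illustration.

```python
def infected_on_day(deaths, mortality_rate):
    """Estimate how many people were infected on the day the deceased were infected.

    mortality_rate is a fraction, e.g. 0.01 for 1%.
    """
    if not 0 < mortality_rate <= 1:
        raise ValueError("mortality_rate must be a fraction between 0 and 1")
    return round(deaths / mortality_rate)


# One death, under the three mortality rates discussed in the video.
for rate in (0.01, 0.005, 0.05):
    print(f"{rate:.1%} mortality -> about {infected_on_day(1, rate)} infected that day")
```

The lower the assumed mortality rate, the more infections a single death implies.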
When you only have one or two or three deaths in a region
that will make the estimates more difficult.
But as unfortunately,
we are likely to see a larger number of deaths
in various regions
that will make these backward estimates
more and more reasonable.
Now if the number of infections in the population doubles
every five days, what is now going to happen?
After five days, you're going to have 200 cases
in your region, 200 cases.
Now, these wouldn't just be new cases,
this would be the cumulative total number of cases
due to those hundred.
Now, this is actually quite conservative,
because this is assuming that those 100
that were infected 20 days ago
are the only infected cases in your region.
There might be other infected cases
that were infected before that date.
But I'm just assuming that the hundred
that were infected that day are the only cases
to be conservative, and so they double after five days,
and then they'll double again after five more days.
And so you will get to 400 cases
after five more days.
And then, after five more days,
you will have doubled again and I can't even fit it
on the screen anymore.
You're going to have 800 cases
and then that means today just by evidence
of that one death,
you probably have on the order of
and I can't even draw the whole bar,
approximately 1,600 cases.
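
The timeline can be reproduced with a short loop. This is a sketch under the video's assumptions (100 people infected 20 days ago, doubling every five days, no earlier cases); `doubling_timeline` is a hypothetical helper name.

```python
def doubling_timeline(initial_cases, doubling_days, total_days):
    """List (days_ago, cumulative_cases), doubling once per doubling period."""
    timeline = []
    cases = initial_cases
    for days_ago in range(total_days, -1, -doubling_days):
        timeline.append((days_ago, cases))
        cases *= 2
    return timeline


timeline = doubling_timeline(initial_cases=100, doubling_days=5, total_days=20)
for days_ago, cases in timeline:
    label = "today" if days_ago == 0 else f"{days_ago} days ago"
    print(f"{label:>12}: {cases} cases")
```

Four doublings in 20 days multiply the starting 100 by 16, which is where the 1,600 comes from.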
And so this is just to be a little bit sobering
about how serious this is,
and how much the data that we actually get
is actually lagging the circumstances on the ground,
particularly in places like the United States,
where we are barely even getting started testing.
For example, in my county,
which is Santa Clara County in California.
We just had our second death
unfortunately reported yesterday
and there was another death five days before that.
Now, there are fewer than 100 reported cases in my county,
but based on this analysis,
the actual number of infected persons in my county
is likely to be at least a factor of 10 more than that,
and it could be as high as 1,000, 2,000, 3,000 people.
We won't know for sure
until we can do the type of hindsight analysis
that the Chinese did,
but this is to just remind us how serious
the situation actually is.
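
Because every input is an estimate, it can help to see how the answer moves when the assumptions change. The sketch below folds the whole back-of-envelope calculation into one function; the name `estimate_current_cases` and its default values are illustrative choices taken from the numbers used in this video, not an official model.

```python
def estimate_current_cases(deaths, mortality_rate=0.01,
                           days_infection_to_death=20, doubling_days=5):
    """Rough estimate of cumulative cases today implied by deaths reported today.

    Assumes the only infections are those seeded on the day the deceased
    were infected, which the video describes as conservative.
    """
    infected_then = deaths / mortality_rate
    doublings = days_infection_to_death / doubling_days
    return infected_then * 2 ** doublings


# One death today, under the three mortality rates discussed in the video.
for rate in (0.005, 0.01, 0.05):
    cases = round(estimate_current_cases(1, mortality_rate=rate))
    print(f"{rate:.1%} mortality: about {cases:,} cases today")
```

With the video's defaults, one death implies about 1,600 cases, and the estimate scales up or down as the mortality rate and doubling time are changed.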
So the big takeaway here
is to take all of this very seriously,
especially because the mortality rate itself can change
depending on how well equipped
the hospital system is to handle the situation.
If we all socially isolate and take the proper precautions,
the spread rate will be lower
and we won't overwhelm the hospital system.
And we'll hopefully be able to keep the mortality rate
as low as possible.
But if we don't take the precaution,
and if we're just complacent
because we see this lagging data
that's being reported to us because of the lack of testing
in places like the United States,
then it's very possible
that we eventually overwhelm the hospital system
in the next few weeks,
which would cause the mortality rate to go higher.