Intro

I liked the Financial Times plots for tracking the evolution of COVID-19 (https://www.ft.com/coronavirus-latest), but the FT has since switched to different plots. So here I am more or less reproducing the original ones (and adding some others). This page is generated from an R Markdown document that I’ll be re-rendering daily.

Country-level data

For now I’m using the country-level data from https://covid.ourworldindata.org:

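# Cumulative confirmed cases: one row per date, one column per country
cases = read.csv("https://covid.ourworldindata.org/data/ecdc/total_cases.csv",
                 stringsAsFactors=FALSE)
cases$date = as.POSIXct(cases$date)
# doy = whole days elapsed since 2019-12-31; round() absorbs the fractional
# days that daylight-saving shifts in the local time zone would introduce
cases$doy = round(as.numeric(difftime(cases$date, as.POSIXct("2019-12-31"), units="days")))

# Cumulative deaths, same layout as the cases table
deaths = read.csv("https://covid.ourworldindata.org/data/ecdc/total_deaths.csv",
                  stringsAsFactors=FALSE)
deaths$date = as.POSIXct(deaths$date)
deaths$doy = round(as.numeric(difftime(deaths$date, as.POSIXct("2019-12-31"), units="days")))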

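# World population data from worldbank.org (delivered as a zip archive):
temp = tempfile()
download.file("http://api.worldbank.org/v2/en/indicator/SP.POP.TOTL?downloadformat=csv", temp,
              mode="wb")  # binary mode, so the zip is not corrupted on Windows
# The version suffix in the data file's name can differ between downloads,
# so look the name up in the archive instead of hard-coding it:
zipfiles = unzip(temp, list=TRUE)$Name
popfile = grep("^API_SP\\.POP\\.TOTL", zipfiles, value=TRUE)[1]
pop = read.csv(unz(temp, popfile),
               skip=4, header=TRUE, stringsAsFactors=FALSE)
rownames(pop) = pop$Country.Name
# Uncomment to list country names in deaths that have no match in pop:
#deaths.cty = colnames(deaths)
#deaths.cty = gsub(".", " ", deaths.cty, fixed=TRUE)
#ctymap = match(deaths.cty, rownames(pop))
# deaths.cty[which(is.na(ctymap))]
unlink(temp)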

Country-level deaths and cases

Daily deaths
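
The plotting code is not included in this rendering, so the exact smoothing used for the plots below is not shown. As a rough sketch of the idea, daily counts can be derived from the cumulative totals by differencing and then smoothed with a moving average. The 7-day centered window and the toy numbers here are assumptions for illustration, not necessarily what the plots use.

```r
# Toy cumulative series (illustrative numbers only, not real data)
cum.deaths = c(0, 0, 1, 3, 6, 10, 15, 21, 28, 36)

# Daily counts are the first differences of the cumulative totals;
# prepend the first value so the result has the same length as the input.
daily = c(cum.deaths[1], diff(cum.deaths))

# Centered 7-day moving average (assumed window); the first and last
# three entries are NA because the window does not fit there.
smoothed = as.numeric(stats::filter(daily, rep(1/7, 7), sides = 2))

print(daily)
print(round(smoothed, 2))
```

A centered window keeps the smoothed curve aligned with the raw one, at the cost of leaving the most recent three days unsmoothed.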

Smoothed

Log-scale

Linear-scale

Raw

Log-scale

Linear-scale

Daily new cases

Smoothed

Log-scale

Linear-scale

Raw

Log-scale

Linear-scale

Cumulative deaths

Log-scale

Gray dashed lines are doubling times of 1, 2, 3, and 7 days (from steepest to shallowest)
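
For reference, a curve with a fixed doubling time of d days follows y(t) = y0 * 2^(t/d), which is a straight line on a log scale. The sketch below draws such guide lines for the four doubling times named in the caption. The helper name `doubling.curve`, the starting count `y0`, and the axis ranges are hypothetical choices for illustration.

```r
# Reference curve for a fixed doubling time: starting from y0, the count
# doubles every d days, so y(t) = y0 * 2^(t / d).
doubling.curve = function(t, d, y0 = 1) y0 * 2^(t / d)

t = 0:21                         # days since the starting point
doubling.times = c(1, 2, 3, 7)   # doubling times from the caption
y0 = 10                          # hypothetical starting count

# Empty plot with a log-scale y axis, then one gray dashed line per doubling time
plot(1, type = "n", xlim = range(t), ylim = c(y0, 1e5), log = "y",
     xlab = "Days since start", ylab = "Cumulative count")
for (d in doubling.times) {
  lines(t, doubling.curve(t, d, y0), col = "gray", lty = 2)
}
```

On the log scale each line has slope log10(2)/d per day, which is why the 1-day line is the steepest and the 7-day line the shallowest.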

Linear-scale

Cumulative cases

Log-scale

Gray dashed lines are doubling times of 1, 2, 3, and 7 days (from steepest to shallowest)

Linear-scale

Country-level naive case fatality rates

By “naive” I mean: at any point in time, divide the cumulative number of deaths by the cumulative number of confirmed cases. This estimate is biased in two opposing directions. It is biased high because the denominator is too small: not all cases are detected, due to lack of testing. It is biased low because some currently active cases will end in deaths that have not yet been counted. It should eventually converge on the true CFR if testing becomes widespread and the epidemic dies down.
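
As a minimal sketch of that definition, the naive CFR is an element-wise ratio of the two cumulative series. The function name `naive.cfr` and the toy numbers are made up for illustration; with the real data the inputs would be matching columns of the deaths and cases tables.

```r
# Naive CFR: cumulative deaths divided by cumulative confirmed cases.
# Toy numbers for illustration only (not real data).
toy.cases  = c(10, 50, 200, 800, 2000)
toy.deaths = c(0, 1, 4, 20, 70)

naive.cfr = function(deaths, cases) {
  # Return NA where no cases have been confirmed yet, to avoid dividing by zero
  ifelse(cases > 0, deaths / cases, NA)
}

cfr = naive.cfr(toy.deaths, toy.cases)
print(round(100 * cfr, 1))  # as a percentage
```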

Log-scale

Linear-scale

US state-level data

So far I’m using data collated by the NY Times. It comes in long format (one row per state per day) rather than one column per state, which is awkward for plotting, but oh well:

states = read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv",
                  stringsAsFactors=FALSE)
states$date = as.POSIXct(states$date)
states$doy = round(as.numeric(difftime(states$date, as.POSIXct("2019-12-31"), units="days")))

# Get this into a more sensible structure: one row per day, one column per state.
# Use the full range of days so that rows, dates, and doys always line up,
# even if a day is missing from the data.
doys = min(states$doy):max(states$doy)
dates = seq(min(states$date), by="DSTday", length.out=length(doys))

us.deaths = t(tapply(states$deaths,
                     list(factor(states$state, levels=sort(unique(states$state))),
                          factor(states$doy, levels=doys)),
                     identity))
us.deaths = as.data.frame(us.deaths)
us.deaths$date = dates
us.deaths$doy = doys

us.cases = t(tapply(states$cases,
                    list(factor(states$state, levels=sort(unique(states$state))),
                         factor(states$doy, levels=doys)),
                    identity))
us.cases = as.data.frame(us.cases)
us.cases$date = dates
us.cases$doy = doys
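
To make the reshaping step easier to follow, here is the same tapply idiom applied to a tiny made-up table. The data frame `toy` and its values are hypothetical; only the long-to-wide pattern mirrors the code above.

```r
# Toy long-format table: one row per state per day (made-up values)
toy = data.frame(state  = c("A", "B", "A", "B", "A"),
                 doy    = c(1, 1, 2, 2, 3),
                 deaths = c(0, 1, 2, 1, 5),
                 stringsAsFactors = FALSE)

# Cross-tabulate state by day; each cell holds at most one value, so
# identity() just passes it through, and empty cells become NA.
doy.levels = min(toy$doy):max(toy$doy)
wide = t(tapply(toy$deaths,
                list(factor(toy$state), factor(toy$doy, levels = doy.levels)),
                identity))
wide = as.data.frame(wide)   # rows are days, columns are states
wide$doy = doy.levels
print(wide)
```

State B has no row for day 3, so its entry there is NA rather than a dropped row, which keeps every state's column the same length.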

US state-level deaths and cases

Daily deaths

Smoothed

Log-scale

Linear-scale

Raw

Log-scale

Linear-scale

Daily new cases

Smoothed

Log-scale

Linear-scale

Raw

Log-scale

Linear-scale

Cumulative deaths

Log-scale

Gray dashed lines are doubling times of 1, 2, 3, and 7 days (from steepest to shallowest)

Linear-scale

Cumulative cases

Log-scale

Gray dashed lines are doubling times of 1, 2, 3, and 7 days (from steepest to shallowest)

Linear-scale

US state-level naive case fatality rates

Log-scale

Linear-scale