A few days ago, the Institute for Health Metrics and Evaluation (IHME) released estimates of total COVID-19 deaths for countries based on comparisons of total deaths from all causes during the pandemic period with the expected total deaths based on projections of deaths in years before 2020.
IHME estimate that by 3 May 2021, the total number of COVID-19 deaths globally was 6.93 million, more than double the reported number of deaths of 3.24 million. The estimated total for the USA was 905,289, 58% higher than the reported number of 574,043 deaths.
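As a quick check of the arithmetic behind these comparisons, the short sketch below recomputes the ratio of estimated to reported deaths from the four figures quoted above. The function name is only illustrative and not part of any IHME tooling.

```python
def ratio_and_pct_higher(estimated, reported):
    """Return (estimated / reported, percent by which estimated exceeds reported)."""
    ratio = estimated / reported
    return ratio, (ratio - 1.0) * 100.0


# Figures quoted in the text (IHME estimates vs reported deaths, to 3 May 2021)
usa_ratio, usa_pct = ratio_and_pct_higher(905_289, 574_043)
world_ratio, world_pct = ratio_and_pct_higher(6.93e6, 3.24e6)

print(f"USA:   ratio {usa_ratio:.2f}, {usa_pct:.0f}% higher than reported")
print(f"World: ratio {world_ratio:.2f}, {world_pct:.0f}% higher than reported")
```

The USA ratio of about 1.58 corresponds to the 58% quoted, and the global ratio of about 2.14 is what "more than double" refers to.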
The figure below from the IHME website shows a map of the predicted ratio of total COVID-19 deaths to reported COVID-19 deaths for March 2020 to April 2021. Ratios range from very high levels in many Eastern European and Central Asian countries to ratios that are much closer to 1 in several high-income countries. For most countries in sub-Saharan Africa, which have reported relatively low numbers of COVID-19 deaths, the estimated ratios range from about 1.6 to 4.1, suggesting that the total number of COVID-19 deaths in the region is several times higher than previously thought. Similarly, India, the country with the most recent severe wave of cases and deaths, is estimated to have an overall ratio of 2.96, which implies that the total COVID-19 death toll to date is much higher than what has been reported.
It’s just on a year since they released projections of total COVID-19 deaths based on a clearly stupid modelling process (see my earlier post), which were widely criticized by infectious disease modellers and epidemiologists (see here for example). There is evidence that the IHME’s extraordinarily optimistic projections for total deaths were seized on by the Trump White House to minimize the need to do anything to address the pandemic. So have they done a better job this time?
From reading the summary of the methods they have used (see here) it would seem so. They are much more in their comfort zone in the type of modelling needed. They are taking an approach to estimate excess deaths based on total recorded deaths minus a counterfactual baseline determined from projection of death rates for previous years. But additionally, and beyond what has been done by other estimates of excess deaths, they say that they are aiming to take into account changes in death rates during the pandemic from the following causes:
- deaths caused by COVID-19 infection;
- the increase in mortality due to needed health care being delayed or deferred during the pandemic;
- the increase in mortality due to increases in mental health disorders including depression, increased alcohol use, and increased opioid use;
- the reduction in mortality due to decreases in injuries because of general reductions in mobility associated with social distancing mandates;
- the reductions in mortality due to reduced transmission of other viruses, most notably influenza, respiratory syncytial virus, and measles; and
- the reductions in mortality due to some chronic conditions, such as cardiovascular disease and chronic respiratory disease, that occur when frail individuals who would have died from these conditions died earlier from COVID-19 instead.
They refer to a Netherlands study that suggested direct COVID-19 deaths may be higher than estimated excess deaths because deaths due to some other causes have declined during the pandemic.
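The sign logic of the six components listed above can be sketched with a toy decomposition. Every number here is hypothetical and chosen only to show how direct COVID-19 deaths can exceed net excess deaths when the reductions outweigh the indirect increases. None of these values come from IHME or the Netherlands study.

```python
# HYPOTHETICAL contributions to the change in all-cause deaths during the pandemic.
# Positive values add deaths, negative values remove deaths relative to the baseline.
components = {
    "covid_direct": 10_000,                    # deaths caused by COVID-19 infection
    "delayed_or_deferred_care": 800,           # increase
    "mental_health_alcohol_opioids": 400,      # increase
    "fewer_injuries": -600,                    # reduction (less mobility)
    "less_flu_rsv_measles": -1_500,            # reduction (less transmission)
    "chronic_deaths_displaced_by_covid": -1_200,  # frail people who died of COVID-19 instead
}

# Net excess deaths are what an "observed minus expected" calculation would see.
net_excess = sum(components.values())
covid_direct = components["covid_direct"]

print(f"Net excess deaths:      {net_excess}")
print(f"Direct COVID-19 deaths: {covid_direct}")
print(f"Excess understates COVID-19 deaths by {covid_direct - net_excess}")
```

With these made-up inputs the net excess is 7,900 against 10,000 direct COVID-19 deaths, so treating all excess as COVID-19 would understate the direct toll. With different inputs the error could just as easily run the other way.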
This is all good stuff, but they then go on to say that there is insufficient data to estimate the impact on excess mortality rates of these causes other than COVID-19. So they just calculate total excess deaths like everyone else has to date, as total deaths minus counterfactual expected deaths based on projection of previous years’ deaths, and assume the excess is all due to COVID-19. Disappointing! They do make a back-of-the-envelope estimate that there may have been a reduction of up to 615,000 deaths globally resulting from behavioral changes.
For countries without available deaths data, and where the available information suggests substantial under-reporting of COVID-19 deaths, the IHME developed a covariate-based prediction model based on infection-detection rates and location-specific fixed effects derived from published studies. As usual, their models are complex and not easily examined to understand exactly how they work and whether in fact they are producing defensible results.
The COVID-19 death statistics most commonly reported and available on websites such as Our World in Data are mostly based on confirmed COVID-19 deaths, usually defined as a death within 28 days of a positive COVID-19 test, and reported by many countries daily or weekly. Clearly these statistics depend on testing rates and the quality of reporting systems. In many countries, particularly in the early phase of the pandemic, deaths were only recorded from hospitals, missing many that occurred at home and in nursing homes.
A number of researchers and media organizations are estimating the difference between total recorded deaths and the predicted expected deaths based on trends for earlier years, similar to the estimates just released by IHME. These estimates refer to varying time periods from the start of the epidemic or to weekly totals, and can be quite sensitive to the projection methods used to estimate the counterfactual expected numbers of deaths. Excess deaths also represent not only the impact of COVID-19 but also the mostly as yet unknown impacts of the mortality increases and decreases from other causes discussed above.
I have compiled estimates of the ratio of total excess deaths to reported COVID-19 deaths from a number of sources for comparison with the IHME estimates. The table below shows estimates for selected European countries and the USA, all of which have essentially complete registration of deaths.
* Excess deaths calculated by me from total annual deaths data for years 2015 to 2020 using ordinary regression of the log of total death rate against year.
(1) Kontis et al. Nature 2020
(2) BBC News (2020)
(3) Karlinsky and Kobak (2021)
(4) Our World in Data (2021)
(5) The Economist (2021)
(6) IHME (2021)
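The starred calculation in the footnote above (regression of the log death rate against year) can be sketched roughly as follows. This is a minimal illustration, not the exact calculation used for the table: it assumes the regression is fitted to 2015-2019 and projected to 2020, and the death rates, population and reported COVID-19 deaths are all hypothetical placeholder values.

```python
import math


def fit_log_linear(years, rates):
    """Ordinary least squares fit of log(rate) = a + b * year; returns (a, b)."""
    n = len(years)
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def projected_rate(a, b, year):
    """Counterfactual death rate implied by the fitted log-linear trend."""
    return math.exp(a + b * year)


# HYPOTHETICAL crude death rates per 1,000 population, declining 1% a year
years = [2015, 2016, 2017, 2018, 2019]
rates = [9.0 * math.exp(-0.01 * (y - 2015)) for y in years]

a, b = fit_log_linear(years, rates)
expected_2020 = projected_rate(a, b, 2020)

# HYPOTHETICAL 2020 inputs
observed_2020 = 9.5          # observed death rate per 1,000
population = 10_000_000
reported_covid_deaths = 8_000

excess_deaths = (observed_2020 - expected_2020) * population / 1000
ratio = excess_deaths / reported_covid_deaths

print(f"Expected 2020 rate: {expected_2020:.3f} per 1,000")
print(f"Excess deaths: {excess_deaths:.0f}, ratio to reported: {ratio:.2f}")
```

The projected 2020 rate drives everything downstream, so fitting a different span of years or using a linear rather than log-linear trend shifts the excess and hence the ratio.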
The first thing to note is the quite wide variation in estimates from different sources. In some cases this may relate to different time periods, but for the lower four sources, which all relate to deaths up to either end 2020 or to Feb/Mar 2021, there are some quite wide variations such as a range from 1.2 to 1.6 for the USA, or 0.7 to 1.5 for the USA, and -1.5 to 1.5 for Norway. Norway illustrates a limitation of this approach in countries with low total COVID deaths (reported total around 436 for 2020) where small differences in projection methods can mean a negative versus positive estimate of the excess. Italy and Spain both had quite large epidemics, and their ratios are more consistent across sources.
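To see why the Norwegian ratio is so unstable, it helps to convert the ratios back into deaths. The sketch below uses the reported total and ratio range quoted above. The figure of roughly 40,000 total deaths a year in Norway is an outside assumption added here for scale, not a number from the sources in the table.

```python
reported_covid_deaths = 436          # Norway, 2020 (from the text)
ratio_low, ratio_high = -1.5, 1.5    # range of ratios across sources (from the text)

# Excess deaths implied by each ratio: excess = ratio * reported deaths
excess_low = ratio_low * reported_covid_deaths
excess_high = ratio_high * reported_covid_deaths
swing = excess_high - excess_low

# ASSUMPTION for scale only: roughly 40,000 deaths from all causes per year
assumed_annual_deaths = 40_000
swing_share = swing / assumed_annual_deaths

print(f"Implied excess deaths: {excess_low:.0f} to {excess_high:.0f}")
print(f"Swing of {swing:.0f} deaths is {swing_share:.1%} of assumed annual deaths")
```

Under that assumption, a difference of only about 3% in the projected baseline is enough to move the ratio across the whole range from -1.5 to 1.5.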
In general, the IHME ratios are higher than those from other analyses, suggesting that they are projecting lower counterfactual deaths for the pandemic period. Their projection method is much more complex than any of the others, and possibly is doing a better job. But it’s inherently difficult to assess the “accuracy” of counterfactuals. In this case, I think what is needed is for another group to do a similarly sophisticated set of projections from the same data, taking into account the various factors such as seasonality that IHME did, and see whether their results are similar. How much of the difference above is due to the choice of method is as yet unclear.
Finally, there is a third type of COVID-19 mortality statistic which is starting to become available. Both the USA and UK have released statistics for the year 2020 based on death registration data which show numbers of deaths where the certificate mentioned COVID-19 as an underlying or contributory cause of death. The USA also released the number of deaths in which COVID-19 is specified as the underlying cause by the certifying doctor. Both of these statistics will differ from either the reported (confirmed) COVID-19 deaths or the estimated excess deaths. COVID-19 was specified as the underlying cause in 91% of the deaths where COVID-19 was mentioned in the USA. The size of this difference, and of the differences from other types of COVID-19 mortality estimates, will depend in part on the ICD rules for determining the underlying cause from all the contributory causes, and in part on national (and individual doctor) idiosyncrasies in applying or modifying the ICD rules.