27 May 2022

You may have heard the word “methodology” bandied about over the last few weeks. Thank the World Health Organization (WHO), if you’re so inclined.

The WHO recently published a study showing that the actual number of deaths due to covid is far higher, the world over, than various countries have reported. For example, here’s what the study said about the first year of the pandemic, 2020: “By 31 December 2020, [the official death toll worldwide] stood at 1,813,188. Yet preliminary estimates suggest the total number of global deaths attributable to the covid-19 pandemic in 2020 is at least 3 million, representing 1.2 million more deaths than officially reported.” There was a similar discrepancy for 2021.

Regarding India in particular, the WHO study suggested a death toll in 2020-21 much higher than India’s own announced figure. This difference is what has prompted the recent focus on “methodology”: responding to the WHO study, the government of India has questioned the methodology that the WHO used.

So let’s talk about methodology: the WHO’s, and then that of another recent report that takes a different route. In both cases, how did these researchers come to their numbers? There are various reasons that countries might under-report deaths due to covid, and therefore reasons to search for ways to get more accurate numbers. But the goal of any such search is to estimate “excess mortality” (extra deaths, to put it bluntly) during the pandemic. The WHO itself has this definition: “The mortality above what would be expected based on the non-crisis mortality rate in the population of interest.”

That is, let’s say that in a “normal” year in the tiny country called Freedonia, approximately 1,000 people die. But along comes an abnormal year, when the dreaded disease discombobulophelia spreads through the population, and there are 1,500 deaths that year. Then the excess mortality that year is 500, which Freedonians might reasonably attribute to the ravages of discombobulophelia. But is it that easy? Again, as the WHO literature explains, reported covid numbers “miss those who died without testing, they are contingent on the country correctly defining covid as the cause-of-death and they miss the increases in other deaths that are related to the pandemic leading to overwhelmed health systems or patients avoiding care.” All this complicates the process of pinning down this excess mortality.

Yet here’s the essence of it. First, get the count of all deaths (“all-cause mortality”, or ACM) in a particular time period, say a given month. Second, use historical death data, i.e. from before the spread of the virus, to produce a number for the “expected deaths” in that same month. Then to get the excess deaths in a country in that month, you simply subtract the expected deaths from the all-cause mortality figure.
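As a rough illustration of that subtraction, here is a minimal Python sketch. It assumes the simplest possible baseline, a plain average of earlier years, which is far cruder than anything the WHO actually used. The function names and the historical counts are hypothetical, chosen only so that the numbers echo the Freedonia example.

```python
from statistics import mean


def expected_deaths(historical_counts):
    """Baseline for a period: here, simply the average of earlier years (an assumption)."""
    return mean(historical_counts)


def excess_deaths(all_cause_deaths, expected):
    """Excess mortality = observed all-cause deaths minus expected deaths."""
    return all_cause_deaths - expected


# Hypothetical counts for the same period in four pre-pandemic years.
history = [980, 1010, 1000, 1010]
expected = expected_deaths(history)      # averages to 1,000
excess = excess_deaths(1500, expected)   # 1,500 deaths observed in the abnormal period
print(f"expected: {expected}, excess: {excess}")
```

The hard part in practice is not the subtraction but the two inputs: a trustworthy count of all deaths, and a defensible baseline.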

But of course this is easier said than done. Only 99 of 194 countries actually produced regular data about deaths during some or all of the pandemic. Most of Europe and the Americas were among those 99. For the other countries, the WHO scientists used a battery of “covariates” to let them estimate the actual death toll in different countries: temperature, human development index, covid death rates, a measure of lockdown restrictions, historical data on other common diseases, life expectancy and more. To give you an idea of the mathematical complexity of this exercise, here’s a line from the WHO’s 30-page paper that explains the methodology: “We resort to a relatively simple model in which we build an overdispersed Poisson log-linear regression model for the available monthly ACM data to predict the monthly ACM in those countries with no data.”

But for some countries, India among them, the WHO extrapolated national numbers not from these factors, but from other available data. Such countries had no national ACM data for the pandemic period, but did have “subregional” data. In India’s case, that meant data from up to 17 of our states. These numbers were either officially reported, or were obtained by journalists who filed Right to Information queries and accessed Civil Registration System (CRS) data in different states. This was a worthwhile exercise because, for example, there are states in which “the CRS data showed large gaps between CRS-registered deaths for previous years and deaths for pandemic months, as well as a large gap between reported covid-19 deaths and observed mortality.” (shorturl.at/qE358)

Madhya Pradesh and Andhra Pradesh had particularly large such gaps, but other states had them too. Such state-level data allowed the WHO, via some intricate mathematical modelling, to come up with numbers for India as a whole. Yet intricate as that modelling is, it builds on two assumptions that anyone can follow. One, that the proportion of deaths in a given state compared to the whole country stays relatively constant, and that this applies to all 17 states. Two, that there are no unexpected and significant changes over time in the population of a given state (apart, that is, from natural growth) and that this too applies to all 17 states.
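To see what the first assumption buys you, consider this small Python sketch. It is not the WHO's model, which is considerably more involved; it only shows how a constant share lets deaths counted in some states be scaled up to a national figure. The function name and every number in it are hypothetical.

```python
def estimate_national_deaths(deaths_in_observed_states, historical_share):
    """Scale deaths counted in a subset of states up to a national estimate.

    Relies on the assumption that these states account for the same share
    of the country's deaths now as they did in earlier years.
    """
    if not 0 < historical_share <= 1:
        raise ValueError("historical_share must be greater than 0 and at most 1")
    return deaths_in_observed_states / historical_share


# Hypothetical: the observed states historically accounted for half of all
# deaths in the country, and recorded 500,000 deaths in some pandemic period.
national_estimate = estimate_national_deaths(500_000, 0.5)
print(f"estimated national deaths: {national_estimate:,.0f}")
```

If the share drifts during the pandemic, say because the disease hits some states much harder than others, the estimate drifts with it. That is why the assumption matters.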

In the absence of national data, are these reasonable assumptions to make? I’ll leave that for you to muse over. But using all these techniques, the WHO’s models suggest that the pandemic caused nearly 15 million deaths across the world in 2020-21 (nearly three times the reported number), and about 4.7 million in India alone.

The WHO’s methodology is not the only route to estimating the death toll. In a recent paper, Mihir Mahajan and Shekhar Sathe used insurance claims to come up with numbers ("Estimating the Impact of covid-19 in India from Life Insurance Claims", Economic and Political Weekly, May 21 2022). While data about death claims settled by the Life Insurance Corp. of India (LIC) and other insurers is publicly available, Mahajan and Sathe had to make some assumptions about it. For example:

* Claims are evenly distributed through the year. So figures reported for a financial year (April - March) can be aligned to the calendar year (January - December).

* Even though only a fraction of the population is insured, the ratio between death claims settled and deaths registered in a given year has been steady for several years at just under 25%. Thus Mahajan and Sathe could use that ratio “as a proxy to project missing death registrations in the pandemic period.”

In 2020, insurance companies settled about 2.1 million death claims. Using the proxy ratio, this suggests that the expected number of deaths that year was about 8.68 million. But only 8.12 million deaths were registered that year, meaning that about 560,000 deaths were “likely missing” from data reported by the government.
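The arithmetic for 2020 can be retraced in a few lines of Python. This is only a sketch using the rounded figures quoted above. The exact ratio Mahajan and Sathe used is not given here, so the code backs out the ratio implied by those figures; the variable and function names are hypothetical.

```python
claims_settled_2020 = 2_100_000     # "about 2.1 million" death claims settled
expected_deaths_2020 = 8_680_000    # projected from the proxy ratio, per the paper
registered_deaths_2020 = 8_120_000  # deaths actually registered

# Ratio implied by the rounded figures above (an inference, not the paper's stated value).
implied_ratio = claims_settled_2020 / expected_deaths_2020


def project_deaths(claims_settled, claims_to_deaths_ratio):
    """Project total deaths from settled claims, assuming the ratio holds steady."""
    return claims_settled / claims_to_deaths_ratio


likely_missing_2020 = expected_deaths_2020 - registered_deaths_2020

print(f"implied ratio: {implied_ratio:.1%}")
print(f"projected deaths: {project_deaths(claims_settled_2020, implied_ratio):,.0f}")
print(f"likely missing registrations: {likely_missing_2020:,}")
```

The implied ratio comes out a little over 24%, consistent with “just under 25%”.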

Given that the government has not reported the count of registered deaths in 2021, Mahajan and Sathe had to use different techniques to estimate the toll for that year. They came up with a number that’s 4.15 million more than in 2019, the year immediately before the pandemic. In total, Mahajan and Sathe found that there were about 4.71 million “extra” deaths in India in 2020-21. That is, even though they used a totally different methodology, they came up with a number that’s strikingly close to the WHO’s estimate of 4.7 million. In both cases, that’s nearly 10 times India’s officially reported number of pandemic-related deaths, which is about half a million. I’ll leave that, too, for you to muse over.
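For anyone who wants to check the totals, here is the same sum in Python. It uses only the rounded figures from this column. The official toll is taken as exactly 500,000, which is an approximation of “about half a million”, and the variable names are hypothetical.

```python
likely_missing_2020 = 560_000   # unregistered deaths estimated for 2020
extra_deaths_2021 = 4_150_000   # estimated deaths in 2021 above the 2019 level
official_toll = 500_000         # assumption: "about half a million", rounded

total_extra = likely_missing_2020 + extra_deaths_2021
multiple_of_official = total_extra / official_toll

print(f"extra deaths, 2020-21: {total_extra:,}")
print(f"multiple of official toll: {multiple_of_official:.1f}")
```

With these rounded inputs the multiple lands between 9 and 10, which is what “nearly 10 times” refers to.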

Once a computer scientist, Dilip D’Souza now lives in Mumbai and writes for his dinners. His Twitter handle is @DeathEndsFun