Pandemic life and death: WHO’s estimate is it anyway?

The World Health Organization (WHO) recently estimated that approximately 4.7 million Indians died due to covid-related causes in the years 2020 and 2021. While these headline estimates, which are roughly ten times India’s official count, have hogged the limelight, the true worth of a statistical estimation exercise lies in its methodology.

Broadly speaking, the WHO estimates the expected mortality (monthly) for 2020 and 2021 based on historical data, compares it to actual all-cause mortality (ACM) for 2020-2021 and interprets the difference as excess mortality due to covid. This is fine in theory, except for the fact that for India, the WHO does not have monthly ACM data for 2020-2021. To get around this “data gap”, the WHO “estimates” monthly ACM for India.
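The arithmetic behind this definition can be sketched in a few lines. This is only an illustration of the subtraction described above; the function name and every figure are hypothetical placeholders, not WHO or Indian data.

```python
# Illustrative only: all figures are made-up placeholders, not WHO or Indian data.

def excess_mortality(observed_acm, expected_acm):
    """Return monthly excess deaths: observed all-cause mortality minus expected."""
    if len(observed_acm) != len(expected_acm):
        raise ValueError("observed and expected series must cover the same months")
    return [obs - exp for obs, exp in zip(observed_acm, expected_acm)]

# Hypothetical monthly death counts for three months.
expected = [100_000, 102_000, 101_000]  # baseline projected from historical data
observed = [105_000, 130_000, 110_000]  # all-cause deaths for the same months

monthly_excess = excess_mortality(observed, expected)
total_excess = sum(monthly_excess)
print(monthly_excess, total_excess)
```

The point of the sketch is that the result is only as good as the `observed` series. If that series is itself modelled rather than recorded, the excess figure inherits every error in that model.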

Thus, the first point to note is that WHO estimates of excess deaths are themselves based on estimates and not observed data. Moreover, the WHO uses different models for estimating ACM for different countries. Since the WHO provides no evidence that these models yield comparable results, a cross-country comparison of ACM, and by extension of excess deaths, is uninformative.

The second point to note is that the WHO uses data from 17 states to estimate national ACM for India. However, the source of this data remains shrouded in mystery. The WHO does not disclose which states are part of the sample. An Indian government press release has stated that this data is unverified. The WHO admits that the data may not be regarded as statistics “officially produced by India”; it also admits that, besides official statistics, it uses data provided by “journalists who obtained death registration information through Right To Information requests.” From a statistical perspective, this is problematic because, as the WHO admits, different sources process and record mortality data differently. This means that splicing data from different sources can lead to systematic biases in the sample.

Thus, it is important to note that the 4.7 million number is an estimate which itself is based on estimates which are based on data that is unverified.

The third problematic aspect of the WHO’s estimation exercise is the statistical methodology it uses to estimate ACM for India from state-level data. The methodology is based on Karlinsky (2022), but is much more complex. This is because Karlinsky uses one province (Cordoba) to estimate national ACM for Argentina, while the WHO uses 17 Indian states to estimate the national ACM for India. This is compounded by the fact that the number of states for which data is available varies by month. Not only does the WHO not disclose which states are present in its sample, it also does not say which monthly observations were used.

Thus, in essence, the WHO’s estimate based on estimates was drawn from unverified data obtained using a methodology that makes an assumption for India whose validity is suspect, at the very least.

Delving deeper into the methodology yields even more disturbing issues of implementation. The WHO accepts that the accuracy of its statistical methodology depends on two critical assumptions. First, that the distribution of the pandemic over time is similar for the states in its sample and the country as a whole. Second, the states’ share of total deaths in the country is stable both historically and throughout the pandemic.
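To see why the second assumption matters, consider a deliberately simplified stand-in for the scaling step: national deaths are estimated by dividing the sampled states’ deaths by their historical share of the national total. This is not the WHO’s actual model, and the function name and all numbers are hypothetical.

```python
# A simplified stand-in for share-based scaling, NOT the WHO's actual model.
# All numbers are hypothetical and chosen only to illustrate the mechanism.

def scale_to_national(sample_deaths, assumed_share):
    """Estimate national deaths as sampled-state deaths divided by their assumed share."""
    if not 0 < assumed_share <= 1:
        raise ValueError("assumed_share must be in (0, 1]")
    return sample_deaths / assumed_share

historical_share = 0.60   # hypothetical: sampled states' pre-pandemic share of deaths
true_national = 200_000   # hypothetical true national deaths in a pandemic month
true_share = 0.75         # hypothetical: sampled states are hit harder that month

sample_deaths = true_national * true_share
estimate = scale_to_national(sample_deaths, historical_share)
overstatement = estimate - true_national
print(round(estimate), round(overstatement))
```

In this toy case the national figure is overstated by a quarter simply because the sampled states’ share moved during the pandemic, which is why evidence of a stable share is needed before scaling up.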

While Karlinsky painstakingly documents this in his study, the WHO provides no evidence that these assumptions hold for the data it uses for India. Unlike Karlinsky, the WHO does not even present a rudimentary graph showing that the states in the sample went through peaks and troughs of the pandemic at the same time as the entire country. This is critical. Given the geographical spread of India and given that the spatial and temporal distribution of the pandemic varied quite widely across Indian states, using state-level data that does not mirror the pandemic’s national spread could inflate ACM estimates.

What is even more surprising is that the WHO claims to have validated this model in a simulation (for which it provides no details) but did not do so empirically, even though empirical validation is simpler and easier. One, it could have implemented this model on countries for which both sub-national and national data is available and verified if the model-implied national ACM matches the actual observed ACM. Two, the WHO could have implemented the model on pre-pandemic sub-national and national data from India and verified if the model-implied ACM matches the reported pre-pandemic ACM for India.
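Such an empirical check need not be elaborate. The sketch below backtests a simple share-scaling rule, used here as a stand-in for the WHO’s more complex model, against reported national totals and reports the relative error for each year. The function name, the scaling rule and every figure are hypothetical.

```python
# Backtest sketch: compare model-implied national ACM with reported national ACM.
# The share-scaling rule is a stand-in for the WHO's model; all data are hypothetical.

def backtest(sample_deaths_by_year, national_deaths_by_year, assumed_share):
    """Return the relative error of the implied national total for each year."""
    errors = {}
    for year, sample in sample_deaths_by_year.items():
        implied = sample / assumed_share          # model-implied national deaths
        reported = national_deaths_by_year[year]  # reported national deaths
        errors[year] = (implied - reported) / reported
    return errors

# Hypothetical pre-pandemic counts (in thousands) for sampled states and the nation.
sample = {2017: 600, 2018: 630, 2019: 660}
national = {2017: 1000, 2018: 1000, 2019: 1200}

errors = backtest(sample, national, assumed_share=0.60)
worst = max(abs(e) for e in errors.values())
print(errors, worst)
```

If the errors on pre-pandemic years were small, the scaling assumption would gain credibility; if they were large, the pandemic-era estimates built on it would deserve correspondingly less trust.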

This study does not constitute an isolated case of statistical adventurism. It is part of a sequence of sordid events, including the WHO’s intransigence over Covaxin, its slovenly investigation of the origins of covid-19, its consistent undermining of data from the Indian government, an endless procession of ‘Indian origin’ ‘scientists’ on television flogging their ‘scientism’ with inaccurate forecasts, and lastly, ravenous hordes of ‘independent’ journalists scouring crematoriums and burial grounds with cameras and drones, ruthlessly trampling on the grief of shattered families in search of those elusive dead bodies that could prove massive undercounting of the dead in India.

These are the authors’ personal views.

V. Anantha Nageswaran & Diva Jain are, respectively, chief economic advisor to the Government of India, and director at Arrjavv.
