Data warriors driving the pandemic pushback

  • 2020 will have a profound impact on health, society and economics. What can we learn from those tracking it?
  • The government declared last month that no covid model has so far been able to predict the country’s disease trajectory accurately, but experts blame poor quality official data

Ajai Sreevatsan
Updated 24 Jul 2020, 03:29 PM IST
From left: Tomás Pueyo, vice-president of growth at Course Hero, an online learning platform; Bhramar Mukherjee, a biostatistician at the School of Public Health, University of Michigan; and Amit Basole, a labour economist at Azim Premji University.

In late February, roughly two weeks before the World Health Organization officially declared that the world was in the grip of a pandemic, San Francisco-based Tomás Pueyo began to obsessively track the journey of the novel coronavirus. “By then, this thing was already in 60 countries,” he told Mint.

One weekend in early March, when his wife began showing covid-like symptoms and had to isolate herself in a hospital, Pueyo decided to warn the world about what was coming. So, he sat down and wrote a blog post on Medium titled “Coronavirus: Why You Must Act Now”.

The data-heavy post with 13 charts would unexpectedly clock over 40 million views within a month. The first serious warning about the most consequential disease outbreak in over a century would come, for many, not via governments, but from a blogger who had no prior expertise in epidemiology or public health.

In fact, Pueyo has no particular background in science. He is a Silicon Valley growth hacker. He once created an app called Zoo World that helps people build virtual zoos. What Pueyo did understand, however, was data, thanks to a previous stint in mergers and acquisitions, he said. That and an insight into the viral growth of internet apps was enough to parse the early disease data coming out of China. “I knew what it looked like for something to grow at 10-15% day-on-day. I’ve experienced it,” he said. “I’m a nobody,” he added. “I was just this guy writing for his friends who happened to be in the right place at the right time.”
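To put the 10-15% day-on-day figure in perspective, a constant daily growth rate can be converted into a doubling time. The sketch below is only an illustration of that arithmetic, assuming steady exponential growth; the function name `doubling_time` is ours and does not come from Pueyo's post.

```python
import math

def doubling_time(daily_growth_rate: float) -> float:
    """Days needed for a count to double at a constant daily growth rate.

    daily_growth_rate is a fraction, e.g. 0.10 for 10% per day.
    Assumes steady exponential growth, which real outbreaks only approximate.
    """
    if daily_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + daily_growth_rate)

# The two ends of the range quoted in the article.
for rate in (0.10, 0.15):
    print(f"{rate:.0%} per day: doubles in about {doubling_time(rate):.1f} days")
```

At 10% a day, a count doubles in a little over seven days; at 15% a day, in about five.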

As the pandemic ripples through the world—turning esoteric concepts like doubling time and positivity rate into topics of daily conversation—scores of people have ended up in the right place at the right time. Some are statisticians or scientists; others economists; and a few are just interested volunteers. Relatively unknown academics like Lauren Gardner, who started one of the world’s first coronavirus trackers at Johns Hopkins University, or Bloomberg Economics’ chief economist Tom Orlik, who has been tracking the prospects of a global economic recovery, have been pushed into the spotlight suddenly.

In India, these efforts have become even more critical due to large gaps in official data. In Hyderabad, for instance, the location and number of containment zones is a state secret. Also, four months into the pandemic, there is still no centralized way to track all critical cases in the country, said Srujana Merugu, co-lead of Data Science India vs Covid, an informal coalition of current and former data scientists from Amazon, Google and Flipkart, among other firms.

“At the point of entry into a hospital, there is no unique id for a patient. So, even mere shifting between hospitals gets counted as a recovery sometimes,” she said.
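A toy sketch may help show why the missing identifier matters. The records, field layout and function names below are hypothetical, and the example assumes, for simplicity, that every patient eventually recovers; it is not a description of any actual hospital system.

```python
# Each tuple is one hospital stay: (patient_id, hospital, day_admitted, day_left).
# Hypothetical records, NOT real data.
stays = [
    ("P1", "Hospital A", 1, 4),
    ("P1", "Hospital B", 4, 12),   # P1 was transferred from A to B
    ("P2", "Hospital A", 2, 9),
    ("P3", "Hospital C", 3, 5),
    ("P3", "Hospital A", 5, 15),   # P3 was transferred from C to A
]

def naive_recoveries(stays):
    """Without a shared ID, every exit from a hospital looks like a recovery."""
    return len(stays)

def linked_recoveries(stays):
    """With a unique ID, each patient is counted once, however many transfers."""
    return len({patient_id for patient_id, *_ in stays})

print("Counted without unique IDs:", naive_recoveries(stays))
print("Counted with unique IDs:", linked_recoveries(stays))
```

In this made-up example, three patients produce five apparent recoveries once transfers are miscounted.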

While many Indian cities have resorted to highlighting the rising number of covid recoveries, Merugu, whose team has been volunteering to help several cities develop predictive models, said that recovery is a very fudgeable number. “When we use the official (recovery) figures, we come up with forecasts that don’t make sense.”

There are other data mysteries too. Small subsets of early data from Bengaluru and Chennai show that the average time between hospitalization and a covid-related death in India is just 2-3 days. The mean duration in the US is 15 days; in China, it is 7.5 days. “It’s very, very worrisome. This means people are seeking care very late,” Merugu said.

India is a global outlier

But there is no national attempt to collect, compile and track each death to understand what went wrong. Without data, India may never have a remedy.

However, it is in the realm of economic and social fallouts where the blind spots may be the worst. While there is some patchy information on jobs from private surveys, almost nothing quantifiable is known about incomes. Or hunger.

With government surveys largely at a standstill, over 20 private phone surveys are currently on to unravel at least small pieces of the puzzle. Rahul Sapkal, an assistant professor at the Tata Institute of Social Sciences who helms one of the phone surveys, said: “We will go to the public directly. Some of us will come together to create a private data repository to better explain what happened in 2020.”

Pulse of the pandemic

When Bhramar Mukherjee first wandered into the world of disease modelling, she was merely following in the footsteps of her Chinese colleagues at the University of Michigan who could talk about nothing but covid all through January and February. “They had been looking at the Wuhan data,” says Mukherjee, who is a biostatistician.

By mid-March, a group of US-based Indian academics had come together to set off a community data project: the COV-IND-19 Study Group. It is currently one of the most sustained modelling efforts looking at India from outside the country. “Usually, we work on math and nobody cares. But we have been chasing the pulse of the pandemic through data,” she said.

And the data shows India is likely to have nearly 2 million covid cases by 15 August, with no sign of a peak yet. “We are in this for the long haul,” Mukherjee said. India is going to experience a cascade of peaks that come in waves in different regions. In large countries where the disease has spread substantially, this is what we observe, she said. “We see the same pattern in the US.”

Modelling efforts have come under much criticism lately, with the Indian government declaring last month that no covid model has so far been able to predict the country’s disease trajectory accurately. But Mukherjee said this is partly a result of poor quality official data, which goes into the models as an input.

“For months, we have been looking for historical data on SARI (Severe Acute Respiratory Infection) and influenza-like illnesses in India. If there is any departure from the trend, something must be happening. It’s sort of like setting up an alert. But these data sets are not collected in India,” she said. Real-time information about fluctuations in SARI cases is a valuable input for correcting modelling errors. “In the US, you can go to the CDC website and see the trend with just one click,” she said. “Statisticians and modellers are not magicians. At least during the pandemic, India must filter out political bias and agenda and just give out data consistently,” Mukherjee added.
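The kind of alert Mukherjee describes can be sketched as a simple threshold check against a historical baseline. The weekly counts below are invented for illustration, and the function name and the two-standard-deviation cutoff are our assumptions; real surveillance systems use more careful seasonal baselines.

```python
from statistics import mean, stdev

def flag_departures(baseline, recent, k=2.0):
    """Return indices of recent weeks whose count exceeds
    the baseline mean plus k sample standard deviations."""
    threshold = mean(baseline) + k * stdev(baseline)
    return [i for i, count in enumerate(recent) if count > threshold]

# Hypothetical weekly SARI case counts, NOT real data.
baseline_weeks = [100, 110, 95, 105, 90, 100]   # earlier, "normal" period
recent_weeks = [102, 108, 140, 180]             # weeks being monitored

print("Weeks flagged:", flag_departures(baseline_weeks, recent_weeks))
```

Without a historical series to serve as the baseline, no such threshold can be computed, which is the gap Mukherjee points to.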

Curiously, India does have a government agency whose sole mandate is to identify and publish early signs of disease outbreaks at the district level. That agency, the Integrated Disease Surveillance Programme, published its last weekly bulletin on 22 March. “The whole thing was set up in the aftermath of the SARS outbreak to handle situations like this. But they stopped right at the beginning of the pandemic,” said Thejesh G.N., a technologist and co-founder of the DataMeet community. “They’ve been publishing the bulletins for a decade, every single week. It’s very surprising,” he added.

Since no private person can replicate an official disease surveillance network, he has instead been trying to fill another void. Since early April, Thejesh and a few other researchers have been tracking non-virus deaths in India that can be directly attributed to lockdowns.

At least 300 people have died, he said. “This is most likely an underestimate. There were also similar hardships during demonetization, but nobody made a record of it. We didn’t want it to go unnoticed this time… that some people lost their lives to keep the rest of us safe.”

The recovery

The memory of demonetization is what Amit Basole, a labour economist at Azim Premji University, also invokes to explain the need for more phone surveys to get a better grip on the economic fallouts of the pandemic.

“In India, the formal sector has always been a proxy for the informal sector in the GDP figures. That’s why we didn’t quite see the effect of demonetization in the GDP numbers. The same thing is going to happen now. The economy may contract by 5%. But, in the informal sector, the contraction might be in the range of 30%. We may not know, because we are not measuring it,” Basole said.

The phone-based survey Basole helms is longitudinal, meaning the same set of people will get repeat calls at regular intervals throughout the year so that researchers can track their evolving economic condition. “The idea is to look at the recovery process. It is a big unknown as to how an economy bounces back from such a crisis. These kinds of surveys are what are going to tell us.”

And there are already early indications of sweeping changes underway in India’s economic and social life. TISS’ Sapkal, who heads a large tracker survey with 11,000 participants, said there is evidence to indicate women who had just recently started to venture into male-dominated jobs may never get back their jobs. “Machine operators in textile units is a good example. This job has actually disappeared from the market for women.”

Globally, the disproportionate impact on women workers who dominate the hospitality and services sector is well documented. The unique emerging trend in India is this: only one in four Indian women over the age of 15 work outside the home and they had just started entering certain types of jobs in large numbers. In a weak job market, men are latching on to any available opportunity.

“The barriers are relatively lower in the informal sector. But since both members in the household have lost jobs, women are being pushed back into the household boundary,” Sapkal said. “After the pandemic, women may withdraw from the workforce in large numbers. This will have long-term consequences,” he added.

In the end, each country would have to find a viable path forward from 2020 on its own; the first few months of the pandemic have already shown the vast difference in capabilities and circumstances between countries, said Hannah Ritchie, head of research at “Our World in Data”, an online science publication associated with the University of Oxford.

Our World in Data’s extensive cross-country pandemic tracker has sent the site’s web traffic soaring, from around 2 million to 15 million daily visits, a large chunk of it from India. The most immediate priority would be to track the pandemic’s effect on climate change and global hunger, Ritchie said, adding that there might be very little good news.

“The pandemic has the capacity to set us back by many years,” she said. Good data will be crucial to understanding where we stand and where we can go from here, she added.
