The republic of statistical scramble
Straightening out data inconsistencies should be a government priority
Many a caustic word has been exchanged in the acrimonious debate over the Indian economy’s employment data. One set of numbers portrays the current phase of economic growth as jobless. Alternative data sets have accompanied vigorous assertions of rising employment. And then there are many in the middle, trying to make sense of the scant (and outdated) data and wondering how anybody reached any conclusion at all.
Welcome to the republic of statistical scramble in the age of Big Data. The Bharatiya Janata Party’s (BJP’s) 2014 election victory was predicated partly on the promise of enhanced economic well-being; straightening out data inconsistencies should be a priority on the path to fulfilling that promise.
Take a look at labour data. Currently, employment data is collated from different surveys, each one measuring different things using varied methodologies. NITI Aayog’s task force on improving employment data recently released the first draft of its report, which lists how several arms of the government get involved in collecting and mashing up data. The report is unequivocal about the current state of data collection: “The available estimates are either out-dated or based on surveys with design flaws that render them unsuitable for inferring nationwide employment level.”
On the jobs demand side, the National Sample Survey Organization (NSSO), in the ministry of statistics and programme implementation (Mospi), conducts a comprehensive household survey once every five years; the last one took place in 2011-12. The labour bureau in the ministry of labour and employment also conducts two household surveys: a quarterly quick employment survey and an annual one. These are in addition to the decadal population census surveys, which measure two variables: a headcount of all types of workers at 10-year intervals and all non-agricultural enterprises, regardless of size.
On the jobs supply side, Mospi conducts a statutory annual industries survey for units registered under the Factories Act, 1948. NSSO also conducts an unorganized units survey; this is in addition to the micro, small and medium enterprises (MSME) census conducted by the MSME ministry. Finally, various government administrative bodies, such as the Employees’ Provident Fund Organization (EPFO) or Employees’ State Insurance Corporation (ESIC), provide some indication of organized sector employment trends (though this is being increasingly undermined by a growing preference for contract labour). There are also some private sector surveys, for example by the Centre for Monitoring Indian Economy.
All these measures suffer from some infirmity, whether a flawed methodology, an unviable sample size, an inability to distinguish between different types of employment, long gaps or irregular frequencies. But one thing is common: the findings only provide a partial picture and are therefore useless as a tool for policy design. Part two of the Economic Survey says: “The lack of reliable estimates on employment in recent years has impeded its measurement and thereby the Government faces challenges in adopting appropriate policy interventions.”
The NSSO has, in the meantime, begun a fresh, ambitious annual exercise to capture all kinds of employment data; a quarterly survey will generate similar estimates for urban areas. In its report, NITI Aayog has recommended, among other things, vast improvements to existing surveys, institutional and legislative changes, an overhaul of physical and digital infrastructure and more aggressive use of technology to shrink the time lag.
But the study might need to extend beyond employment data because statistical distortions also exist in other areas. NITI Aayog provides an example of the statistical confusion: each enterprise, while filing returns or statutory information, is assigned a different identification number under the Goods and Services Tax Network, EPFO, ESIC, the Factories Act and the Shops and Establishment Act.
This problem is not restricted to enterprise data and exists in other government departments as well. Take the example of estimating the cotton crop. Two separate ministries release two separate estimates every year.
The agriculture ministry’s cotton crop estimate for 2015-16 was 30.15 million bales of 170kg each, while the textile ministry’s estimate for the same year was 33.8 million bales—that’s a difference of 620 million kg! In the previous year, 2014-15, the estimates put out by the two ministries were 34.8 million bales and 38 million bales, respectively. This divergence seems bewildering, especially when acreage estimates from both the ministries broadly tally.
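For readers who want to check the arithmetic, here is a rough sketch that reproduces the gap from the figures quoted above. The function name and the percentage comparison are illustrative additions, not something either ministry publishes.

```python
# Bale weight and estimates are the figures quoted in the article.
BALE_KG = 170

# (agriculture ministry, textile ministry) estimates, in million bales
estimates = {
    "2014-15": (34.8, 38.0),
    "2015-16": (30.15, 33.8),
}


def gap_million_kg(agri_bales, textile_bales, bale_kg=BALE_KG):
    """Return the gap between the two estimates, in million kg.

    Hypothetical helper for illustration: million bales times kg per bale
    gives million kg.
    """
    return (textile_bales - agri_bales) * bale_kg


for year, (agri, textile) in estimates.items():
    gap = gap_million_kg(agri, textile)
    # Gap relative to the agriculture ministry's own estimate.
    share = (textile - agri) / agri * 100
    print(f"{year}: gap of about {gap:.1f} million kg "
          f"({share:.1f}% of the agriculture ministry figure)")
```

On these numbers the two ministries differ by roughly a tenth of the crop in both years, so the 2015-16 mismatch is not a one-off.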
Forget discrepancies between ministries: this paper has reported (goo.gl/vsYGzc) how cotton yield figures differ widely within the agriculture ministry. There have also been reports (goo.gl/fd9adW) about vastly varying data on the number of taxpayers added since demonetization emerging from different parts of the government. Mismatches between data sets from within the government also breed scepticism regarding the statistical robustness of national accounting, especially when anecdotal evidence seems to run counter to buoyant gross domestic product data.
India’s magnificent statistical heritage distinguishes the nation from its neighbours, whose growth records are often viewed with scepticism globally. But the statistical infrastructure needs an urgent overhaul to maintain credibility, perceive economic trends and deliver appropriate policy prescriptions.
Rajrishi Singhal is a consultant and former editor of a leading business newspaper. His Twitter handle is @rajrishisinghal.