New Delhi: This year’s Economic Survey does not carry the usual statistical tables on the economy’s performance, which are used widely by analysts.

This is probably because the budget has been advanced, and those tables will find a place in the forthcoming volume slated for later this year. The survey seems to have compensated for this with the use of Big Data and intensive data-mining of multiple datasets “to shed new light on the flow of goods and people within India”.

The survey has used individual tax filings administered by the Goods and Services Tax Network to estimate state-level trade, both inter-state and intra-state. Station-wise data on unreserved passenger traffic, provided by the Indian Railways, has been used to arrive at estimates of work-related migration.

Satellite imagery has been used to calculate built-up area and estimate potential property tax collections (and hence losses being incurred currently).

Besides machine-generated large-scale data sets, even existing databases have been used more intensively. For instance, district-level estimates of the National Sample Survey Office (NSSO) statistics, which are the main source of employment and poverty/inequality statistics in India, have been used to generate insights on spatial concentration of poverty and welfare beneficiaries. Data from the Socio-Economic Caste Census (SECC) have also been put to similar use.

Ever since the start of the planning process, the official statistical machinery has largely focused on surveys almost to the point of neglecting administrative data (viz., data collected during routine administrative tasks). It is fitting that the end of the planning era should mark the beginning of a new chapter in which administrative data will be given pride of place in economic policy-making once again.

The new approach of using diverse datasets is an important first step towards better decision-making. Take the migration estimates based on railway traffic, for example. Census-based statistics on migration are still from the 2001 census, as detailed statistics from the 2011 census have not yet been released. The railway traffic-based migration data, by contrast, is available up to 2015-16. Such data, if made transparently available on a regular basis, can give useful insights into employment and distress-related scenarios for migrant workers, which currently rely on guesstimates based on figures such as the demand for jobs under the MGNREGA.

The survey has used satellite imagery for built-up areas to estimate potential property tax collections. Marrying this data with something like income tax data for India’s top 50 cities and house-size census data can generate rich insights about our cities and their riches.

Still, it needs to be kept in mind that Big Data alone cannot be a silver bullet for India’s statistical challenges. For example, the informal sector continues to be a black hole when it comes to data. For several sectors and purposes, there are still no alternatives to better-designed and more intensive surveys.

The use of Big Data can, however, complement reforms in India’s traditional statistical machinery to help generate better data and frame more informed policies. If the new databases are cleaned and opened up (in a machine-readable format) for independent researchers to track, verify, and analyse, it could usher in a new era of transparency and accountability.

For a start, the finance ministry should consider opening up the underlying data used in the survey in a machine-readable format.
