Building the data economy in India

An information economy requires an attitudinal shift in society about the critical role of quality public data

“In God we trust; all others must bring data.” This quote is widely attributed to the American statistician W. Edwards Deming, who helped to develop sampling techniques in the 1940s that are still used by the US Census Bureau. Since Deming’s time, the use of data to support business and policy decisions has exploded around the world.

A good example of this trend is the Billion Prices Project at Massachusetts Institute of Technology (MIT). This initiative scours the Web on a real-time basis for data from online retailers, and compiles and cleans these data to extract daily measures of inflation as well as cross-country real exchange rates. These numbers are a clever “new economy” aid to policy and decision-making, allowing “nowcasting” of critical information inputs at higher frequencies than have previously been possible. For example, these inflation data closely track and can predict end-of-period official numbers, leading to the ability to predict, say, the likely deflationary impact of the oil shock, and providing policymakers with the capability to respond swiftly.

How far is India on the journey towards an information-driven economy? It seems clear that debate, policy formulation, and business activity aren’t always as firmly grounded in the numbers as they ought to be. This is despite the fact that statistics about almost every aspect of public administration are likely to be critical inputs in what is perhaps the most ethnically diverse, geographically diverse, and politically complex fast-growing economy in the world.

A defensive response that one often encounters when raising this criticism is that India publishes a vast array of statistics if one is only willing to look. But can we indeed trust the data? Consider the premier portal for public data about India. This is an admirable attempt by the National Informatics Centre (NIC) to curate and disseminate data from a wide range of government departments. NIC provides these data on the portal in usable form, and the portal is as slickly produced as that of any major economy. Nevertheless, however slick the interface, the quality and reliability of the underlying information depend on the inputs of the individual ministries and government departments. This is where there are serious issues.

A cursory investigation reveals poor-quality and outdated resources in a number of cases. To take a simple example of the lack of availability of important data, it’s difficult to locate a long historical record of human development indicators for all states and Union territories in a single accessible form (figures for 2011 can be found relatively easily for 15 “key states”, but more recent updates seem hard to find). A simple search on the portal using the keyword “GDP” doesn’t immediately turn up any results, while the equivalent US government data portal yields over 150 hits. These are just a few examples of a wider problem with the supply of reliable public information. The lack of data standardization and collation complicates international and even inter-temporal comparisons, meaning that economic progress is difficult to measure. It is clear that measurement challenges in India are greater than those experienced in many economies. While this explains some of what we see, it also means that the benefits of good data are even greater. Moreover, these measurement challenges shouldn’t affect the production of standardized reports for the statistics we already have.

The timing of data releases is another important issue. Data should be updated using a standard reporting timetable that is religiously adhered to, but this rarely happens. Interruptions of service need to be taken seriously: a modern market economy relies on information as a critical input to business and economic decision-making, just as it relies on utilities such as power or water.

These problems of Indian data should not be attributed solely to a failure of government. The production of information, like that of any other commodity, obeys the laws of supply and demand, and we need to look harder at whether we use public information as effectively as we should.

One group of frequent consumers of data and statistics is researchers at academic institutions and think tanks who are incentivized to produce high-quality publications based on rigorous empirical work.

Researchers based at foreign institutions, however closely linked they may be to India, will never provide sufficient demand on their own to change the dynamics of Indian public information production—only a large enough volume of high-quality domestic institutions can make this happen.

Are such incentives working sufficiently well in a large enough set of such Indian institutions? Are the people with the technical capability to process large volumes of data also those asking the right questions of the data? Are the people who are asking the right questions able to acquire or rent the technical capability to use data effectively? The answers to these questions also provide insights into the observed state of Indian public information resources.

To build the new information economy we so clearly need, we must demand a better supply of public information. This won’t just require fixing incentives at information-consuming institutions; it will also require a broader attitudinal shift about the critical role of high-quality public information in society.

Tarun Ramadorai is professor of financial economics at the Saïd Business School, University of Oxford, and a member of the Oxford-Man Institute of Quantitative Finance.

