On 22 February, India’s ministry of electronics and information technology (Meity) released its draft India Data Accessibility and Use Policy 2022 for public consultation. This is a continuation of earlier efforts to encourage better utilization of large-scale data collected by the government machinery. A need was felt to take advantage of data generated through routine administrative processes for the better delivery of public services. Such policies exist in many countries and an efficient use of such data will go a long way in improving services. Despite the demands of academia and other stakeholders, large volumes of such data have remained unutilized.

The draft policy is a step forward in realizing the potential of this large volume of data. Sharing it across various ministries and between central and state governments is a first step that can be taken easily. However, any data accessibility-and-use policy is incomplete without adequate public safeguards provided through a comprehensive data protection framework. Unfortunately, the progress on that front has been slow. The urgency of such a framework is all the more acute because the proposed policy suggests licensing of public-sector data on citizens to private entities. Other than issues of privacy and transparency, there are also issues of conflict of interest and misuse of such data for commercial or political purposes. At a time when data is “the new oil”, monetization of valuable public sector data without adequate safeguards can be counter-productive, with implications for governance of public services and the privacy of individuals.

While the policy proposes greater openness and transparency in sharing public-sector data, this can contribute to policymaking only if data integrity is maintained and the data can be independently verified. As public data is a by-product of government administration, its quality is only as good as that of the administration. To maintain the integrity of this data, it is essential to open databases for public scrutiny and academic analysis. Social audits could serve a purpose here. Provisions for this are built into programmes such as the one run under the Mahatma Gandhi National Rural Employment Guarantee Act. Its social audit has not only raised the quality of data available on this job programme’s functioning, but also helped improve the scheme itself. However, such a process has not been successful in many states, given the two-way relationship between administrative functioning and programme outcomes. Administrative control over data has also been used to thwart attempts by users and citizens to obtain data for public use. A good example of this is the Right to Information (RTI) Act, which has been diluted to a large extent over the past decade. Citizens’ attempts to obtain public data have even led to many RTI activists losing their lives.

Moreover, such data can at best be complementary to a systematic evaluation of administrative functioning and the efficacy of public services through independent surveys and research. Unfortunately, public data has often been used to discredit independent credible surveys, rather than complement them. Data from the Employees’ Provident Fund Organisation (EPFO) and the E-Shram portal have been used to argue that jobs are being generated, as against separate evidence from the Periodic Labour Force Surveys of the National Statistical Office (NSO). Even though EPFO and E-Shram numbers indicate only job registration in government records (and thus formalization), these have been used to suit a political narrative. Similar attempts were made with other data collected by the NSO on indicators such as open defecation, access to potable water, and so on. Recent years have seen an unprecedented assault on the credibility of NSO survey data, with its consumption survey of 2017-18 formally rejected. Even a basic exercise such as our decennial population census has become political, with unnecessary attempts to link it with a National Population Register.

An essential part of our data policy should be to protect public data from the very institutions that generate it, which include the administrative machinery as well as the political leadership. Our statistical system needs strengthening. An independent mechanism of evaluation and verification of public data is necessary for it to prove meaningfully useful, all the more so in cases where such data is closely linked to people’s access to essential public services. The policy will have little relevance unless safeguards are built in to protect privacy and the data is reliable enough for the purpose of holding the government accountable.

Himanshu is associate professor at Jawaharlal Nehru University and visiting fellow at the Centre de Sciences Humaines, New Delhi

