Ensuring wide data availability may help India develop its AI industry and avoid external dependence
Geopolitical and economic power in the industrial age was determined by one’s expertise in manufacturing. In a digital society, it’s likely to be based on command over artificial intelligence (AI). As an information age takes over, hastened by the covid crisis, every country has to assess where it stands in terms of AI. This could decide its position in the global hierarchy.
AI is not a separate sector. Rather, it’s expected to manage and lead every sector. There is mobility AI, education AI, health AI, agriculture AI, and so on. If every sector was earlier controlled by whoever had prowess over mechanical and chemical technology in that sector, it will now be led by whoever has the best AI for that sector.
AI competence in every sector tends to get concentrated in one or two top companies. AI capabilities also work across sectors. Google’s parent company Alphabet, for instance, is into automobiles, media, health, education, travel and perhaps more. With an immense emerging global concentration of AI power, the field has basically become a race between the US and China. All other countries are fast being left behind. The International Monetary Fund’s chief recently expressed fears of a “digital Berlin Wall”, with countries forced to pick a side.
To avoid becoming AI-dependent, countries and regions like the UK, France, the European Union (EU) and India are shaping their AI strategies towards establishing a strong domestic AI industry. AI has two elements: one is the technology itself, and the other is the social element of intensive, granular information about potential subjects of AI, or what we call data.
While also providing directions for technical development, AI strategies of most countries focus on widespread availability of data for the development of their domestic digital industries. The US occupies the top AI position, partly because its first-mover digital platforms have gained from network effects, turning many of them into global data monopolies. Only China has been able to match the US, because its internet firewall—first set up for political reasons—let local data companies emerge, develop quickly, and become globally competitive.
Kai-Fu Lee, AI scientist and businessman, wrote in The New York Times that all other countries will be “forced to negotiate with whichever country supplies most of their AI software, China or the United States, to essentially become that country’s economic dependent, taking in welfare subsidies in exchange for letting the ‘parent’ nation’s AI companies continue to profit from the dependent country’s users”. Observers such as Elon Musk and the late Stephen Hawking have warned of the unprecedented dangers of a concentration of AI power.
What are the options for countries like India to retain AI independence and self-sufficiency? AI strategies cannot just hope for greater data sharing. There is no reason for global AI monopolies to voluntarily share their data hoards for the facilitation of domestic start-ups that would compete with them. Some form of mandatory sharing of data is thus being mulled in places like the EU, UK, France and Germany.
It was against this backdrop that a committee set up by India’s government on governing non-personal data, led by Infosys co-founder Kris Gopalakrishnan, recently put out its draft report for public consultation. At the report’s core is an effort to ensure that non-personal data is actually shared and made available widely to enable the development of a strong domestic AI industry. The report characterizes data collected from a community or society as “community data”, and asks for it to be shared for the community or society’s benefit. Such data should be available to the local AI industry.
Data being a highly valued resource, enforcing its sharing would require a legal basis. The panel has gone to considerable lengths to develop a conceptual framework for this. Since such community data is about, and arises from, the community, the community is considered to “own” it. Anyone collecting such data can do so only on the implicit condition that it will be made available to start-ups, if sought. This would be legally enforceable, thanks to “community ownership” of data.
Note that the report does not call for the sharing of data that is private to a business; rather, a business need share only data that it collects from sources it does not own and that is about others in the community. Further, all businesses, while having to share data, also get access to the data gathered by other businesses, which could be a considerable net gain. Digital businesses must shift from data hoarding as a key competitive advantage to devising innovative uses of widely shared data for the benefit of consumers. All players could gain from such a shift, as would the Indian economy, and it could help India avoid abject dependence on the two global AI superpowers for its AI needs.
In a digital age, being self-sufficient in terms of AI is central to any conception of an Atmanirbhar Bharat. This requires India’s data to be made widely available for use by the Indian AI industry. For this, the Gopalakrishnan panel seeks legislation and a new regulator. India has considerable technical capabilities in AI; data availability would enable a robust AI industry to emerge. Once the process is set in motion, positive feedback loops will keep improving AI technology as well as data availability. This is a reliable route to an India that is self-sufficient in AI.
Parminder Jeet Singh is a member of the Kris Gopalakrishnan committee on non-personal data governance and works with IT for Change, an NGO. These are the author’s personal views.