Photo: Hemant Mishra/Mint

English, the Web and digital caste

By most estimates, only around 200 million Indians speak English, but almost all mainstream digital technologies are English-centric

In the middle of the Bihar elections, and numerous discussions around Digital India, I have to wonder if India is on the road to creating yet another caste—that of digital high priests who wield technology with ease only because they know a certain language—English.

By most estimates, only around 200 million Indians speak English (the Census of India website is maddeningly opaque on this topic, so estimates will have to do), but almost all mainstream digital technologies are English-centric. Where will that leave the other 800 million-plus Indians in the digital age? The answer to this question will determine whether language will be the creator of a new caste in a rapidly digitizing India.

Hindi is the fourth most widely spoken language in the world, with around 400 million speakers, but it barely finds a presence on the Internet. The English Wikipedia has almost 3.7 million entries, while Hindi ranks at number 37 with 536,706 articles as on 20 October. Even a language like Basque, with an estimated 700,000 speakers, ranks higher than Hindi at number 36, with 573,338 Wikipedia entries.

For a country that claims to be an IT superpower, we Indians have been remarkably blind to the IT needs of our own people. Perhaps the blame can be laid at the feet of an anglicized elite that is cut off from the realities of Bharat. However, things might be changing.

During the PC era, the high cost of computing devices, and the poor PC penetration in India, meant that demand for Indian language software was limited. The advent of smartphones has put a fairly capable computing device in the hands of millions.

According to a Cisco report, the number of smartphones grew 54% during 2014, reaching 140 million. It is projected to grow to more than four-and-a-half times that number between 2014 and 2019, reaching 651 million. During Prime Minister Narendra Modi’s recent visit to Silicon Valley, Google Inc. chief executive Sundar Pichai said the company will make it possible to use Android to type in 11 Indian languages.

Companies focused on Indic computing have lately been successful in raising capital. Bengaluru-based Reverie Technologies received $4 million in venture capital funding from Qualcomm Ventures, while Gurgaon-based Process9 received an undisclosed amount from the Indian Angel Network. Micromax has made its Unite 3 phones available in 10 regional languages, while MakeMyTrip and other e-commerce sites are now making their apps available in Indian languages.

While these signs are encouraging, we cannot lose sight of the fact that a massive amount of work needs to be done if we are to make using IT in Indian languages as easy as using it in English. The basic resources for Indic computing are not yet widely available: dictionaries, spellcheckers, fonts, parallel corpora (large, structured sets of texts) that enable efficient translation systems, and optical character recognition (OCR) software that enables the digitization of texts in Indian languages.

The Indian government has spent crores of rupees on creating some of these resources through the Technology Development in Indian Languages (TDIL) programme. Some of these resources should be made available to India’s nascent Indic computing industry through open source licenses so that entrepreneurs can build on them and make Indian language computing technologies widely available.

While the government has the resources to fund the development of the basic infrastructural resources, it lacks the customer orientation required to bridge the “last mile” and take these solutions to the market. On the other hand, the private sector in this industry is fragmented and lacks the resources to invest in building fundamental technologies. Therefore, to make Indic computing a reality, a true public-private partnership is required.

In many ways, it has been a surprise that the tech-savvy Bharatiya Janata Party-led government has not taken up this issue with greater vigour. After all, what use are government services delivered digitally, if the government does not speak the citizens’ languages? In fact, Indic computing should be at the very core of this government’s Digital India initiative.

While the challenges in making Indic computing a reality are huge, so are the opportunities. Many Indian languages are spoken by populations as large as those of some countries in Europe, but are completely underserved digitally. If these challenges are addressed successfully, it will bring millions of Indians into the digital mainstream, and that is a goal worth striving for.

The author is director of Alchemy Business Solutions LLP, a company that works in the area of technology for development, particularly open source software and Indic computing.
