There are close to 7,100 living languages in the world, but as many as 90% of them are spoken by fewer than 100,000 people, according to Ethnologue, an annual publication on world languages. In India, where there are more than 780 languages, over 220 languages have died in the past 50 years and 197 others are categorized as endangered by Unesco. Despite India’s cultural, geographical and linguistic diversity, only 22 languages have official status in the country.

So what happens to the rest in the next decade? Will they, too, simply vanish from the records, as almost 1,500 languages did in the 1971 Census? (The 1961 Census listed 1,652 mother tongues; the next one listed only 108.)

Even as India’s internet user base and literacy rate grow, 30% of Indians are illiterate and many more can’t read or write in English, which accounts for more than 55.7% of the content available online. Interestingly though, as of 2016, India had 409 million internet users but only 175 million of them used the internet in English.

According to a recently released Google-KPMG report, the Indian-language user base grew at a compound annual growth rate of 41% between 2011 and 2016 to reach 234 million users at the end of 2016, surpassing the number of English users. With mobile penetration increasing fast in the country, Indian-language internet users are expected to account for nearly 75% of India’s internet user base by 2021. Further, the high penetration of mobile phones has given even those who cannot read or write the opportunity to connect with their loved ones, because speaking over the phone, unlike writing letters, does not require one to know a script.

Ethnologue’s 21st edition indicates that as many as 3,188 languages around the world are “likely unwritten”. It means that these languages could have no script at all or “alphabets may exist but there may not be very many people who are literate and actually using the alphabet”.

J.C. Sharma, in his paper Language and Script in India: Some Challenges, notes, “There are many unwritten languages spread over various regions in the country. No state is without the unwritten languages, and no state is without the minority people groups whose languages are yet to be systematically studied and writing systems provided.” This has special relevance given India’s vast linguistic and cultural diversity. While Sharma goes on to propose a suitable script system for unwritten languages, I think digital tools and ICT give us the power to document a language even in the absence of a script.

In a country with a predominantly oral culture like India, which is home to thousands of tribal communities and hundreds of languages, how do you preserve languages? How do you document languages that have no script and have only been sung in folk songs or narrated in folk stories? How do you document the culture, traditions and history that are embedded in everyday life?

In the past, before digital tools and the internet, text was the only medium for preserving history, culture, art and even languages. Then came digital tools, and history, culture, art and even (scripts of) languages could be photographed and preserved. And now we have audio-visual formats, which eliminate the need for a script to document or preserve a language.

Linguists from National Geographic’s Enduring Voices project have already produced eight talking dictionaries to document struggling languages. Besides containing 32,000 word entries in eight endangered languages, the dictionaries hold photographs of cultural objects and more than 24,000 audio recordings of native speakers (many of whom are among the last fluent individuals in their native tongues) pronouncing words and sentences. The first project under this initiative, in 2010, began the documentation of Koro, a Sino-Tibetan language spoken by fewer than a thousand people in Arunachal Pradesh.

This wonderful initiative is not the only one; there are cross-language open-source tools for recording pronunciations, and several linguists around the world are working on similar efforts. However, these efforts by linguists and researchers are not enough. There are far too many unwritten languages and far too few efforts to document them all.

There are thousands of endangered languages today. Hundreds of them are spoken by fewer than 1,000 people, and many have fewer than 100 native speakers. Ter Sami, a moribund Sami language spoken in Russia, has only two native speakers left today. Two! This language will be lost forever if it is not documented soon.

Unesco asserts that if nothing is done, half of the 7,000-plus languages spoken today will disappear by the end of this century. It is thus imperative that more efforts be made to use audio-visual media to document languages, since this medium allows researchers and language enthusiasts to understand a language even if they don’t know the script. This difference in the medium of documentation (textual vs audio-visual) is what can truly preserve a language, even after its native speakers and writers are long gone.

Osama Manzar is founder-director of Digital Empowerment Foundation and chair of the Manthan and mBillionth awards. He is a member of the advisory board at the Alliance for Affordable Internet and has co-authored NetCh@kra–15 Years of Internet in India and Internet Economy of India. He tweets @osamamanzar