Wanderings in the world of Lingua Indica

That India is the home of many, many languages is something everybody knows. But how vast is this linguistic landscape? That’s really a matter of conjecture. And in attempting to deal with this vastness, certain myths have come to dominate public perception.

Among the most persistent myths about languages in India is that Sanskrit is the ancestor of all Indian languages. This is as stubborn a myth as the other myth about Hindi being India’s national language. (It isn’t. The constitutional status of Hindi is that of an “official” language, along with English.)

The truth of the matter is that Sanskrit can rightly claim parentage over only one family of languages spoken in India—the Indo-Aryan languages spoken across a large swathe of India.

The Dravidian family of languages spoken in south India and in little pockets in northern and eastern India has borrowed Sanskrit vocabulary, but its grammar shows limited Sanskritic influence. In relation to the Dravidian languages, Sanskrit’s relationship could perhaps be termed as that of a step-mother or, more rightly, as a close neighbour.

But—besides these two families of languages, which command the brute majority of speakers—there are three more families of languages spoken in the country.

One of these is the Austro-Asiatic family of languages, which Gopal Haldar, in his book The Languages of India, also terms the “Nishada” languages. It might justifiably be called India’s “original” language group since, by some accounts, it predates both the Dravidian and Indo-Aryan language groups. Related to Khmer and Vietnamese, this family of languages is markedly different from the other language families.

Khasi (spoken in Meghalaya), Munda (spoken in Jharkhand) and Santhali (spoken in West Bengal, Bihar, Assam, Odisha and Jharkhand, besides in neighbouring Nepal and Bangladesh) are among the most widely spoken Austro-Asiatic languages in India.

The Santhali language is India’s largest tribal language and is spoken by close to 6 million people. Until 1925, this language did not have a written form. Pandit Raghunath Murmu (1905-82), known as Guru Gomke among the Santhals and an extraordinary personality by all accounts, then created a script for the language. This script, known as Ol Chiki, is markedly different from the Devanagari school of scripts and is widely used today.

Further east from Santhal country, in Manipur, the old Meetei Mayek script is undergoing a revival. The script had been proscribed in the early 18th century with the advent of Vaishnavism in Manipur, and the Puyas, written documents in Meetei Mayek on various matters, were burned.

The Bengali script was then used to write Meeteilon (Manipuri). Presently, the Meetei Mayek has been reintroduced in schools, infusing fresh life into the script.

In Arunachal Pradesh, a unique experiment of sorts is underway. A script called the Tani Lipi has been created as a single script for the various tribal languages (26 at last count). The creation of this script (by Tony Koyu) is essentially an attempt to record indigenous tribal knowledge. Not since the Roman script has one script been shared by so many different languages.

The languages of Arunachal Pradesh and Meeteilon are part of the Sino-Tibetan (sometimes called Tibeto-Burman) family of languages, which is the fourth family of Indian languages.

While Burmese is the most widely spoken language in this family, the most widely spoken languages in the Indian subcontinent are Meeteilon (in India) and Newari (in Nepal). The Naga languages, too, belong to this group.

Newari, the classical language of Nepal, has been displaced today by Nepali. Under Rana rule, Newari was formally suppressed and its writers and users imprisoned. Since the 1950s, though, Newari has made a gentle comeback, although the number of its speakers has fallen as many people opt to speak the dominant Nepali.

Another important Tibeto-Burman language is Kokborok, the official language of the state government of Tripura. First written in the extinct Koloma script and later in the Bengali script, Kokborok has now opted for the Roman script.

A fifth family of languages has been identified and classified only recently—the Andamanese.

Great Andamanese and Ongan (spoken by the Onge tribe) are confirmed members of this family. The Sentinelese language is believed to be a member too. But since the Sentinelese are an uncontacted tribe, this is hard to confirm.

Some scholars even speak of a sixth language family existing in India. This is the Tai-Kadai (sometimes Kra-Dai) language family, which includes, among others, Thai and Lao, the languages of Thailand and Laos. In India, a couple of languages spoken in Arunachal Pradesh and Assam are believed to belong to this family.

The big banyan of India’s languages reveals many more treasures on closer inspection. One such treasure is a language called Lingua da Casa or Daman and Diu Indo-Portuguese (two tongues in reality, but spoken of as one for reasons of simplicity).

Daman Indo-Portuguese appears to be a creole of Marathi and Portuguese, whereas Diu Indo-Portuguese appears to be a creole of Gujarati and Portuguese. Widely spoken in the past, the language was first documented in the 19th century by German linguist Hugo Schuchardt. With barely a few hundred speakers today, it is a language biding its time before extinction.

Of similar vintage is the language of Korlai Indo-Portuguese, spoken by about 1,000 Luso-Indian Christians in and around the village of Korlai in Maharashtra’s Raigad district. The language is also known as Kristi (Christian), Korlai Creole Portuguese, Korlai Portuguese and Nou Ling.

(Luso-Indian was once a term used by people of mixed Portuguese and Indian ancestry. The term has largely been replaced by Anglo-Indian, though strictly speaking that term ought to apply only to Indians of mixed English and Indian ancestry.)

Two more distinct languages also bear mention here. One is Byari Bhashe (sometimes Beary), a language close to Malayalam and Tulu, and influenced by Arabic. The language also has words related to Tamil.

The language is spoken by a Muslim community with roots in the Dakshina Kannada district of Karnataka, and its speakers can be found scattered all along the Malabar and Konkan coasts.

The word beary is said to be derived from the Tulu word byara, which means trade or business. Another popular theory is that beary comes from the Arabic word bahar. Bahar means “ocean” and bahri in Arabic means “sailor” or “navigator”. A third theory says beary is derived from the word “Malabar”.

In Sugata Srinivasaraju’s delightful book Pickles from Home: The Worlds of a Bilingual, one essay speaks of a “language with no name”. Spoken by Brahmins from a couple of villages in Kolar district in Karnataka, this language borrows in equal measure from Tamil, Kannada and Telugu. It sounds somewhat like all three, and yet sometimes like none of them.

Christened Engalode Vathe (our speech) by its speakers, it is a language that owes its origin to geography—Kolar is located at the intersection of Karnataka, Tamil Nadu and Andhra Pradesh.

As is evident, the banyan of Lingua Indica is a capacious one. Accommodating a variety of languages and scripts, it is a veritable treasure trove of diversity. To walk in it is to be in a world of untold richness.

Karthik Venkatesh is an editor with a publishing firm and a freelance writer. Views are personal.
