Is Sanskrit the best language to program computers and AI?

Sanskrit largely follows Panini’s grammar. His Ashtadhyayi of 500 BCE has 3,976 rules governing spoken language. Photo: Mint

Summary

  • As the world digitizes around us, learning how to code has become a passport to success.

In the past month, I came across three seemingly unconnected experiences that set me thinking about how humans and computers interact. The first was a conversation with a couple of stealth startups trying to figure out how to democratize coding by programming computers with natural spoken language. The second was a series of startling announcements by Big Tech firms on how artificial intelligence (AI)-generated large language models (LLMs) could program machines and even other AI models. But it was the third, an article by Sanjana Ramachandran in Fifty Two exploring the long-standing myth that the best language to program computers and AI is Sanskrit (bit.ly/3U0kY21), that provoked my curiosity.

Ramachandran quotes a variety of sources—Indian government officials, a motley bunch of academics and Indian-American author Rajiv Malhotra, who goes on to claim that Sanskrit should be credited with the last 20 years of development in Natural Language Processing (NLP), the technology behind prominent models like GPT-3 and DALL-E 2. The claims are wide-ranging: Sanskrit is the most ‘scientific’ language, and so the “best to programme computers, or code AI/ML"; it is the “language for future super computers", etc. One common source that everyone cites, and which Ramachandran explores in detail, is “Nasa". Yes, the same Nasa that sends rockets into space. The reference actually has a published source, a 1985 paper, ‘Knowledge Representation in Sanskrit and Artificial Intelligence’, by Nasa researcher Rick Briggs (bit.ly/3qrIjMr). Briggs writes, “Understandably, there is a widespread belief that natural languages are unsuitable for the transmission of many ideas that artificial languages can render with great precision and mathematical rigor. But this dichotomy, which has served as a premise underlying much work in the areas of linguistics and artificial intelligence, is a false one. There is at least one language, Sanskrit, which for the duration of almost one thousand years was a living spoken language with a considerable literature of its own." He explains how it is Sanskrit’s uniquely structured grammar and its word- and sentence-structuring properties that appeal to how logic- and structure-driven machines ‘think’. His paper lays out knowledge representation schemes and describes how Sanskrit is best equipped to address them.


Sanskrit largely follows Panini’s grammar; his Ashtadhyayi of 500 BCE has 3,976 rules governing spoken language. Dheepa Sundaram, quoted in Fifty Two, emphasizes how every classical Sanskrit word originates from about 2,000 base verbal roots, or dhatus, each derived from “distinct linguistic units—phonemes and morphemes—such that Ashtadhyayi functions as an algorithm." Fifty Two quotes Stanford professor Paul Kiparsky describing how in Sanskrit every sentence is “seen as a little drama played out by an Agent—the doer—and a set of other actors which may include a Recipient, Goal, Instrument, Location and Source." What this means is that a sentence’s meaning “can be represented in these six basic categories, and by the relationships between them, independent of the actual words in it." This does sound very much like how an AI would ‘think’, especially Symbolic AI or GOFAI (Good Old Fashioned AI), which reigned before Deep Learning and Neural Networks muscled in. GOFAI was a rules-based, ‘top-down’ approach that depended on explicit knowledge representation systems, hence the claimed suitability of Sanskrit.
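To make that idea concrete, here is a minimal sketch, in Python, of how a GOFAI-style system might store a sentence as such a role frame. The class, field names and example sentence are illustrative assumptions, not taken from Briggs’ paper or Kiparsky’s work.

```python
# A toy karaka-style semantic frame: a sentence's meaning stored as an
# action plus the six role categories Kiparsky describes, rather than
# as a string of words. All names here are illustrative assumptions.

from dataclasses import dataclass
from typing import Optional

@dataclass
class SentenceFrame:
    action: str                       # the verb, reduced to its root
    agent: str                        # the doer
    recipient: Optional[str] = None   # who receives the action
    goal: Optional[str] = None        # what the action is directed at
    instrument: Optional[str] = None  # the means used
    location: Optional[str] = None    # where the action happens
    source: Optional[str] = None      # where the action proceeds from

# "Devadatta gives Maitreya a book in the village" and a passive
# rewording ("In the village, a book is given to Maitreya by
# Devadatta") collapse into the same frame, independent of word order.
frame = SentenceFrame(
    action="give",
    agent="Devadatta",
    recipient="Maitreya",
    goal="book",
    location="village",
)

print(frame)
```

The appeal to a symbolic reasoner is that differently worded sentences with the same meaning reduce to one structure it can match rules against, which is precisely the property the Sanskrit enthusiasts prize.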

IIT professor Pawan Goyal believes Sanskrit works as a bridge language. Any other natural spoken language can be mapped to Sanskrit, which provides an ‘annotated format and exhaustive grammar’, and this can then be used to program AI/ML. This bridge-language concept is where Microsoft and OpenAI are coming from when they think of GPT-3 as one. “If you can describe what you want to do in natural language, GPT-3 will generate a list of the most relevant formulas for you to choose from," said Microsoft CEO Satya Nadella. “The code writes itself." Microsoft has a billion-dollar investment in OpenAI, GPT-3’s creator, and owns GitHub, the largest open-source code repository in the world. IBM is doing something similar with CodeNet, a dataset of 14 million code samples across 50 programming languages, in an attempt to make natural-language coding possible.
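For a sense of what such natural-language coding looks like in practice, here is a minimal Python sketch against OpenAI’s legacy Completions API. The model name, prompt and parameters are assumptions chosen for illustration; the GitHub and Excel integrations Nadella describes are considerably more elaborate.

```python
# A minimal sketch of natural-language coding: describe what you want
# in plain English and let a large language model draft the code.
# Uses the legacy openai-python (<1.0) Completions API; the model
# name and prompt below are illustrative assumptions.

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

prompt = (
    "# Python 3\n"
    "# Write a function that returns the n-th Fibonacci number.\n"
    "def fibonacci(n):"
)

response = openai.Completion.create(
    model="text-davinci-002",  # assumed model; any code-capable model works
    prompt=prompt,
    max_tokens=150,
    temperature=0,  # deterministic output suits code generation
)

# The model's continuation is only a candidate implementation; a human
# still reviews it before running, which is why Nadella frames it as a
# list of relevant suggestions to choose from.
print(prompt + response.choices[0].text)
```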

As the world digitizes around us, learning how to code has become a passport to success, much like knowing English was earlier. I often talk about ‘coding being the new English’, but can English be the new coding, or would that honour go to Sanskrit?

Jaspreet Bindra is the founder of Tech Whisperer Ltd, a digital transformation and technology advisory practice.

