
Friday, December 02, 2022
By Leslie D'Monte

What does GPT-4 have in common with Rajnikanth?

“Rajnikanth does not need a watch; he decides what time it is”; “There is no Ctrl key in Rajnikanth’s keyboard; he is always in Control”; “Rajnikanth knows the last digit of Pi”.

Many of us in India would be familiar with these jokes about the veteran actor Rajnikanth possessing superpowers. Rajnikanth is revered by many, especially in south India, for entertaining them with films that enthral and excite them. In 2010, for instance, the Rajnikanth-starrer Enthiran gave a glimpse into the life of ‘Chitti’--an artificial intelligence (AI)-powered humanoid robot that could fight, jump from one train to the other, clean, cook, and even fall in love. In other words, Chitti was a sentient humanoid who could think and respond intelligently and, more importantly, was self-aware.


Given the adoration for Rajnikanth, it’s not surprising that many of his fans may believe this to be possible. However, despite AI researchers reiterating that AI is nowhere close to becoming sentient, many still believe that scientists have indeed developed a sentient AI but are keeping it under wraps to avoid backlash from governments, philosophers, and activists. Achieving this goal is known as AI Singularity or Artificial General Intelligence, and crossing this barrier would mean that such an AI’s intelligence would surpass the most intelligent humans on earth, making it a sort of Alpha Intelligence that can call the shots and even enslave humans.


All this is fine, but what has this got to do with GPT-4, you may rightly ask?

The short answer is that though GPT-4 has not been released, people are attributing extraordinary powers to this large language model, giving rise to some hilarious comments on social media that remind us of the jokes around Rajnikanth’s imaginary superpowers and AI becoming self-aware.

Let’s sample a few:

Israel Gonzalez-Brooks @izzyz
“I’m about to mute the word “gpt-4”. The takes on it are so wildly incorrect that it makes my stomach hurt every time I see a new thread. People are baselessly speculating. They make unfeasible predictions. There are fundamental misunderstandings of how language models work.”

Charlie “Click” Greenman @razroo_chief
“Rumor has it that GPT-4 will solve world peace.”

“when asked, GPT-4 states that it is “not depressed, just lonely””

“Have you considered that these rumours you keep hearing about GPT-4 were written by GPT-5?”

Mark Tenenholtz @marktenenholtz

“Rumor has it GPT-4 will:

  • Turn water to wine
  • Transform any metal into gold
  • Still get beaten by XGBoost on tabular data”

Amid these sarcastic and witty tweets around GPT-4, here are a few points to remember:

  1. To begin with, no one is certain when GPT-4 will be released -- some say it will be by the end of this year, while others believe the likely date to be early 2023. But the exact date remains anyone’s guess for now.
  2. Andrew Feldman, Cerebras’ CEO, told Wired that “from talking to OpenAI, GPT-4 will be about 100 trillion parameters.” If that is true, GPT-4 would be over 500 times larger than GPT-3 (Generative Pre-trained Transformer 3), which had 175 billion parameters. This also implies that GPT-4 would have about the same number of parameters (connections) as synapses (connections between neurons) in the human brain, which has an estimated 125 trillion synapses.
  3. Sam Altman’s @sama Twitter response to the 100 trillion parameter figure on 22 November was, “y’all got no chill”.
  4. There may not be enough good-quality data to train an efficient AI model with 100 trillion parameters.
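
The arithmetic behind the second point can be checked in a few lines of Python. This is a minimal sketch that uses only the figures quoted above; the variable names are illustrative, and the 100 trillion figure is a rumour, not a confirmed specification.

```python
# Figures quoted in the text; the 100 trillion number is an unconfirmed rumour.
GPT3_PARAMS = 175e9            # 175 billion parameters
GPT4_RUMOURED_PARAMS = 100e12  # 100 trillion parameters (rumoured)
BRAIN_SYNAPSES = 125e12        # estimated synapses in the human brain

# How many times larger the rumoured model would be than GPT-3.
size_ratio = GPT4_RUMOURED_PARAMS / GPT3_PARAMS

# Rumoured parameter count as a share of the brain's estimated synapses.
synapse_fraction = GPT4_RUMOURED_PARAMS / BRAIN_SYNAPSES

print(f"Rumoured GPT-4 vs GPT-3: about {size_ratio:.0f}x larger")
print(f"Share of estimated brain synapses: {synapse_fraction:.0%}")
```

The ratio comes to roughly 571, which is where "over 500 times larger" comes from, and the rumoured parameter count works out to about 80% of the estimated synapse count.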

That said, what can we expect from GPT-4?

To understand this new LLM, let’s retrace a few steps. In June, a Google engineer claimed that the company’s AI model LaMDA had become sentient. Google, for its part, said that the engineer Blake Lemoine’s claims had been reviewed by a team of Google technologists and ethicists and found to be hollow and baseless. It sent him on “paid administrative leave” for an alleged breach of confidentiality.

LaMDA, short for Language Model for Dialogue Applications, is a conversational, natural language processing (NLP) AI model that can have open-ended contextual conversations with sensible responses, unlike most chatbots. It is similar to language models like BERT (Bidirectional Encoder Representations from Transformers), which has 110 million parameters, and GPT-3. It is built on the Transformer architecture, a deep-learning neural network that Google Research invented and open-sourced in 2017. This architecture produces a model that can be trained to read many words, whether a sentence or a paragraph, and then predict what words it thinks will come next. But unlike most other language models, LaMDA was trained on a dialogue dataset of 1.56 trillion words, which helps it understand context and respond much better. Just as our vocabulary and comprehension improve as we read more books, AI models typically get better at what they do with more and more training.

Introduced in June 2018, GPT-1 was trained on the BooksCorpus dataset. It had 117 million parameters. GPT-2 is much more advanced and was trained on more than 10 times as much data as its predecessor, GPT-1. Released in 2019 with 1.5 billion parameters, GPT-2 does not require any task-specific training data (e.g. Wikipedia, news, books) to learn language tasks such as question answering, reading comprehension, summarization, and translation from raw text. The reason: data scientists can use pre-trained models and a machine learning technique called ‘Transfer Learning’ to solve problems similar to the one that was solved by the pre-trained model. For instance, the social media platform Sharechat pre-trained a GPT-2 model on a corpus constructed from Hindi Wikipedia and Hindi Common Crawl data to generate shayaris (poetry).

GPT-3 vastly enhances GPT-2’s capabilities. In a 22 July 2020 paper titled ‘Language Models are Few-Shot Learners’, the authors describe GPT-3 as an autoregressive language model with 175 billion parameters. Autoregressive models use past values to predict future ones. GPT-3 can be used to write poems, articles, books, tweets and resumes, sift through legal documents, and even translate or write code as well as, or even better than, humans. GPT-3 was released on 11 June 2020 by OpenAI -- a non-profit AI research company founded by Elon Musk (a former co-chair who has since resigned from the board) and others -- as an application programming interface (API) for developers to test and build a host of smart software products. Its predecessor, GPT-2, had 1.5 billion parameters (though a smaller version of the model was released at first to avoid potential misuse) and was trained on a dataset of 8 million web pages. Parameters help Machine Learning (a subset of AI) models make predictions on new data. Examples include the weights in a neural network (called thus, since it’s loosely modelled on the human brain).
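
To make "use past values to predict future ones" concrete, here is a toy sketch in Python. It is not how GPT-3 works internally: GPT-3 uses a large neural network over long stretches of text, whereas this hypothetical example only remembers which word followed which in a tiny made-up corpus. The function names and the corpus are assumptions for illustration.

```python
import random
from collections import defaultdict


def train_bigram(text):
    """Record, for every word, the words that followed it in the text."""
    followers = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=0):
    """Generate words one at a time, each picked from what followed the previous word."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = [start]
    for _ in range(length - 1):
        candidates = followers.get(output[-1])
        if not candidates:  # the last word never had a follower, so stop
            break
        output.append(rng.choice(candidates))
    return output


# Made-up training text, loosely inspired by Chitti's skills.
corpus = "the robot can cook and the robot can clean and the robot can jump"
model = train_bigram(corpus)
print(" ".join(generate(model, "the", 6)))
```

Each new word depends only on the words already produced, which is the autoregressive idea in miniature. In this toy, the "parameters" are simply the stored follower lists; in GPT-3 they are the 175 billion weights of the network.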

Humans typically learn a new language with the help of a few examples or simple instructions. However, they also can understand the context of the words. As an example, humans understand well that the word ‘bank’ can be used either to talk about a river or finance, depending on the context of the sentence. GPT-3 hopes to use this contextual ability and the transformer model (that reads the entire sequence of words in a single instance rather than word-by-word, thus consuming less computing power) to achieve similar results.

GPT-3 is undoubtedly an extremely well-read AI language model. A human, on average, could read about 560-700 books (assuming 8-10 books a year for 70 years) and about 125,000 articles (assuming five every day for 70 years) in his or her lifetime. That said, it’s humanly impossible for most of us to memorize this vast reading material and reproduce it on demand.
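
The lifetime reading estimate is simple multiplication, sketched below in Python under the assumptions stated in the paragraph above (8-10 books a year and five articles a day, over 70 years). The variable names are illustrative.

```python
YEARS_OF_READING = 70

# 8 to 10 books a year over a reading lifetime.
books_low = 8 * YEARS_OF_READING
books_high = 10 * YEARS_OF_READING

# Five articles every day, ignoring leap days.
articles = 5 * 365 * YEARS_OF_READING

print(f"Books in a lifetime: {books_low} to {books_high}")
print(f"Articles in a lifetime: {articles:,}")
```

This gives 560 to 700 books and 127,750 articles, the latter rounded in the text to about 125,000.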

In contrast, the GPT-3 model has already digested about 500 billion words from sources like the internet and books (499 billion tokens, or words, to be precise, from sources including Common Crawl and Wikipedia). Common Crawl is an open repository that anyone can access and analyze. It contains petabytes of data collected over eight years of web crawling. Further, GPT-3 can recall and instantly draw inferences from this data repository.

GPT-3, of course, still has issues where it may generate nonsensical or insensitive text and create unnecessary headaches for those who deploy it. It can also be misused to create content that looks like human-written content and could spread hate and racial and communal bias. The authors of the GPT-3 paper themselves have acknowledged that the AI language model can be “misused to spread misinformation, spam, phishing, abuse of legal and governmental processes, fraudulent academic essay writing and social engineering pretexting by lowering existing barriers to carrying out these activities and increase their efficacy”.

The fact, however, remains that even if we account for the hype about GPT-4’s prowess and understand its limitations, we can expect this LLM to be a fairly huge leap over its predecessor, going by past records. But instead of speculating, watch this space for more developments.


Which is currently the world’s largest language model (LLM)?

  • GPT-3
  • MUM
  • Wu Dao 2.0
  • DALL-E
  • LaMDA

(The correct answer is given below)


IIT Roorkee researchers help capture black hole symphony

A team of researchers from the Indian Institute of Technology Roorkee (IIT Roorkee), part of InPTA -- an Indo-Japanese collaboration of about 40 radio-astronomers working together with the International Pulsar Timing Array (IPTA) towards the detection of low-frequency gravitational waves -- recently announced the first data release, which stemmed from three-and-a-half years of observation using the upgraded Giant Metrewave Radio Telescope (uGMRT) operated by NCRA-TIFR near Pune. The universe is filled with a gravitational wave background holding answers to deep secrets of nature. The waves that we detect now are strong but short-lived. Researchers are listening to large waves crashing loudly upon the seashore, whereas spacetime is continually brimming with tiny ripples.

The interplay of gravitational waves in the universe can be likened to a symphony played by nature. Researchers have been eavesdropping on the crescendos, while a persistent buzz forms the basis of this cosmic melody. These waves are generated by pairs of supermassive black holes orbiting each other for millions of years on their way to collision. The primary hindrance in their detection is the vast ocean of interstellar medium lying in between. The InPTA data is critical for charting this interstellar ‘weather’ and paving the way to the discovery in the near future.

Prof. P Arumugam from the Department of Physics, IIT Roorkee, his PhD student Jaikhomba Singha, and IITR alumnus Piyush Marmat contributed to the paper, which was recently published in the Publications of the Astronomical Society of Australia. Highlighting the importance of this research, Prof. Arumugam said, “This is an important release of our collaboration and will eventually help detect gravity waves in a new window.”

IIT Mandi develops a visual-based method to assess earthquake-prone structures in the Himalayas

Researchers at the Indian Institute of Technology Mandi (IIT Mandi) have developed a localized method to assess the ability of buildings to withstand earthquakes in the Himalayas, which are among the most earthquake-prone regions in the world because of an ongoing collision between the Indian and the Eurasian plates.

While earthquakes cannot be prevented, damage can certainly be limited by designing buildings and other infrastructure that can withstand seismic events. Given that it is neither physically nor economically viable to conduct a detailed seismic vulnerability assessment of every building, scientists use Rapid Visual Screening (RVS) of buildings to decide if a building is safe to occupy or requires immediate engineering work for enhancing earthquake safety. However, existing RVS methods are based on data from different countries and are not particularly applicable to the Indian Himalayan region because of some unique characteristics of the buildings in this region. For example, the Himalayan region (as with much of India) has many non-engineered structures.

It is, therefore, essential to use a region-specific RVS guideline that considers factors like local construction practices, typology, etc., which the IIT Mandi team has done. The findings of the research have been published in the Bulletin of Earthquake Engineering. The research was led by Sandip Kumar Saha, assistant professor at the School of Civil and Environmental Engineering, IIT Mandi, and co-authored by his PhD student Yati Aggarwal.

Through extensive field surveys, the researchers collected a large amount of data on the types of buildings present in the Mandi region of the Himalayas and the typical attributes present in these buildings that are connected to their earthquake vulnerability. A numerical study was also carried out to establish guidelines for counting the number of stories in hilly buildings for their RVS. Further, based on the vulnerable characteristics present in buildings, an improved RVS method was proposed. The methodology developed for screening buildings is a simple single-page RVS form which considers the various vulnerability attributes that are unique to the buildings in the case study region. Calculations made using these observations produce a seismic vulnerability score for buildings, which differentiates vulnerable buildings from the more robust ones, and allows better decision-making for maintenance and repair. The computation process is designed such that it minimizes the possibility of human bias or subjectivity of the assessor in scoring a building.

The answer to the Quiz:

c) In June 2021, Wu Dao 2.0 from the Beijing Academy of Artificial Intelligence broke GPT-3’s record with a multimodal model that was 10 times larger, with 1.75 trillion parameters. It was trained on 4.9 terabytes of images and texts in both English and Chinese. As a multimodal AI model, Wu Dao 2.0 is not only a language model that generates text and speech but can also generate images and has self-improving learning capabilities. DALL-E is a 12-billion parameter version of GPT-3 trained to generate images from text descriptions using a dataset of text–image pairs. MUM, which stands for Multitask Unified Model, is a multitasking and multimodal language model that is 1000x more powerful than its predecessor BERT. It has been trained in 75 languages and, like Wu Dao 2.0, is multimodal, which means it can tackle both text and image information and tasks.

I hope you folks have a great weekend. And do remember, we welcome your feedback.
