AI now not only debates with humans but negotiates and cajoles too

On 18 June 2018, the world sat up and took notice as an artificial intelligence (AI) system engaged in the first-ever live, public debates with humans. At an event held at International Business Machines Corp.’s (IBM) Watson West site in San Francisco, an Israeli champion debater and IBM’s AI system, Project Debater, began by preparing arguments for and against the statement: “We should subsidize space exploration”. IBM later held a second debate between the system and another Israeli expert debater, Dan Zafrir, that featured opposing arguments on the statement: “We should increase the use of telemedicine.”

In development since 2012, Project Debater was touted as IBM’s next big milestone for AI. Aimed at helping “people make evidence-based decisions when the answers aren’t black-and-white,” it doesn’t just learn a single topic but can debate unfamiliar topics too, as long as these are covered in the massive corpus that the system mines, which includes hundreds of millions of articles from numerous well-known newspapers and magazines. The system uses the Watson Speech to Text API (application programming interface). Project Debater’s underlying technologies are also being used in IBM Cloud and IBM Watson.

Interestingly, a year later at Think 2019 in San Francisco, IBM's Project Debater lost a live, public debate to a human champion, Harish Natarajan. They argued for and against the resolution, “We should subsidize preschool”. Both sides had only 15 minutes to prepare, following which each delivered a four-minute opening statement, a four-minute rebuttal, and a two-minute summary. The winner was determined by each side's ability to convince the audience of the persuasiveness of its arguments. But even though Natarajan was declared the winner, 58% of the audience said Project Debater "better enriched their knowledge about the topic at hand, compared to Harish’s 20%".

Raising the bar

Meta (formerly Facebook) appears to have gone a step further. On Tuesday, it announced that CICERO is the first AI "to achieve human-level performance in the popular strategy game Diplomacy". CICERO demonstrated this by playing on webDiplomacy.net, an online version of the game, where it achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game. The system shares its name with Marcus Tullius Cicero, the Roman writer, orator, lawyer and politician, all bundled in one.

Meta explains that unlike games like Chess and Go, Diplomacy requires an agent to recognize that someone is likely bluffing or that another player would see a certain move as aggressive, failing which it will lose. Likewise, it has to talk like a real person, displaying empathy, building relationships, and speaking knowledgeably about the game, failing which it won't find other players willing to work with it. To achieve these goals, Meta combined strategic reasoning, as used in agents such as AlphaGo and Pluribus, with natural language processing (NLP), as used in models like GPT-3, BlenderBot 3, LaMDA, and OPT-175B.

Meta has open-sourced the code and published a paper to help the wider AI community use CICERO to "spur further progress in human-AI cooperation".

How CICERO works

CICERO continuously looks at the game board to understand and model how the other players are likely to act, and then uses this model to control a language model that "can generate free-form dialogue, informing other players of its plans and proposing reasonable actions for the other players that coordinate well with them". Meta started with a 2.7 billion parameter BART-like language model that is pre-trained on text from the internet and fine-tuned on over 40,000 human games on webDiplomacy.net. It also developed techniques to automatically annotate messages in the training data with corresponding planned moves in the game. The idea is to control dialogue generation while persuading other players more effectively. In short, CICERO first predicts what everyone will do; second, it refines that prediction using planning; third, it generates several candidate messages based on the board state, dialogue, and its intents; and fourth, it filters messages to reduce gibberish and unrelated comments.
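
The four steps can be made concrete with a toy sketch. This is not Meta's code: every function name, the two-player board, and the value function below are hypothetical stand-ins, and the real system uses learned models where this sketch uses one-line rules. For brevity, the message step here conditions only on the intended move, not on the full board state and dialogue.

```python
def predict_moves(options):
    """Step 1 (toy): assume every power plays its first listed option."""
    return {power: moves[0] for power, moves in options.items()}


def refine_plan(options, predicted, me, value):
    """Step 2 (toy planning): pick our best reply to the others' predicted moves."""
    others = {power: move for power, move in predicted.items() if power != me}
    return max(options[me], key=lambda move: value(move, others))


def generate_candidates(intent, ally):
    """Step 3 (toy): draft messages; a real system would sample a language model."""
    return [
        f"{ally}, I am moving {intent}. Will you support me?",
        "Lovely weather in Vienna today.",
        "asdf qwer zxcv",
    ]


def filter_messages(candidates, intent):
    """Step 4 (toy): keep only messages that mention the planned move."""
    return [text for text in candidates if intent in text]


def avoid_collision(move, others):
    """Hypothetical value function: 1 if nobody else targets our destination."""
    destination = move.split("-")[-1]
    taken = {m.split("-")[-1] for m in others.values()}
    return 0 if destination in taken else 1


def run_turn(options, me, ally, value):
    """Chain the four steps for one power and return the surviving messages."""
    predicted = predict_moves(options)
    intent = refine_plan(options, predicted, me, value)
    return filter_messages(generate_candidates(intent, ally), intent)


# Hypothetical board: each power's options, most likely first.
OPTIONS = {
    "FRANCE": ["PAR-BUR", "PAR-PIC"],
    "GERMANY": ["MUN-BUR", "MUN-RUH"],
}
messages = run_turn(OPTIONS, me="FRANCE", ally="ENGLAND", value=avoid_collision)
print(messages)
```

In this toy run, France expects Germany to move on Burgundy, so planning switches France to Picardy, and only the message that actually talks about that move survives the filter.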

AI-powered machines have been pitted against humans for more than two decades. IBM’s Deep Blue supercomputing system, for instance, beat chess grandmaster Garry Kasparov in 1997, and its Watson supercomputing system beat champion Jeopardy! players in 2011.

In March 2016, Alphabet-owned AI firm DeepMind’s computer programme, AlphaGo, beat Go champion Lee Sedol. On 7 December 2017, AlphaZero, modelled on AlphaGo, took just four hours of self-play, given only the rules of chess, to master the game well enough to defeat the world’s strongest open-source chess engine, Stockfish. The AlphaZero algorithm is a more generic version of the AlphaGo Zero algorithm. It uses reinforcement learning, a training method in which a system learns by trial and error from rewards and penalties rather than from labelled examples. AlphaGo Zero does not need to train on human amateur and professional games to learn how to play the ancient Chinese game of Go. Further, this version, which learnt entirely through self-play, defeated the earlier AlphaGo, until then the world’s strongest Go player, as DeepMind reported in October 2017.

In July 2018, AI bots beat humans at the video game Dota 2. Published by Valve Corp., Dota 2 is a free-to-play multiplayer online battle arena video game and one of the most popular and complex e-sports titles. Professionals train throughout the year to earn part of Dota’s annual $40 million prize pool, the largest of any e-sports game. Hence, a machine beating skilled human players underscores the power of AI. The bots, though, still lost to professional players at Dota 2. The game has been actively developed for over a decade, with its logic implemented in hundreds of thousands of lines of code. This logic takes milliseconds per tick to execute, versus nanoseconds for chess or Go engines, and the game is updated about once every two weeks.

What it means for humans

What sets IBM's Project Debater and Meta's CICERO apart, though, is that they involve predicting and modeling what humans would actually do in real life. This implies that they cannot rely on supervised learning alone, where the agent is trained with labeled data such as a database of human players’ actions in past games. Meta explains that CICERO runs an iterative planning algorithm called piKL, which "balances dialogue consistency with rationality".
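
Meta's description quoted above does not spell out the mathematics. One standard way to trade off staying close to a human-like "anchor" policy against maximising expected value is KL-regularisation, and the sketch below assumes that flavour of trade-off. It is an illustration, not Meta's implementation. The function name, the two actions, and all the numbers are hypothetical.

```python
import math


def kl_regularised_policy(anchor, values, lam):
    """Maximise E[value] - lam * KL(policy || anchor) over action distributions.

    The maximiser has a closed form: policy(a) is proportional to
    anchor(a) * exp(values[a] / lam).
    """
    top = max(values.values())  # shift for numerical stability; cancels on normalising
    weights = {a: anchor[a] * math.exp((values[a] - top) / lam) for a in anchor}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical human-like policy (e.g. learned from past games) and payoffs.
ANCHOR = {"support ally": 0.7, "betray ally": 0.3}
VALUES = {"support ally": 1.0, "betray ally": 2.0}

# Large lam: stay close to what a human would plausibly do.
human_like = kl_regularised_policy(ANCHOR, VALUES, lam=100.0)
# Small lam: chase expected value almost regardless of the anchor.
greedy = kl_regularised_policy(ANCHOR, VALUES, lam=0.1)

print(human_like)
print(greedy)
```

With a large `lam`, the policy barely moves from the anchor and still mostly supports the ally. With a small `lam`, it puts almost all its weight on the higher-value betrayal. A single knob of this kind is one way to express a balance between behaving consistently with human play and behaving "rationally".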

CICERO, as Meta acknowledges, is a work in progress. As of now, it is only capable of playing Diplomacy. However, the underlying technology is relevant to many real-world applications, Meta suggests. "Controlling natural language generation via planning and RL (reinforcement learning) could, for example, ease communication barriers between humans and AI-powered agents. For instance, today's AI assistants excel at simple question-answering tasks, like telling you the weather, but what if they could maintain a long-term conversation with the goal of teaching you a new skill? Alternatively, imagine a video game in which the non-player characters (NPCs) could plan and converse like people do, understanding your motivations and adapting the conversation accordingly, to help you on your quest of storming the castle."

It's clear from these developments that this is not the last we're hearing from AI-powered machines. The game will continue, and so will mutual learning.

Leslie D'Monte
Leslie D'Monte has been a journalist for almost three decades. He specialises in technology and science writing, having worked with leading media groups--both as a reporter and an editor. He is passionate about digital transformation and deep-tech topics including artificial intelligence (AI), big data analytics, the Internet of Things (IoT), blockchain, crypto, metaverses, quantum computing, genetics, fintech, electric vehicles, solar power and autonomous vehicles. Leslie is a Massachusetts Institute of Technology (MIT) Knight Science Journalism Fellow (2010-11). In his other avatar, he curates tech events and moderates panels.
Updated: 23 Nov 2022, 03:21 PM IST