The OpenAI vs. Google AI race has opened up Pandora’s bots

For a bit, Microsoft looked like it would eat Google’s lunch. Its languishing search engine Bing was being revolutionized with a hot new OpenAI chatbot. Those hopes have diminished as nobody, not even AI scientists, truly understands the breadth of capabilities of artificial intelligence once it’s unleashed. Early users of Bing reported unhinged, emotional, even threatening responses to some queries from the AI system, which called one user a “bad researcher” and told a writer that he was “not happily married.” Bing, whose bot goes by the name Sydney, has put Google’s Bard error in the shade. However, these flaws are just the tip of an iceberg.

The tech behind chatbots like Bard and OpenAI’s ChatGPT comes from large language models (LLMs), computer programs trained on billions of words on the public internet that can generate humanlike text. If ChatGPT is a car, this model is its engine, and OpenAI has been selling access to it since 2020. But amid today’s arms race for search bots, those engines are also being shared freely and passing on their flaws.

OpenAI doesn’t disclose how many developers have accessed its LLM, GPT-3, but it’s likely in the hundreds of thousands. While there are dozens of free, open-source LLMs, OpenAI’s is seen as the gold standard. Given Google’s resources, its model LaMDA could soon prove just as popular. Google has kept its model under wraps for years, explaining that its reputation could suffer if launched prematurely. Yet, earlier this month, as Microsoft announced it would soon power Bing with GPT, Google seemed to reverse that position. Not only did it launch Bard the next day, it also said it would open access to LaMDA. This strategy could come to haunt Google, Microsoft and OpenAI, just as it did Facebook in 2018, when it was forced to shut access to user data after the Cambridge Analytica scandal. All it took was a rogue user.

One of the big risks is bias. Twitch shut down an animated spoof of Seinfeld, whose animation and dialogue were generated by AI, after the show’s characters made transphobic and homophobic remarks. That dialogue was created by a “less-sophisticated version” of GPT-3.

GPT-3 was trained on billions of words from an array of sources including 7,000 unpublished books, Wikipedia entries and news articles, which left it vulnerable to picking up on biased or hateful material as well. OpenAI has used human moderators to strip much of that out of its model, but that work isn’t foolproof. Bias is also almost impossible to detect when it’s deeply buried in an LLM, a complex layered network of billions of parameters that acts like a black box even to its own creators.

Misinformation also afflicts these models. Tech news site CNET generated 77 articles on financial advice last November using an LLM. It had to issue corrections on 41 of them. OpenAI doesn’t disclose what it calls the “hallucination rate” of its language models, but a January 2022 report on tech news site Protocol cited researchers as saying it was between 21% and 41%. My own experience of using ChatGPT puts misinformation at between 5% and 10%. Even if the rate is that low, companies using LLMs need to take everything the programs say with a huge grain of salt.

Misuse is perhaps the biggest unknown. OpenAI bans GPT-3 customers from using it to promote violence or spam. Perpetrators get a “content policy violation” email, but bad actors could ignore all that. Stephane Baele, an associate professor in security and political violence at the University of Exeter, used GPT-3 to generate fake ISIS propaganda as part of a study last year. He recalls getting a request for an explanation from OpenAI and replying to it. “We said, ‘This is academic research,’” he recalls. “We didn’t hear back.”

OpenAI says it has stopped “hundreds” of actors attempting to misuse GPT-3 for a wide range of purposes, like disinformation, and is constantly tweaking its models to filter out harmful content. But there are other LLMs for bad actors to use.

In early 2019, OpenAI released a 70-page report on the social impact of language models, and said it would not release its latest LLM because it could be misused. That view has changed drastically since then. Sure, its language models have become more accurate and less biased, and its safety filters more effective. But commercial pressures and the growing sway of Microsoft, which invested $1 billion in 2019 and another $10 billion this year into OpenAI, seem to have steered it toward making a riskier bet on commercializing its technology. Google, with its plans to sell access to LaMDA, is now doing the same.

With Google’s stumble and Microsoft Bing’s bizarre comments, both companies need to slow down their AI arms race. Their revolutionary chatbots aren’t ready to go wide, and neither are the language engines powering them.

Parmy Olson is a Bloomberg Opinion columnist covering technology.


Updated: 21 Feb 2023, 12:21 AM IST