Google’s Gemini looks impressive but still trails

In the time it has taken Google to catch up with OpenAI, the nimbler player has had almost a year to work on its next AI model, GPT-5. (HT_PRINT)
In the time it has taken Google to catch up with OpenAI, the nimbler player has had almost a year to work on its next AI model, GPT-5. (HT_PRINT)


  • While Google's search business faces a threat from AI-powered aides, overtaking OpenAI's ChatGPT upgradations won't be easy.

The days between Thanksgiving and Christmas are normally a dead zone for launching new technology, but these are desperate times for Alphabet Inc’s Google.

The lumbering search giant was caught on the back-foot by ChatGPT one year ago, and it’s been eager to paint a picture of itself speeding ahead. Last Wednesday, it suddenly announced Gemini, a new AI model that could spot sleight-of-hand magic tricks and ace an accountancy exam. A demonstration video released by Google has wowed social media—but it’s a feat in spin. From a technical standpoint, Google is still chasing OpenAI.

Google released a table showing how Gemini ranks against OpenAI’s top model, GPT-4. It shows Gemini Ultra beating GPT-4 on most standard benchmarks. These test AI models on things like high school physics, professional law and moral scenarios and the current AI race is defined almost entirely by such capabilities.

But on most of the benchmarks, Gemini Ultra beat OpenAI’s GPT-4 model by only a few percentage points. In other words, Google’s top AI model has only made narrow improvements on something that OpenAI completed work on at least a year ago. And Ultra is still under wraps.

If it’s released in early January, as Google has suggested, Gemini Ultra might not stay the top model for very long. In the time it has taken Google to catch up with OpenAI, the nimbler player has had almost a year to work on its next AI model, GPT-5.

Then there’s the video demo that technologists described as “jaw-dropping" on X (formerly Twitter): On first viewing, this is impressive stuff. The model’s ability to track a ball of paper from under a plastic cup, or to infer that a dot-to-dot picture was a crab before it is even drawn, show glimmers of the reasoning abilities that Google’s DeepMind AI lab have cultivated over the years. That’s missing from other AI models. But many of the other capabilities on display are not unique and can be replicated by ChatGPT Plus, as Wharton professor Ethan Mollick has demonstrated.

Google also admits that the video is edited. “For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity," reads its YouTube description. This means the time it took for each response was actually longer than in the video.

In reality, the demo also wasn’t carried out in real time or voice. When asked about the video, a Google spokesperson said it was made by “using still image frames from the footage, and prompting via text," and pointed to a site showing how others could interact with Gemini with photos of their hands, or of drawings or other objects. In other words, the voice in the demo was reading out human-made prompts that were made to Gemini, and showing them still images. That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as the bot watched and responded in real time to the world around it.

The video also doesn’t specify that this demo is (probably) with Gemini Ultra, the model that’s not here yet. Fudging such details points to the broader marketing effort here: Google wants us remember that it’s got one of the largest teams of AI researchers in the world and access to more data than anyone else. It wants to remind us how vast its deployment network is by bringing less-capable versions of Gemini to Chrome, Android and Pixel phones.

But being everywhere isn’t always the advantage it seems in tech. Early mobile kings Nokia and Blackberry learnt that the hard way in the 2000s when Apple jumped in with the iPhone. In software, market success comes from having the best-performing systems.

Google’s show-boating is almost certainly timed to capitalize on all the recent turmoil at OpenAI. When a board coup at the smaller AI startup temporarily ousted CEO Sam Altman and put the company’s future in doubt, Google swiftly launched a sales campaign to persuade OpenAI’s corporate customers to switch to Google, according to a report in The Wall Street Journal. Now it seems to be riding that wave of uncertainty with the launch of Gemini.

But impressive demos can only get you so far and Google has displayed uncanny new tech before that didn’t go anywhere. Google’s gargantuan bureaucracy and layers of product managers have kept it from shipping products as nimbly as OpenAI till now. As society grapples with AI’s transformative effects, that’s no bad thing. But take Google’s latest show of sprinting ahead with a pinch of salt. It’s still coming up from behind. ©bloomberg

Catch all the Business News, Market News, Breaking News Events and Latest News Updates on Live Mint. Download The Mint News App to get Daily Market Updates.


Switch to the Mint app for fast and personalized news - Get App