The virtual worlds where AI is making its next big leap

There’s an almost unanimous belief among AI pioneers that world models are crucial to creating next-generation AI. (Pexel)
Summary

To develop knowledge beyond text and videos, AIs must have realistic virtual playgrounds where they can make mistakes and learn.

Today’s AIs are book smart. Everything they know they learned from available language, images and videos. To evolve further, they have to get street smart. That requires “world models.”

The key is enabling AIs to learn from their environments and faithfully represent an abstract version of them in their “heads,” the way humans and animals do. To do that, developers need to train AIs using simulations of the world. Think of it like learning to drive by playing “Gran Turismo” or learning to fly from “Microsoft Flight Simulator.” These world models include all the things required to plan, take actions and make predictions about the future, including physics and time.

The world-model approach—which somewhat confusingly refers to both the simulated training environment and the abstract representation—is already having potentially huge effects on the real world. Drone warfare, new kinds of robots and safer-than-human self-driving vehicles all benefit from it, says Moritz Baier-Lentz, a partner and investor at Lightspeed, a venture-capital firm.

There’s an almost unanimous belief among AI pioneers that world models are crucial to creating next-generation AI. And many say they will be critical to someday creating better-than-human “artificial general intelligence,” or AGI. Stanford University professor and AI “godmother” Fei-Fei Li has raised $230 million to launch world-model startup World Labs.

And Nvidia CEO Jensen Huang has said that world models can help unlock “physical AI” to autonomously direct robots, self-driving cars and the like.

While the type of AI that makes large language models and ChatGPT possible gets all of the attention right now, it’s world-model-based AI that is gaining momentum in frontier research and could allow technology to take on new roles in our lives.

It isn’t clear whether all these bets will lead to the superintelligence that corporate leaders predict. But in the short term, world models could make AIs better at tasks at which they currently falter, especially in spatial reasoning.

No matter how much data today’s generative AIs are trained on, they can only learn a probabilistic model of how the world works, says Gary Marcus, former head of Uber’s AI efforts and a frequent critic of current approaches to AI. At base, today’s AIs learn the correlations among all the data fed into them—whether it consists of words and images, or molecules and their functions. This fuzzy approximation appears to be encoded in their “brains” as a mix of data and a giant list of rules about how to manipulate it, rules that are frequently incomplete or self-contradictory.

A good example of this: An Atari 2600 running a chess program from 1979 can beat cutting-edge chatbots at chess. The bots tend to attempt illegal moves and quickly lose track of the positions of their pieces. In essence, today’s transformer-based AIs are making predictions rather than reasoning logically, despite having been exposed during training to countless games and rulebooks. The Atari wins because it keeps the locations of the pieces straight using an ancient and humble version of an internal world model: a database.
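To make the chess point concrete, here is a toy sketch in Python (not the Atari program’s actual code, and with chess rules reduced to basic consistency checks) of what a database-style world model does: it records where every piece sits and consults that record before accepting a move. A next-token predictor has no such record to consult.

```python
# Toy illustration of a database-style world model for chess: an explicit
# record of where every piece is, checked before each move is accepted.
# (A hypothetical simplification, not the 1979 Atari program.)

board = {"e2": "white pawn", "e7": "black pawn", "e1": "white king", "e8": "black king"}

def try_move(src: str, dst: str, piece: str) -> bool:
    """Accept a move only if it is consistent with the recorded state."""
    if board.get(src) != piece:
        return False              # that piece isn't actually on that square
    if board.get(dst, "").split()[:1] == piece.split()[:1]:
        return False              # destination already holds a friendly piece
    board[dst] = board.pop(src)   # update the internal model of the world
    return True

print(try_move("e2", "e4", "white pawn"))  # True: consistent with the recorded state
print(try_move("e2", "e4", "white pawn"))  # False: the model knows that pawn already moved
```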

There are AIs that can beat an Atari—and any human alive—at chess. Google’s MuZero, released in 2019, was built in a substantially different way from the generative-AI bots that followed. It succeeded by learning how to create an accurate representation of the game it was playing.
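MuZero’s central idea, as DeepMind described it, is to learn three functions (a representation of the current position, a dynamics model that imagines what a move leads to, and a prediction of how promising a position is) and then plan entirely inside that learned model. The Python sketch below shows only that structure; the placeholder functions stand in for the deep neural networks and Monte Carlo tree search the real system uses.

```python
# Structural sketch of MuZero-style planning inside a learned model.
# The three functions are trivial placeholders for learned neural networks.

def representation(observation):   # h: raw observation -> latent state
    return tuple(observation)

def dynamics(state, action):       # g: (latent state, action) -> (next latent state, reward)
    return state + (action,), 0.0

def prediction(state):             # f: latent state -> (policy over actions, value estimate)
    return {0: 0.5, 1: 0.5}, 0.0

def plan(observation, actions=(0, 1), depth=3):
    """Compare short imagined rollouts, unrolled entirely inside the learned model."""
    root = representation(observation)
    best_action, best_score = None, float("-inf")
    for action in actions:
        state, total_reward = root, 0.0
        for _ in range(depth):                # imagine a few steps ahead
            state, reward = dynamics(state, action)
            total_reward += reward
        _, value = prediction(state)          # how good does the imagined future look?
        if total_reward + value > best_score:
            best_action, best_score = action, total_reward + value
    return best_action

print(plan(observation=(0, 1, 0)))  # chooses a move without ever touching the real game
```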

But what about tasks that happen in the real world, which is far more complicated than the constrained world of a game? To tackle these challenges, Google DeepMind researchers set out to create a system that could generate real-world simulations with an unprecedented level of fidelity.

The result, Genie 3—which is still in research preview and not publicly available—can generate photo-realistic, open-world virtual landscapes from nothing more than a text prompt. You can think of Genie 3 as a way to quickly generate what’s essentially an open-world videogame that can be as faithful to the real world as you like. It’s a virtual space in which a baby AI can endlessly play, make mistakes and learn what it needs to do to achieve its goals, just as a baby animal or human does in the real world. That experimentation process is called reinforcement learning.
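Genie 3 itself isn’t publicly available, but the training pattern it is meant to support, reinforcement learning by trial and error inside a simulated world, looks roughly like the loop below. Here Gymnasium’s off-the-shelf CartPole environment stands in for a generated world, and a random policy stands in for the learning agent.

```python
# Minimal trial-and-error loop in a simulated environment. Gymnasium's
# CartPole stands in for a generated world like Genie 3's, and the "agent"
# simply acts at random; a real setup would update its policy from the
# reward signal over millions of simulated episodes.
import gymnasium as gym

env = gym.make("CartPole-v1")
episode_returns = []

for episode in range(5):
    observation, info = env.reset()
    done, total_reward = False, 0.0
    while not done:
        action = env.action_space.sample()    # try something, possibly a mistake
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward                # feedback from the simulated world
        done = terminated or truncated
    episode_returns.append(total_reward)

print(episode_returns)  # the learning signal a real agent would improve against
```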

Genie 3 is part of a system that could help train the AI that someday pilots robots, self-driving cars and other “embodied” AIs, says project co-lead Jack Parker-Holder. And the environments could be filled with people and obstacles: An AI could learn how to interact with humans by observing them moving around in that virtual space, he adds.

There are already potentially gigantic sources of data on how people operate in virtual environments, such as Medal.tv, a service that captures both gameplay and users’ actions inside videogames. These sources could prove especially useful—and valuable—to the frontier AI labs trying to reach AGI and to build AIs that can pilot robots. Eventually, all this learning in virtual environments could lead not only to smarter chatbots but also to systems that can safely operate in the real world.

Toronto-based Waabi constructed an entire world, called Waabi World, just to train AIs to drive trucks. It’s a lot safer (and cheaper) to let them crash over and over in a simulation than to try that even once in the real world. Raquel Urtasun, the company’s chief executive, says it allows AIs to log millions of virtual driving miles. Waabi’s software is expected to be able to autonomously pilot a real truck on a real road by the year’s end, she adds.

LLMs already appear to be taking over some functions in white-collar jobs, and AIs with world-model smarts could allow them to take over yet more. Blue-collar work has been relatively safe so far. But as AI developers go ever deeper on world models, robots could start applying for jobs as truck drivers, plumbers or caregivers.

Write to Christopher Mims at christopher.mims@wsj.com
