
Google DeepMind introduced the next generation of its gaming-focused artificial intelligence agent, the Scalable Instructable Multiworld Agent, or SIMA 2, on Thursday. The upgraded system builds on the first version, launched in March 2024, and brings notable gains in reasoning, adaptability and user interaction. The company says the agent learns continuously and becomes more capable through its own play.
In its announcement, DeepMind highlighted that SIMA 2 can now reflect on its actions and think through the steps required to complete a task. The agent is powered by Google’s Gemini models and is designed to follow human-issued instructions, understand what has been asked, and plan its next moves based on the virtual environment it sees on screen.
The system receives visual input from a three-dimensional game world along with a user-defined goal, such as “build a shelter” or “find the red house”. It then breaks that goal into a sequence of smaller actions and performs them using controls similar to a keyboard and mouse.
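The loop described above (observe the screen, split the goal into smaller steps, issue keyboard-and-mouse style inputs) can be illustrated with a minimal Python sketch. This is not DeepMind's implementation: every name here (`Action`, `plan_subgoals`, `subgoal_to_actions`, `run_episode`) is hypothetical, and the planner is a hard-coded lookup table standing in for a vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    """One keyboard-or-mouse style input (hypothetical format)."""
    device: str   # "keyboard" or "mouse"
    command: str  # e.g. "W" or "left_click"


def plan_subgoals(frame: bytes, goal: str) -> List[str]:
    """Stand-in planner: a real agent would query a model with the frame and goal."""
    canned = {
        "build a shelter": ["gather wood", "place walls", "place roof"],
        "find the red house": ["look around", "walk toward red building"],
    }
    # Unknown goals are treated as a single step.
    return canned.get(goal, [goal])


def subgoal_to_actions(subgoal: str) -> List[Action]:
    """Stand-in policy: move forward, then click to interact."""
    return [Action("keyboard", "W"), Action("mouse", "left_click")]


def run_episode(
    goal: str,
    get_frame: Callable[[], bytes],
    send_action: Callable[[Action], None],
) -> List[Tuple[str, Action]]:
    """Observe once, plan sub-goals, then execute each sub-goal's actions in order."""
    executed: List[Tuple[str, Action]] = []
    for subgoal in plan_subgoals(get_frame(), goal):
        for action in subgoal_to_actions(subgoal):
            send_action(action)
            executed.append((subgoal, action))
    return executed
```

Calling `run_episode("build a shelter", lambda: b"", print)` would print each simulated input in order; a real agent would presumably re-observe the screen between steps rather than planning once from a single frame.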
According to the company, one of the most significant advances is SIMA 2’s improved ability to operate in games it has not previously encountered. DeepMind tested the agent in new environments such as MineDojo, a research adaptation of Minecraft, and ASKA, a Viking-themed survival game. In both cases, SIMA 2 delivered higher success rates than the earlier version.
The system also handles multimodal prompts, including sketches, emojis and a range of languages. It can apply concepts learned in one game to another. For example, an understanding of mining in a sandbox world can help it grasp harvesting in a different survival setting.
Google states that the second-generation agent is trained using a mix of human demonstration data and automatically generated annotations from the Gemini models. When SIMA 2 learns a new movement or skill in a fresh environment, that experience is captured and fed back into the training pipeline. DeepMind says this reduces dependence on human-labelled examples and allows the agent to refine itself over time.
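The feedback idea in this paragraph can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a description of DeepMind's pipeline: `Example`, `TrainingPool` and the `annotate` callback are hypothetical names, and `annotate` stands in for a model that labels the agent's own trajectories.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Example:
    trajectory: str  # placeholder for recorded gameplay
    label: str       # description of what the trajectory achieves
    source: str      # "human" or "self"


@dataclass
class TrainingPool:
    """Toy training set mixing human demonstrations with self-generated data."""
    examples: List[Example] = field(default_factory=list)

    def add_human(self, trajectory: str, label: str) -> None:
        self.examples.append(Example(trajectory, label, "human"))

    def add_self_play(
        self, trajectories: Iterable[str], annotate: Callable[[str], str]
    ) -> None:
        # Each new trajectory is labelled automatically and fed back into the pool.
        for t in trajectories:
            self.examples.append(Example(t, annotate(t), "self"))

    def human_fraction(self) -> float:
        """Share of the pool that still comes from human-labelled examples."""
        if not self.examples:
            return 0.0
        human = sum(e.source == "human" for e in self.examples)
        return human / len(self.examples)
```

As self-generated examples accumulate, `human_fraction()` falls, which is one simple way to picture the reduced dependence on human-labelled data that the company describes.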
Despite the progress, the system still has notable limitations. Memory of past interactions is restricted, long-range reasoning that requires many steps is difficult, and precise low-level control similar to robotic joint movements is not addressed in the current framework.
DeepMind stresses that SIMA 2 is not intended as a gaming assistant. Instead, the company views three-dimensional game worlds as a useful testing ground for AI agents that could eventually control real-world robots. The broader objective, Google says, is to develop general-purpose machines that can follow natural language instructions and handle a variety of tasks in complex physical settings.