Google DeepMind’s SIMA AI agent can play video games it’s never seen before

Google LLC’s artificial intelligence research unit DeepMind today revealed it has been experimenting with a new kind of AI agent that can carry out multiple kinds of tasks in 3D video games that it has never played before.

The research organization has long been known for its achievements in games, building intelligent AI systems that can take on world champions in Go, Chess and Stratego. It has also built models that can learn how to play games without being taught the rules.

More recently, DeepMind has turned its attention specifically to video games, and its most powerful agent yet reportedly feels right at home in a variety of gaming worlds, where it can carry out numerous tasks based on instructions from humans.

DeepMind’s research team collaborated with a number of gaming studios on the research and trained the Scalable Instructable Multiworld Agent, known as “SIMA,” on nine different games. In addition, they used four research environments, including one built with the 3D gaming engine Unity, to enhance SIMA. In that environment, SIMA was tasked with forming sculptures out of building blocks, a preliminary step that helped it learn to adapt to different video game settings, graphic styles and perspectives, such as first-person and third-person.

“Each game in SIMA’s portfolio opens up a new interactive world, including a range of skills to learn, from simple navigation and menu use, to mining resources, flying a spaceship or crafting a helmet,” the researchers wrote in a blog post.

The researchers said SIMA’s ability to follow directions and complete tasks in video game worlds could pave the way for more useful AI agents that can operate in real-world environments.

To teach SIMA, they first set about recording humans playing the video games, taking note of the keyboard and mouse inputs that were used. This information was fed into SIMA, which is based on a precise image-language mapping model and a video model that’s able to view a computer game being played on screen and predict what might happen next.

SIMA can comprehend a range of gaming environments, the researchers said, and then perform almost any task it’s asked to complete. What’s impressive is that SIMA doesn’t require access to the game’s source code. It simply plays commercial versions of the games, and requires just two inputs – the on-screen action and directions from a human user. It then plays the games using the same inputs as humans, namely a keyboard and a mouse.
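As a rough illustration of the interface described above, the sketch below models the two inputs (the on-screen image and a text instruction) and the keyboard-and-mouse output. Every name in it (`Observation`, `Action`, `toy_policy`) is hypothetical and not taken from DeepMind’s work, and the hand-written rules merely stand in for a learned model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """The agent's two inputs (hypothetical structure)."""
    frame: List[List[int]]  # stand-in for the pixels shown on screen
    instruction: str        # natural-language direction from a human


@dataclass
class Action:
    """Keyboard and mouse output, the same controls a human player uses."""
    keys: Tuple[str, ...] = ()
    mouse_dx: int = 0
    mouse_dy: int = 0
    click: bool = False


def toy_policy(obs: Observation) -> Action:
    """Hand-written placeholder for a learned policy (an assumption, not SIMA)."""
    text = obs.instruction.lower()
    if "forward" in text:
        return Action(keys=("w",))    # hold the forward key
    if "turn left" in text:
        return Action(mouse_dx=-10)   # move the mouse to the left
    return Action()                   # no input for unrecognized instructions


# The policy never touches game internals; it only sees pixels and text.
blank_frame = [[0, 0], [0, 0]]
action = toy_policy(Observation(frame=blank_frame, instruction="Go forward"))
```

The point of the sketch is the shape of the interface: nothing in it reads game state or source code, which is what lets the same agent be dropped into different commercial titles.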

DeepMind’s team evaluated SIMA’s performance on hundreds of basic gaming skills across categories such as navigation, menu-based tasks and object interaction. To establish a baseline, they trained an agent on a single game and then tested it on that same title.

According to the researchers, a SIMA agent that was first trained on all nine games performed much better in a specific game than an agent that was only trained on that same game. That suggests it can leverage its experience acquired from playing other games to step up its performance.

They also trained a SIMA agent on eight games, and then tested it on the ninth game, which it had never come across before, and it did almost as well as an agent that had only been trained on that one game. “This ability to function in brand new environments highlights SIMA’s ability to generalize beyond its training,” the researchers pointed out. “This is a promising initial result, however more research is required for SIMA to perform at human levels in both seen and unseen games.”
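The held-out test described above amounts to a leave-one-game-out split. The sketch below shows how such splits could be enumerated for nine games. The function name and the placeholder game names are assumptions for illustration; the article does not describe DeepMind’s actual evaluation code.

```python
from typing import Dict, List


def leave_one_out_splits(games: List[str]) -> Dict[str, List[str]]:
    """Map each held-out game to the list of games used for training."""
    return {
        held_out: [g for g in games if g != held_out]
        for held_out in games
    }


# Placeholder names only; they do not correspond to the real titles.
games = [f"game_{i}" for i in range(1, 10)]
splits = leave_one_out_splits(games)

for held_out, training_games in splits.items():
    # Train on eight games, then evaluate on the one never seen in training.
    print(f"test on {held_out}, train on {len(training_games)} games")
```

Comparing each held-out score against an agent trained only on that game gives the kind of comparison the researchers report.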

The researchers found that SIMA does need a little guidance from humans to perform properly. A SIMA agent that wasn’t provided with any language training or instructions did not walk where it was told to, but instead just carried out common actions such as gathering resources, the researchers said. “[It] behaves in an appropriate but aimless manner,” they said of such untrained and unguided agents.

DeepMind said the research shows there is potential in the idea of developing a “new wave of generalist, language-driven AI agents.” As AI models are exposed to more training environments, they can become more versatile and generalizable, the research suggests.

Holger Mueller of Constellation Research Inc. said video games have emerged as a key playground for AI as they’re ideal for simulating real-world environments and can be used to test orientation, command and listening capabilities. “This is what DeepMind is doing with its multimodal SIMA agents, which can pick up and play new games instructed by voice commands alone,” Mueller said. “It’s especially interesting to discover that more experienced agents beat those with less experience, similar to humans in the real world. For DeepMind this is not about winning video games, but applying the lessons learned in gaming to the real world.”

Ultimately, DeepMind intends to create agents that can perform more sophisticated, multistage tasks based on natural language prompts. So eventually, a human might be able to tell an agent that’s playing a game such as Command & Conquer to gather some resources and build a base and a military force and go and destroy the opponent. At present, such a task is far too complex for SIMA agents.

“Ultimately, our research is building towards more general AI systems and agents that can understand and safely carry out a wide range of tasks in a way that is helpful to people online and in the real world,” DeepMind said.

Copyright © 2022 Inventrium Magazine
