HeadlinesBriefing.com

Gemini Robotics-ER 1.6 Boosts Robot Spatial Reasoning

Hacker News

Google has unveiled Gemini Robotics-ER 1.6, an upgrade to its embodied reasoning model that improves robots' ability to understand and interact with the physical world. The new version strengthens spatial reasoning and multi-view understanding, and it adds instrument reading, which lets robots interpret gauges, thermometers, and sight glasses.

Compared with its predecessor, Gemini Robotics-ER 1.6 improves pointing accuracy, success detection, and visual reasoning. The model can also carry out complex tasks by calling external tools such as Google Search and vision-language-action models. The instrument reading capability grew out of Google's collaboration with Boston Dynamics, which showed that facility inspections depend on reading instruments; with it, robots can monitor industrial equipment autonomously.

Google describes Gemini Robotics-ER 1.6 as its safest robotics model yet, citing superior compliance with safety policies on adversarial spatial reasoning tasks. The model is available today via the Gemini API and Google AI Studio, and developers can consult a Colab notebook for implementation examples. The release is a step toward autonomous robots that can reason about their environments and perform complex real-world tasks without human intervention.