Boston Dynamics’ four-legged robot Spot has been given a new skill: accurately reading analog thermometers and pressure gauges. This upgrade comes courtesy of Google DeepMind’s Gemini Robotics-ER 1.6 model, designed to enhance robots' 'embodied reasoning' in physical environments.
The new tech also lets Spot visually inspect sight glasses, which provide a transparent window into tanks and pipes. The improvement is part of an ongoing collaboration between Boston Dynamics and Google DeepMind, aimed at making industrial facilities safer and more efficient through robotic inspection.
Such tasks require complex visual reasoning: the robot has to interpret multiple needles, liquid levels, container boundaries, and tick marks, as well as text. To handle these demands, the Gemini Robotics-ER 1.6 model provides robots with 'agentic vision', a combination of visual reasoning and execution capabilities that creates a 'visual scratchpad' for inspections and manipulations.
Agentic vision raises accuracy on instrument-reading tasks from 23% with the older Gemini Robotics-ER 1.5 model to 98%. Even without this capability, the baseline model can still achieve 86% accuracy by pointing at different elements in an image and by using multi-view reasoning, which combines multiple camera streams for a better understanding of the environment.
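To give a sense of the arithmetic behind reading a dial, here is a minimal Python sketch. It is not DeepMind's or Boston Dynamics' method: it assumes something upstream has already located the needle and two labeled tick marks, and it assumes the scale is linear. The function names `reading_from_angle` and `fuse_views` are hypothetical, and taking the median across camera views is only one simple way to combine several readings.

```python
from statistics import median


def reading_from_angle(needle_deg, low_tick, high_tick):
    """Linearly interpolate a gauge value from the needle angle.

    low_tick and high_tick are (angle_in_degrees, labeled_value) pairs
    for two known tick marks on the dial. Assumes a linear scale.
    """
    (low_deg, low_value), (high_deg, high_value) = low_tick, high_tick
    if low_deg == high_deg:
        raise ValueError("tick marks must sit at different angles")
    # Fraction of the way the needle has travelled between the two ticks.
    fraction = (needle_deg - low_deg) / (high_deg - low_deg)
    return low_value + fraction * (high_value - low_value)


def fuse_views(readings):
    """Combine per-camera readings; the median damps a single bad view."""
    return median(readings)


if __name__ == "__main__":
    # Hypothetical dial: 0 at 0 degrees, 100 at 270 degrees.
    single = reading_from_angle(135.0, (0.0, 0.0), (270.0, 100.0))
    print(single)
    # Hypothetical readings of the same gauge from three cameras.
    print(fuse_views([49.0, single, 58.0]))
```

The hard part in practice is the perception step this sketch skips, namely finding the needle, ticks, and labels in a cluttered image, which is the job the article attributes to the model.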