Google's state-of-the-art robotics model for visual and spatial reasoning!

Gemini Robotics-ER 1.6 is a vision-language model for robot reasoning. It handles spatial pointing, multi-view success detection, and instrument reading. It is intended for robotics engineers and developers building physical agents via the Gemini API.
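
As a rough illustration of how spatial pointing output might be consumed downstream, here is a minimal Python sketch. It makes no API call. It assumes, and this is not confirmed by the description above, that the model replies with a JSON list of objects shaped like `{"point": [y, x], "label": "..."}` with coordinates normalized to a 0-1000 range. The function name `parse_points` and the sample reply are hypothetical; check the actual response format before relying on this.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_points(response_text, image_width, image_height):
    """Convert ASSUMED normalized [y, x] points (0-1000) to pixel x/y values."""
    text = response_text.strip()
    # Strip a Markdown code fence if the reply happens to be wrapped in one.
    if text.startswith(FENCE):
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    results = []
    for item in json.loads(text):
        y_norm, x_norm = item["point"]  # assumed order: y first, then x
        results.append({
            "label": item.get("label", ""),
            "x": round(x_norm / 1000 * image_width),
            "y": round(y_norm / 1000 * image_height),
        })
    return results


# Hypothetical reply text, made up for illustration only.
sample_reply = '[{"point": [500, 250], "label": "mug handle"}]'
pixels = parse_points(sample_reply, image_width=640, image_height=480)
print(pixels)
```

Keeping the conversion in one small function means that if the real output uses a different coordinate order or scale, only the two marked lines need to change.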