Gemini Robotics 1.5: Pushing the Frontier of Generalist Robots with Advanced Embodied Reasoning, Thinking, and Motion Transfer

2 October 2025

Saminda Abeyruwan

Jean-Baptiste Alayrac

Montserrat Gonzalez Arenas

Ashwin Balakrishna

Michael Bloesch

Konstantinos Bousmalis

Philemon Brakel

Thomas Buschmann

Arunkumar Byravan

Federico Casarini

London Chappellet-Volpini

José Enrique Chen

Hao-Tien Lewis Chiang

Adrian Collister

David B. D'Ambrosio

Meet Kirankumar Dave

Adil Dostmohamed

Debidatta Dwibedi

Sathish Thoppay Egambaram

Claudio Fantacci

K. Gopalakrishnan

Leonard Hasenclever

Brandon Hernaez

Abhishek Jindal

Dmitry Kalashnikov

M. Emre Karagozler

Ksenia Konyushkova

Antoine Laurens

Sharath Maddineni

Anirudha Majumdar

Kevis-Kokitsi Maninis

Sergio Martinez

Niko Milonopoulos

Robert Moreno

Michael Neunert

Francesco Nori

Joy Ortiz

Kenneth Oslund

Carolina Parada

Emilio Parisotto

Amaris Paryag

Acorn Pooley

Thomas Power

Alessio Quaglino

Haroon Qureshi

Rajkumar Vasudeva Raju

Helen Ran

Dushyant Rao

Kanishka Rao

Isaac Reid

David Rendleman

Krista Reymann

Miguel Rivas

Francesco Romano

Yulia Rubanova

Peter Pastor Sampedro

Pannag R Sanketi

Dhruv Shah

Mohit Sharma

Kathryn Shea

Mohit Shridhar

Charles Shu

Vikas Sindhwani

Sumeet Singh

Radu Soricut

Rachel Sterneck

Ian Storz

Razvan Surdulescu

Jie Tan

Jonathan Tompson

Saran Tunyasuvunakool

Jake Varley

Grace Vesom

Giulia Vezzani

Maria Bauza Villalonga

Oriol Vinyals

René Wagner

Ayzaan Wahid

Stefan Welker

Paul Wohlhart

Chengda Wu

Markus Wulfmeier

Fei Xia

Ted Xiao

Annie Xie

Jinyu Xie

Peng Xu

Sichun Xu

Ying Xu

Zhuo Xu

Jimmy Yan

Sherry Yang

Skye Yang

Yuxiang Yang

Hiu Hong Yu

Wenhao Yu

Wentao Yuan

Yuan Yuan

Jingwei Zhang

Tingnan Zhang

Zhiyuan Zhang

Allan Zhou

Guangyao Zhou

Yuxiang Zhou

et al. (71 additional authors not shown)

Main: 22 Pages

44 Figures

Bibliography: 4 Pages

25 Tables

Appendix: 36 Pages

Abstract

General-purpose robots need a deep understanding of the physical world, advanced reasoning, and general and dexterous control. This report introduces the latest generation of the Gemini Robotics model family: Gemini Robotics 1.5, a multi-embodiment Vision-Language-Action (VLA) model, and Gemini Robotics-ER 1.5, a state-of-the-art Embodied Reasoning (ER) model. We are bringing together three major innovations. First, Gemini Robotics 1.5 features a novel architecture and a Motion Transfer (MT) mechanism, which enables it to learn from heterogeneous, multi-embodiment robot data and makes the VLA more general. Second, Gemini Robotics 1.5 interleaves actions with a multi-level internal reasoning process in natural language. This enables the robot to "think before acting", notably improves its ability to decompose and execute complex, multi-step tasks, and makes the robot's behavior more interpretable to the user. Third, Gemini Robotics-ER 1.5 establishes a new state of the art for embodied reasoning, i.e., for reasoning capabilities that are critical for robots, such as visual and spatial understanding, task planning, and progress estimation. Together, this family of models takes us a step towards an era of physical agents, enabling robots to perceive, think, and then act so they can solve complex multi-step tasks.
