ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.18719
  4. Cited By
VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning

VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning

24 May 2025
Guanxing Lu
Wenkai Guo
Chubin Zhang
Yuheng Zhou
Haonan Jiang
Zifeng Gao
Yansong Tang
Ziwei Wang
    OffRL
ArXiv (abs)PDFHTML

Papers citing "VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning"

50 / 72 papers shown
Title
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
Sule Bai
Mingxing Li
Yang Liu
Jing Tang
Haoji Zhang
Lei Sun
Xiaowen Chu
Yansong Tang
OffRLLRM
47
2
0
20 May 2025
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
Zihan Wang
Kaidi Wang
Q. Wang
Pingyue Zhang
Linjie Li
...
Jiajun Wu
L. Fei-Fei
Lijuan Wang
Yejin Choi
Manling Li
276
30
0
24 Apr 2025
Understanding R1-Zero-Like Training: A Critical Perspective
Understanding R1-Zero-Like Training: A Critical Perspective
Zichen Liu
Changyu Chen
Wenjun Li
Penghui Qi
Tianyu Pang
Chao Du
Wee Sun Lee
Min Lin
OffRLLRM
234
172
0
26 Mar 2025
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu
Zheng Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
...
Ya Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
OffRLLRM
251
217
0
18 Mar 2025
Improving Vision-Language-Action Model with Online Reinforcement Learning
Improving Vision-Language-Action Model with Online Reinforcement Learning
Yanjiang Guo
Jianke Zhang
Xiaoyu Chen
Xiang Ji
Yen-Jen Wang
Yucheng Hu
Jianyu Chen
VLM
117
13
0
28 Jan 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLMVLMOffRLAI4TSLRM
395
2,033
0
22 Jan 2025
FAST: Efficient Action Tokenization for Vision-Language-Action Models
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Karl Pertsch
Kyle Stachowicz
Brian Ichter
Danny Driess
Suraj Nair
Q. Vuong
Oier Mees
Chelsea Finn
Sergey Levine
158
70
0
17 Jan 2025
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning
Charles Xu
Qiyang Li
Jianlan Luo
Sergey Levine
OffRL
147
7
0
13 Dec 2024
ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models
ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models
Xubing Ye
Yukang Gan
Yixiao Ge
Xiao Zhang
Yansong Tang
172
11
0
30 Nov 2024
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale
  Reinforcement Learning Fine-Tuning
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
Jiaheng Hu
Rose Hendrix
Ali Farhadi
Aniruddha Kembhavi
Roberto Martín-Martín
Peter Stone
Kuo-Hao Zeng
Kiana Ehsani
126
15
0
25 Sep 2024
Scaling LLM Test-Time Compute Optimally can be More Effective than
  Scaling Model Parameters
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Charlie Snell
Jaehoon Lee
Kelvin Xu
Aviral Kumar
LRM
254
702
0
06 Aug 2024
Hierarchical Memory for Long Video QA
Hierarchical Memory for Long Video QA
Yiqin Wang
Haoji Zhang
Yansong Tang
Yong-Jin Liu
Jiashi Feng
Jifeng Dai
Xiaojie Jin
129
4
0
30 Jun 2024
PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful
  Navigators
PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators
Kuo-Hao Zeng
Zichen Zhang
Kiana Ehsani
Rose Hendrix
Jordi Salvador
Alvaro Herrasti
Ross Girshick
Aniruddha Kembhavi
Luca Weihs
LM&RoOffRL
78
25
0
28 Jun 2024
VoCo-LLaMA: Towards Vision Compression with Large Language Models
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye
Yukang Gan
Xiaoke Huang
Yixiao Ge
Yansong Tang
MLLMVLM
133
28
0
18 Jun 2024
OpenVLA: An Open-Source Vision-Language-Action Model
OpenVLA: An Open-Source Vision-Language-Action Model
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
...
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&RoVLM
296
535
0
13 Jun 2024
Flash-VStream: Memory-Based Real-Time Understanding for Long Video
  Streams
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams
Haoji Zhang
Yiqin Wang
Yansong Tang
Yong-Jin Liu
Jiashi Feng
Jifeng Dai
Xiaojie Jin
112
45
0
12 Jun 2024
Octo: An Open-Source Generalist Robot Policy
Octo: An Open-Source Generalist Robot Policy
Octo Model Team
Dibya Ghosh
Homer Walke
Karl Pertsch
Kevin Black
...
Quan Vuong
Ted Xiao
Dorsa Sadigh
Chelsea Finn
Sergey Levine
216
452
0
20 May 2024
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Jian Hu
Xibin Wu
Wei Shen
OpenLLMAI Team
Dehao Zhang
...
Weikai Fang
Xianyu
Yu Cao
Haotian Xu
Yiming Liu
VLMAI4CE
136
130
0
20 May 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
...
Thomas Kollar
Sergey Levine
Chelsea Finn
Sergey Levine
Chelsea Finn
259
226
0
19 Mar 2024
Bootstrapping Reinforcement Learning with Imitation for Vision-Based
  Agile Flight
Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight
Jiaxu Xing
Angel Romero
L. Bauersfeld
Davide Scaramuzza
98
16
0
18 Mar 2024
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting
  Mitigation Problem
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wolczyk
Bartłomiej Cupiał
M. Ostaszewski
Michal Bortkiewicz
Michal Zajkac
Razvan Pascanu
Lukasz Kuciñski
Piotr Milo's
CLL
153
18
0
05 Feb 2024
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning
Jianlan Luo
Zheyuan Hu
Charles Xu
You Liang Tan
Jacob Berg
Archit Sharma
S. Schaal
Chelsea Finn
Abhishek Gupta
Sergey Levine
OffRLOnRL
175
49
0
29 Jan 2024
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
Guanxing Lu
Ziwei Wang
Changliu Liu
Jiwen Lu
Yansong Tang
LRM
78
12
0
12 Dec 2023
Imitating Shortest Paths in Simulation Enables Effective Navigation and
  Manipulation in the Real World
Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
Kiana Ehsani
Tanmay Gupta
Rose Hendrix
Jordi Salvador
Luca Weihs
...
Alvaro Herrasti
Ranjay Krishna
Dustin Schwenk
Eli VanderBilt
Aniruddha Kembhavi
86
23
0
05 Dec 2023
Qwen-Audio: Advancing Universal Audio Understanding via Unified
  Large-Scale Audio-Language Models
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Yunfei Chu
Jin Xu
Xiaohuan Zhou
Qian Yang
Shiliang Zhang
Zhijie Yan
Chang Zhou
Jingren Zhou
AuLLM
147
351
0
14 Nov 2023
Imitation Bootstrapped Reinforcement Learning
Imitation Bootstrapped Reinforcement Learning
Hengyuan Hu
Suvir Mirchandani
Dorsa Sadigh
125
27
0
03 Nov 2023
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment Collaboration
Abby OÑeill
Abdul Rehman
Abhinav Gupta
Abhiram Maddukuri
...
Zhuo Xu
Zichen Jeff Cui
Zichen Zhang
Zipeng Fu
Zipeng Lin
LM&Ro
190
531
0
13 Oct 2023
Efficient Memory Management for Large Language Model Serving with
  PagedAttention
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
209
2,340
0
12 Sep 2023
RoboAgent: Generalization and Efficiency in Robot Manipulation via
  Semantic Augmentations and Action Chunking
RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking
Homanga Bharadhwaj
Jay Vakil
Mohit Sharma
Abhi Gupta
Shubham Tulsiani
Vikash Kumar
LM&Ro
118
132
0
05 Sep 2023
Reinforced Self-Training (ReST) for Language Modeling
Reinforced Self-Training (ReST) for Language Modeling
Çağlar Gülçehre
T. Paine
S. Srinivasan
Ksenia Konyushkova
L. Weerts
...
Chenjie Gu
Wolfgang Macherey
Arnaud Doucet
Orhan Firat
Nando de Freitas
OffRL
136
309
0
17 Aug 2023
Scaling Relationship on Learning Mathematical Reasoning with Large
  Language Models
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
Zheng Yuan
Hongyi Yuan
Cheng Li
Guanting Dong
Keming Lu
Chuanqi Tan
Chang Zhou
Jingren Zhou
LRMALM
133
205
0
03 Aug 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic
  Control
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&RoLRM
252
1,297
0
28 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MHALM
550
12,138
0
18 Jul 2023
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in
  One-Shot
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot
Haoshu Fang
Hongjie Fang
Zhenyu Tang
Jirong Liu
Chenxi Wang
Junbo Wang
Haoyi Zhu
Cewu Lu
128
78
0
02 Jul 2023
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
Bo Liu
Yifeng Zhu
Chongkai Gao
Yihao Feng
Qian Liu
Yuke Zhu
Peter Stone
CLL
188
157
0
05 Jun 2023
Let's Verify Step by Step
Let's Verify Step by Step
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALMOffRLLRM
251
1,241
0
31 May 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward
  Model
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
405
4,190
0
29 May 2023
Causal Policy Gradient for Whole-Body Mobile Manipulation
Causal Policy Gradient for Whole-Body Mobile Manipulation
Jiaheng Hu
Peter Stone
Roberto Martín-Martín
120
24
0
04 May 2023
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
Yanli Zhao
Andrew Gu
R. Varma
Liangchen Luo
Chien-chin Huang
...
Bernard Nguyen
Geeta Chauhan
Y. Hao
Ajit Mathews
Shen Li
FedMLMoE
126
352
0
21 Apr 2023
DINOv2: Learning Robust Visual Features without Supervision
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLMCLIPSSL
546
3,536
0
14 Apr 2023
Sigmoid Loss for Language Image Pre-Training
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIPVLM
320
1,208
0
27 Mar 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi
Zhenjia Xu
S. Feng
Eric A. Cousineau
Yilun Du
Benjamin Burchfiel
Russ Tedrake
Shuran Song
352
1,248
0
07 Mar 2023
Efficient Online Reinforcement Learning with Offline Data
Efficient Online Reinforcement Learning with Offline Data
Philip J. Ball
Laura M. Smith
Ilya Kostrikov
Sergey Levine
OffRLOnRL
150
184
0
06 Feb 2023
RT-1: Robotics Transformer for Real-World Control at Scale
RT-1: Robotics Transformer for Real-World Control at Scale
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Joseph Dabis
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
183
1,162
0
13 Dec 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online
  Videos
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
171
304
0
23 Jun 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to
  Accelerate Progress
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Rameswar Panda
Marc G. Bellemare
OffRLOnRL
126
66
0
03 Jun 2022
Jump-Start Reinforcement Learning
Jump-Start Reinforcement Learning
Ikechukwu Uchendu
Ted Xiao
Yao Lu
Banghua Zhu
Mengyuan Yan
...
Chuyuan Fu
Cong Ma
Jiantao Jiao
Sergey Levine
Karol Hausman
OffRLOnRL
110
116
0
05 Apr 2022
STaR: Bootstrapping Reasoning With Reasoning
STaR: Bootstrapping Reasoning With Reasoning
E. Zelikman
Yuhuai Wu
Jesse Mu
Noah D. Goodman
ReLMLRM
160
512
0
28 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
1.1K
13,289
0
04 Mar 2022
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning
Eric Jang
A. Irpan
Mohi Khansari
Daniel Kappler
F. Ebert
Corey Lynch
Sergey Levine
Chelsea Finn
LM&Ro
272
551
0
04 Feb 2022
12
Next