Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning

19 October 2023

Juan Rocamonde

Victoriano Montesinos

Papers citing "Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning"

50 / 86 papers shown

Goal-Driven Reward by Video Diffusion Models for Reinforcement Learning

190

30 Nov 2025

Leveraging LLMs for reward function design in reinforcement learning control tasks

Franklin Cardenoso

Wouter Caarls

124

24 Nov 2025

AutoFocus-IL: VLM-based Saliency Maps for Data-Efficient Visual Imitation Learning without Extra Human Annotations

185

23 Nov 2025

Automated Reward Design for Gran Turismo

231

03 Nov 2025

World-in-World: World Models in a Closed-Loop World

...

283

20 Oct 2025

ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning

Roger Creus Castanyer

266

16 Oct 2025

CDE: Concept-Driven Exploration for Reinforcement Learning

131

09 Oct 2025

Zero-shot reasoning for simulating scholarly peer-review

Khalid M. Saqr

152

02 Oct 2025

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

160

29 Sep 2025

LAGEA: Language Guided Embodied Agents for Robotic Manipulation

Abdul Monaf Chowdhury

Akm Moshiur Rahman Mazumder

Rabeya Akter

S. Arib

LM&Ro

172

27 Sep 2025

OpenGVL -- Benchmarking Visual Temporal Progress for Data Curation

227

22 Sep 2025

CRAFT: Coaching Reinforcement Learning Autonomously using Foundation Models for Multi-Robot Coordination Tasks

216

17 Sep 2025

Human-Aligned Procedural Level Generation Reinforcement Learning via Text-Level-Sketch Shared Representation

141

13 Aug 2025

Policy Learning from Large Vision-Language Model Feedback without Reward Modeling

250

31 Jul 2025

GoalLadder: Incremental Goal Discovery with Vision-Language Models

Alexey Zakharov

Shimon Whiteson

331

19 Jun 2025

Reward Models in Deep Reinforcement Learning: A Survey

International Joint Conference on Artificial Intelligence (IJCAI), 2024

206

18 Jun 2025

RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills

242

17 Jun 2025

Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models

230

15 Jun 2025

VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models

Christos Ziakas

Alessandra Russo

TTA

352

11 Jun 2025

Truly Self-Improving Agents Require Intrinsic Metacognitive Learning

Tennison Liu

M. Schaar

AIFin LRM

438

05 Jun 2025

DriveMind: A Dual Visual Language Model-based Reinforcement Learning Framework for Autonomous Driving

Seunghyun Yoon

Hyuk Lim

Dan Dongseong Kim

Jin-Hee Cho

VLM LRM

228

01 Jun 2025

TeViR: Text-to-Video Reward with Diffusion Models for Efficient Reinforcement Learning

333

26 May 2025

Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

353

16 May 2025

ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations

Jiahui Zhang

Yusen Luo

Abrar Anwar

Sumedh Anand Sontakke

472

16 May 2025

MA-ROESL: Motion-aware Rapid Reward Optimization for Efficient Robot Skill Learning from Single Videos

316

13 May 2025

TREND: Tri-teaching for Robust Preference-based Reinforcement Learning with Demonstrations

IEEE International Conference on Robotics and Automation (ICRA), 2025

304

09 May 2025

VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making

401

06 May 2025

Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning

...

699

30 Apr 2025

PRISM: Projection-based Reward Integration for Scene-Aware Real-to-Sim-to-Real Transfer with Few Demonstrations

304

29 Apr 2025

Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision

1.2K

21 Apr 2025

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future

428

12 Apr 2025

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

Computer Vision and Pattern Recognition (CVPR), 2025

468

05 Apr 2025

Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

388

03 Apr 2025

Option Discovery Using LLM-guided Semantic Hierarchical Reinforcement Learning

Chak Lam Shek

Erfaun Noorani

262

24 Mar 2025

LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning

1.2K

21 Mar 2025

Towards Automated Semantic Interpretability in Reinforcement Learning via Vision-Language Models

427

20 Mar 2025

PANDORA: Diffusion Policy Learning for Dexterous Robotic Piano Playing

388

17 Mar 2025

LuciBot: Automated Robot Policy Learning from Generated Videos

417

12 Mar 2025

Provably Correct Automata Embeddings for Optimal Automata-Conditioned Reinforcement Learning

Beyazit Yalcinkaya

Niklas Lauffer

Marcell Vazquez-Chanlatte

Sanjit A. Seshia

OffRL

428

06 Mar 2025

Enhancing Collective Intelligence in Large Language Models Through Emotional Integration

927

05 Mar 2025

Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning

378

03 Mar 2025

SENSEI: Semantic Exploration Guided by Foundation Models to Learn Versatile World Models

900

03 Mar 2025

Offline RLAIF: Piloting VLM Feedback for RL via SFO

Jacob Beck

OffRL

594

02 Mar 2025

Subtask-Aware Visual Reward Learning from Segmented Demonstrations

International Conference on Learning Representations (ICLR), 2025

265

28 Feb 2025

The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning

International Joint Conference on Artificial Intelligence (IJCAI), 2024

504

24 Feb 2025

Imitation Learning from a Single Temporally Misaligned Video

503

08 Feb 2025

Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning

Udita Ghosh

Dripta S. Raychaudhuri

Jiachen Li

Konstantinos Karydis

Amit K. Roy-Chowdhury

VLM

294

03 Feb 2025

INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation

578

01 Feb 2025

LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

332

26 Nov 2024

Vision Language Models are In-Context Value Learners

International Conference on Learning Representations (ICLR), 2024

...

287

07 Nov 2024