ResearchTrend.AI
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning

2 June 2024
Yuwei Fu
Haichao Zhang
Di Wu
Wei-ping Xu
Benoit Boulet
    VLM

Papers citing "FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning"

16 / 16 papers shown
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Pengxiang Li
Zhi Gao
Bofei Zhang
Yapeng Mi
Xiaojian Ma
...
Tao Yuan
Yuwei Wu
Yunde Jia
Song-Chun Zhu
Qing Li
LLMAG
67
0
0
30 Apr 2025
Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision
Shilin Zhang
Zican Hu
Wenhao Wu
Xinyi Xie
Jianxiang Tang
Chunlin Chen
Daoyi Dong
Yu Cheng
Zhenhong Sun
Zhi Wang
OffRL
31
0
0
21 Apr 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei K. Zhang
Bo Yang
Hua Chen
55
1
0
05 Mar 2025
Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning
Udita Ghosh
Dripta S. Raychaudhuri
Jiachen Li
Konstantinos Karydis
A. Roy-Chowdhury
VLM
52
0
0
03 Feb 2025
Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model
Songjun Tu
Jingbo Sun
Qichao Zhang
Xiangyuan Lan
Dongbin Zhao
67
1
0
22 Dec 2024
Robot Policy Learning with Temporal Optimal Transport Reward
Yuwei Fu
Haichao Zhang
Di Wu
Wei-ping Xu
Benoit Boulet
OffRL
28
1
0
29 Oct 2024
Fostering Intrinsic Motivation in Reinforcement Learning with Pretrained Foundation Models
Alain Andres
Javier Del Ser
OffRL
11
0
0
09 Oct 2024
SEAL: SEmantic-Augmented Imitation Learning via Language Model
Chengyang Gu
Yuxin Pan
Haotian Bai
Hui Xiong
Yize Chen
19
0
0
03 Oct 2024
Vision-Language Models Provide Promptable Representations for Reinforcement Learning
William Chen
Oier Mees
Aviral Kumar
Sergey Levine
VLM
LM&Ro
36
21
0
05 Feb 2024
LiFT: Unsupervised Reinforcement Learning with Foundation Models as Teachers
Taewook Nam
Juyong Lee
Jesse Zhang
Sung Ju Hwang
Joseph J. Lim
Karl Pertsch
OffRL
LRM
40
2
0
14 Dec 2023
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Juan Rocamonde
Victoriano Montesinos
Elvis Nava
Ethan Perez
David Lindner
VLM
31
73
0
19 Oct 2023
Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
Jesse Zhang
Jiahui Zhang
Karl Pertsch
Ziyi Liu
Xiang Ren
Minsuk Chang
Shao-Hua Sun
Joseph J. Lim
LLMAG
LM&Ro
79
57
0
16 Oct 2023
Vision-Language Models as Success Detectors
Yuqing Du
Ksenia Konyushkova
Misha Denil
A. Raju
Jessica Landon
Felix Hill
Nando de Freitas
Serkan Cabi
MLLM
LRM
82
76
0
13 Mar 2023
Defining and Characterizing Reward Hacking
Joar Skalse
Nikolaus H. R. Howe
Dmitrii Krasheninnikov
David M. Krueger
57
53
0
27 Sep 2022
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
203
627
0
12 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021