ResearchTrend.AI

Residual Q-Learning: Offline and Online Policy Customization without Value

15 June 2023
Chenran Li
Chen Tang
Haruki Nishimura
Jean-Pierre Mercat
M. Tomizuka
Wei Zhan
    OffRL

Papers citing "Residual Q-Learning: Offline and Online Policy Customization without Value"

10 / 10 papers shown
Residual Policy Gradient: A Reward View of KL-regularized Objective
Pengcheng Wang
Xinghao Zhu
Yuxin Chen
Chenfeng Xu
M. Tomizuka
Chenran Li
36
0
0
14 Mar 2025
Optimal Driver Warning Generation in Dynamic Driving Environment
Chenran Li
Aolin Xu
Enna Sachdeva
Teruhisa Misu
Behzad Dariush
46
0
0
09 Nov 2024
Residual-MPPI: Online Policy Customization for Continuous Control
Pengcheng Wang
Chenran Li
Catherine Weaver
Kenta Kawamoto
M. Tomizuka
Chen Tang
Wei Zhan
OffRL
26
3
0
01 Jul 2024
MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention
Yuxin Chen
Chen Tang
Chenran Li
Ran Tian
Peter Stone
M. Tomizuka
Wei Zhan
21
1
0
24 Jun 2024
Guarded Policy Optimization with Imperfect Online Demonstrations
Zhenghai Xue
Zhenghao Peng
Quanyi Li
Zhihan Liu
Bolei Zhou
OffRL
37
10
0
03 Mar 2023
Planning with Diffusion for Flexible Behavior Synthesis
Michael Janner
Yilun Du
J. Tenenbaum
Sergey Levine
DiffM
202
622
0
20 May 2022
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
212
832
0
12 Oct 2021
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation
Ajay Mandlekar
Danfei Xu
J. Wong
Soroush Nasiriany
Chen Wang
Rohun Kulkarni
Li Fei-Fei
Silvio Savarese
Yuke Zhu
Roberto Martín-Martín
OffRL
147
461
0
06 Aug 2021
Large Scale Interactive Motion Forecasting for Autonomous Driving : The Waymo Open Motion Dataset
Scott Ettinger
Shuyang Cheng
Benjamin Caine
Chenxi Liu
Hang Zhao
...
Jiquan Ngiam
Vijay Vasudevan
Alexander McCauley
Jonathon Shlens
Drago Anguelov
129
528
0
20 Apr 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,561
0
18 Sep 2019