ResearchTrend.AI
Conservative Q-Learning for Offline Reinforcement Learning
arXiv: 2006.04779

8 June 2020
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
    OffRL
    OnRL

Papers citing "Conservative Q-Learning for Offline Reinforcement Learning"

50 / 388 papers shown
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
33
1
0
16 May 2025
Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer
Minh Hoang Nguyen
Linh Le Pham Van
Thommen George Karimpanal
Sunil Gupta
Hung Le
OffRL
LRM
37
0
0
14 May 2025
What Matters for Batch Online Reinforcement Learning in Robotics?
Perry Dong
Suvir Mirchandani
Dorsa Sadigh
Chelsea Finn
OffRL
31
0
0
12 May 2025
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making
Jake Grigsby
Yuke Zhu
Michael S Ryoo
Juan Carlos Niebles
OffRL
VLM
41
0
0
06 May 2025
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu
Sili Huang
Z. Yang
Shengchao Hu
Li Shen
H. Chen
Lichao Sun
Yi-Ju Chang
Dacheng Tao
OffRL
149
0
0
03 May 2025
Fine-Tuning without Performance Degradation
Han Wang
Adam White
Martha White
OnRL
166
0
0
01 May 2025
Uncertainty-aware Latent Safety Filters for Avoiding Out-of-Distribution Failures
Junwon Seo
Kensuke Nakamura
Andrea V. Bajcsy
56
0
0
01 May 2025
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
141
0
0
01 May 2025
Learning Neural Control Barrier Functions from Offline Data with Conservatism
Ihab Tabbara
Hussein Sibai
OffRL
65
0
0
01 May 2025
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
52
0
0
26 Apr 2025
Offline Learning of Controllable Diverse Behaviors
Mathieu Petitbois
Rémy Portelas
Sylvain Lamprier
Ludovic Denoyer
OffRL
36
0
0
25 Apr 2025
Playing Non-Embedded Card-Based Games with Reinforcement Learning
Tianyang Wu
Lipeng Wan
Yuhang Wang
Qiang Wan
Xuguang Lan
OffRL
27
0
0
07 Apr 2025
Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
Younghwan Lee
Tung M. Luu
Donghoon Lee
Chang D. Yoo
3DV
VLM
OffRL
41
0
0
03 Apr 2025
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Chan Kim
Seung-Woo Seo
Seong-Woo Kim
OODD
163
0
0
21 Mar 2025
SE(3)-Equivariant Robot Learning and Control: A Tutorial Survey
Joohwan Seo
Soochul Yoo
Junwoo Chang
Hyunseok An
Hyunwoo Ryu
Soomi Lee
Arvind Kruthiventy
Jongeun Choi
R. Horowitz
71
2
0
12 Mar 2025
Mitigating Preference Hacking in Policy Optimization with Pessimism
Dhawal Gupta
Adam Fisch
Christoph Dann
Alekh Agarwal
76
0
0
10 Mar 2025
DPR: Diffusion Preference-based Reward for Offline Reinforcement Learning
Teng Pang
Bingzheng Wang
Guoqiang Wu
Yilong Yin
OffRL
70
0
0
03 Mar 2025
Uncertainty Comes for Free: Human-in-the-Loop Policies with Diffusion Models
Zhanpeng He
Yifeng Cao
M. Ciocarlie
61
0
0
26 Feb 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Lixing Lyu
Jiashuo Jiang
Wang Chi Cheung
42
1
0
24 Feb 2025
Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov
Alexander Nikulin
Ilya Zisman
Albina Klepach
Andrei Polubarov
Nikita Lyubaykin
Alexander Derevyagin
Igor Kiselev
Vladislav Kurenkov
OffRL
OnRL
175
0
0
24 Feb 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee
Youngdo Lee
Takuma Seno
Donghu Kim
Peter Stone
Jaegul Choo
63
1
0
24 Feb 2025
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen
Jincheng Mei
Katayoon Goshvadi
Hanjun Dai
Tong Yang
Sherry Yang
Dale Schuurmans
Yuejie Chi
Bo Dai
OffRL
65
23
0
20 Feb 2025
Data Center Cooling System Optimization Using Offline Reinforcement Learning
Xianyuan Zhan
Xiangyu Zhu
Peng Cheng
Xiao Hu
Ziteng He
...
Chenhui Liu
Tianshun Hong
Yan Liang
Yunxin Liu
Feng Zhao
AI4CE
62
0
0
17 Feb 2025
Learning Strategy Representation for Imitation Learning in Multi-Agent Games
Shiqi Lei
Kanghon Lee
Linjing Li
Jinkyoo Park
OffRL
42
0
0
17 Feb 2025
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu
Lingfeng Zhao
Shivangi Agarwal
Jinghan Liu
Audrey Huang
P. Amortila
Nan Jiang
OODD
OffRL
101
0
0
11 Feb 2025
Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling
Shenghong He
OffRL
168
0
0
10 Feb 2025
Skill Expansion and Composition in Parameter Space
Tenglong Liu
J. Li
Yinan Zheng
Haoyi Niu
Yixing Lan
Xin Xu
Xianyuan Zhan
58
4
0
09 Feb 2025
Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning
Wesley A. Suttle
A. Suresh
Carlos Nieto-Granda
OffRL
95
0
0
06 Feb 2025
Learning from Active Human Involvement through Proxy Value Propagation
Zhenghao Peng
Wenjie Mo
Chenda Duan
Quanyi Li
Bolei Zhou
107
14
0
05 Feb 2025
Dual Alignment Maximin Optimization for Offline Model-based RL
Chi Zhou
Wang Luo
Haoran Li
Congying Han
Tiande Guo
Zicheng Zhang
OffRL
71
0
0
02 Feb 2025
B3C: A Minimalist Approach to Offline Multi-Agent Reinforcement Learning
Woojun Kim
Katia P. Sycara
OffRL
94
0
0
30 Jan 2025
Reinforcement Teaching
Alex Lewandowski
Calarina Muslimani
Dale Schuurmans
Matthew E. Taylor
Jun Luo
81
1
0
28 Jan 2025
Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning
Hao Sun
M. Schaar
94
14
0
28 Jan 2025
Coordinating Ride-Pooling with Public Transit using Reward-Guided Conservative Q-Learning: An Offline Training and Online Fine-Tuning Reinforcement Learning Framework
Yulong Hu
Tingting Dong
Sen Li
OffRL
OnRL
59
0
0
24 Jan 2025
State Combinatorial Generalization In Decision Making With Conditional Diffusion Models
Xintong Duan
Yutong He
Fahim Tajwar
Wen-Tse Chen
Ruslan Salakhutdinov
Jeff Schneider
OffRL
AI4CE
99
0
0
22 Jan 2025
An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management
Eslam Eldeeb
Hirley Alves
OffRL
80
0
0
22 Jan 2025
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
Abdullah Akgul
Manuel Haußmann
M. Kandemir
OffRL
71
1
0
17 Jan 2025
Methodology for Interpretable Reinforcement Learning for Optimizing Mechanical Ventilation
Joo Seung Lee
Malini Mahendra
Anil Aswani
OffRL
61
1
0
10 Jan 2025
Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
Wall Kim
Mamba
57
0
0
10 Jan 2025
Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning
Tao Liu
Qi Xu
Wei Shi
Zhigang Hua
Shuang Yang
OffRL
43
0
0
09 Jan 2025
SR-Reward: Taking The Path More Traveled
Seyed Mahdi Basiri Azad
Zahra Padar
Gabriel Kalweit
Joschka Boedecker
OffRL
67
0
0
04 Jan 2025
OMG-RL: Offline Model-based Guided Reward Learning for Heparin Treatment
Yooseok Lim
Sujee Lee
OffRL
150
0
0
03 Jan 2025
MADiff: Offline Multi-agent Learning with Diffusion Models
Zhengbang Zhu
Minghuan Liu
Liyuan Mao
Bingyi Kang
Minkai Xu
Yong Yu
Stefano Ermon
Weinan Zhang
DiffM
OffRL
88
34
0
03 Jan 2025
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI
Fangwei Zhong
Kui Wu
Churan Wang
Hao Chen
Hai Ci
Zhoujun Li
Yizhou Wang
VGen
40
0
0
31 Dec 2024
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
Kun Wu
Yinuo Zhao
Zhihao Xu
Zhengping Che
Chengxiang Yin
C. Liu
Qinru Qiu
Feifei Feng
OffRL
100
1
0
22 Dec 2024
Dense Dynamics-Aware Reward Synthesis: Integrating Prior Experience with Demonstrations
Cevahir Köprülü
Po-han Li
Tianyu Qiu
Ruihan Zhao
T. Westenbroek
David Fridovich-Keil
Sandeep P. Chinchali
Ufuk Topcu
OffRL
92
0
0
02 Dec 2024
OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control
Rohit Bokade
Xiaoning Jin
OffRL
39
0
0
10 Nov 2024
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
Marvin Alles
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
36
0
0
07 Nov 2024
Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy for Visuomotor Imitation Learning
George Jiayuan Gao
Tianyu Li
Nadia Figueroa
41
0
0
05 Nov 2024
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
Jing Zhang
Linjiajie Fang
Kexin Shi
Wenjia Wang
Bing-Yi Jing
OffRL
36
0
0
27 Oct 2024