ResearchTrend.AI
arXiv: 2410.20312
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
27 October 2024
Jing Zhang
Linjiajie Fang
Kexin Shi
Wenjia Wang
Bing-Yi Jing
    OffRL

Papers citing "Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model"

50 / 57 papers shown
Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching
H.J. Terry Suh
Glen Chou
Hongkai Dai
Lujie Yang
Abhishek Gupta
Russ Tedrake
DiffM, OffRL
136
10
0
24 Jun 2023
Efficient Diffusion Policies for Offline Reinforcement Learning
Bingyi Kang
Xiao Ma
Chao Du
Tianyu Pang
Shuicheng Yan
OffRL
208
91
0
31 May 2023
IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies
Philippe Hansen-Estruch
Ilya Kostrikov
Michael Janner
J. Kuba
Sergey Levine
OffRL
202
176
0
20 Apr 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
130
89
0
28 Mar 2023
Uncertainty-Aware Instance Reweighting for Off-Policy Learning
Xiaoying Zhang
Junpu Chen
Hongning Wang
Hong Xie
Yang Liu
John C. S. Lui
Hang Li
OffRL
176
4
0
11 Mar 2023
Consistency Models
Yang Song
Prafulla Dhariwal
Mark Chen
Ilya Sutskever
VLM, DiffM
213
1,130
0
02 Mar 2023
The In-Sample Softmax for Offline Reinforcement Learning
Chenjun Xiao
Han Wang
Yangchen Pan
Adam White
Martha White
OffRL
93
27
0
28 Feb 2023
Conservative State Value Estimation for Offline Reinforcement Learning
Liting Chen
Jie Yan
Zhengdao Shao
Lu Wang
Qingwei Lin
Saravan Rajmohan
Thomas Moscibroda
Dongmei Zhang
OffRL
94
7
0
14 Feb 2023
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
Jing Zhang
Chi Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
126
10
0
28 Jan 2023
Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg
Joey Hejna
Matthieu Geist
Stefano Ermon
OffRL
141
85
0
05 Jan 2023
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
Paria Rashidinejad
Hanlin Zhu
Kunhe Yang
Stuart J. Russell
Jiantao Jiao
OffRL
204
31
0
01 Nov 2022
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling
Huayu Chen
Cheng Lu
Chengyang Ying
Hang Su
Jun Zhu
DiffM, OffRL
260
134
0
29 Sep 2022
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning
Zhendong Wang
Jonathan J. Hunt
Mingyuan Zhou
OffRL
224
422
0
12 Aug 2022
Mildly Conservative Q-Learning for Offline Reinforcement Learning
Jiafei Lyu
Xiaoteng Ma
Xiu Li
Zongqing Lu
OffRL
175
118
0
09 Jun 2022
Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL
Wonjoon Goo
S. Niekum
OffRL
114
21
0
01 Jun 2022
Elucidating the Design Space of Diffusion-Based Generative Models
Tero Karras
M. Aittala
Timo Aila
S. Laine
DiffM
603
2,274
0
01 Jun 2022
Planning with Diffusion for Flexible Behavior Synthesis
Michael Janner
Yilun Du
J. Tenenbaum
Sergey Levine
DiffM
475
783
0
20 May 2022
Latent-Variable Advantage-Weighted Policy Optimization for Offline RL
Xi Chen
Ali Ghadirzadeh
Tianhe Yu
Yuan Gao
Jianhao Wang
Wenzhe Li
Bin Liang
Chelsea Finn
Chongjie Zhang
OffRL
111
14
0
16 Mar 2022
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
Chenjia Bai
Lingxiao Wang
Zhuoran Yang
Zhihong Deng
Animesh Garg
Peng Liu
Zhaoran Wang
OffRL
126
142
0
23 Feb 2022
Supported Policy Optimization for Offline Reinforcement Learning
Jialong Wu
Haixu Wu
Zihan Qiu
Jianmin Wang
Mingsheng Long
OffRL
123
75
0
13 Feb 2022
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
372
1,014
0
12 Oct 2021
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Gaon An
Seungyong Moon
Jang-Hyun Kim
Hyun Oh Song
OffRL
246
299
0
04 Oct 2021
Offline RL Without Off-Policy Evaluation
David Brandfonbrener
William F. Whitney
Rajesh Ranganath
Joan Bruna
OffRL
168
174
0
16 Jun 2021
A Minimalist Approach to Offline Reinforcement Learning
Scott Fujimoto
S. Gu
OffRL
213
878
0
12 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
337
1,781
0
02 Jun 2021
Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
Yue Wu
Shuangfei Zhai
Nitish Srivastava
J. Susskind
Jian Zhang
Ruslan Salakhutdinov
Hanlin Goh
EDL, OffRL, OnRL
137
196
0
17 May 2021
Risk-Averse Offline Reinforcement Learning
Núria Armengol Urpí
Sebastian Curi
Andreas Krause
OffRL
95
73
0
10 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM, SyDa
660
7,381
0
26 Nov 2020
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
Avi Singh
Huihan Liu
G. Zhou
Albert Yu
Nicholas Rhinehart
Sergey Levine
OffRL, OnRL
145
150
0
19 Nov 2020
PLAS: Latent Action Space for Offline Reinforcement Learning
Wenxuan Zhou
Sujay Bajracharya
David Held
OffRL
128
166
0
14 Nov 2020
Planning with Learned Dynamics: Probabilistic Guarantees on Safety and Reachability via Lipschitz Constants
Craig Knuth
Glen Chou
N. Ozay
Dmitry Berenson
125
36
0
18 Oct 2020
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour
Dale Schuurmans
S. Gu
OffRL
338
126
0
21 Jul 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
2.0K
21,059
0
19 Jun 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL, OnRL
325
644
0
16 Jun 2020
Improved Techniques for Training Score-Based Generative Models
Yang Song
Stefano Ermon
DiffM
386
1,233
0
16 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL, OnRL
204
1,958
0
08 Jun 2020
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Arsenii Kuznetsov
Pavel Shvechikov
Alexander Grishin
Dmitry Vetrov
285
220
0
08 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL, GP
665
2,138
0
04 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP, OffRL
386
1,449
0
15 Apr 2020
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
289
703
0
26 Nov 2019
Quantile QT-Opt for Risk-Aware Vision-Based Robotic Grasping
Cristian Bodnar
A. Li
Karol Hausman
P. Pastor
Mrinal Kalakrishnan
OffRL
111
54
0
01 Oct 2019
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
OffRL
278
604
0
01 Oct 2019
Generative Modeling by Estimating Gradients of the Data Distribution
Yang Song
Stefano Ermon
SyDa, DiffM
487
4,266
0
12 Jul 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
223
349
0
30 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL, OnRL
223
1,095
0
03 Jun 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL, BDL
500
1,693
0
07 Dec 2018
Distributed Distributional Deterministic Policy Gradients
Gabriel Barth-Maron
Matthew W. Hoffman
David Budden
Will Dabney
Dan Horgan
TB Dhruva
Alistair Muldal
N. Heess
Timothy Lillicrap
OffRL
191
487
0
23 Apr 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
546
5,520
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
660
8,974
0
04 Jan 2018
The Uncertainty Bellman Equation and Exploration
Brendan O'Donoghue
Ian Osband
Rémi Munos
Volodymyr Mnih
176
199
0
15 Sep 2017