ResearchTrend.AI
Towards Characterizing Divergence in Deep Q-Learning

21 March 2019
Joshua Achiam
Ethan Knight
Pieter Abbeel
arXiv: 1903.08894

Papers citing "Towards Characterizing Divergence in Deep Q-Learning"

24 / 24 papers shown
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
Kun Wu
Yinuo Zhao
Zhihao Xu
Zhengping Che
Chengxiang Yin
C. Liu
Qinru Qiu
Feifei Feng
OffRL
100
1
0
22 Dec 2024
Frequency and Generalisation of Periodic Activation Functions in Reinforcement Learning
Augustine N. Mavor-Parker
Matthew J. Sargent
Caswell Barry
Lewis D. Griffin
Clare Lyle
47
2
0
09 Jul 2024
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Xiaoyu Wen
Chenjia Bai
Kang Xu
Xudong Yu
Yang Zhang
Xuelong Li
Zhen Wang
41
2
0
10 May 2024
The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting
Hongyao Tang
M. Zhang
Jianye Hao
23
1
0
02 Mar 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
21
0
0
25 Feb 2023
Deep Reinforcement Learning for IRS Phase Shift Design in Spatiotemporally Correlated Environments
Spilios Evmorfos
Athina P. Petropulu
H. Vincent Poor
OOD
18
3
0
02 Nov 2022
Bridging the Gap Between Target Networks and Functional Regularization
Alexandre Piché
Valentin Thomas
Joseph Marino
Rafael Pardiñas
Gian Maria Marconi
C. Pal
Mohammad Emtiyaz Khan
14
1
0
21 Oct 2022
Contrastive Learning as Goal-Conditioned Reinforcement Learning
Benjamin Eysenbach
Tianjun Zhang
Ruslan Salakhutdinov
Sergey Levine
SSL
OffRL
25
139
0
15 Jun 2022
Overcoming the Spectral Bias of Neural Value Approximation
Ge Yang
Anurag Ajay
Pulkit Agrawal
32
25
0
09 Jun 2022
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
Scott Fujimoto
D. Meger
Doina Precup
Ofir Nachum
S. Gu
30
32
0
28 Jan 2022
Deep Q-learning: a robust control approach
B. Varga
Balázs Kulcsár
M. Chehreghani
OOD
22
9
0
21 Jan 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar
Rishabh Agarwal
Tengyu Ma
Aaron Courville
George Tucker
Sergey Levine
OffRL
31
65
0
09 Dec 2021
HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism
Zhiwei Xu
Yunpeng Bai
Bin Zhang
Dapeng Li
Guoliang Fan
22
23
0
14 Oct 2021
On the Estimation Bias in Double Q-Learning
Zhizhou Ren
Guangxiang Zhu
Haotian Hu
Beining Han
Jian-Hai Chen
Chongjie Zhang
16
17
0
29 Sep 2021
Convergent and Efficient Deep Q Network Algorithm
Zhikang T. Wang
Masahito Ueda
14
12
0
29 Jun 2021
Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment
Michael Chang
Sid Kaushik
Sergey Levine
Thomas L. Griffiths
28
8
0
28 Jun 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD)
C. Bowyer
14
1
0
15 Apr 2021
Regularized Behavior Value Estimation
Çağlar Gülçehre
Sergio Gomez Colmenarejo
Ziyun Wang
Jakub Sygnowski
T. Paine
Konrad Zolna
Yutian Chen
Matthew W. Hoffman
Razvan Pascanu
Nando de Freitas
OffRL
23
37
0
17 Mar 2021
Softmax Deep Double Deterministic Policy Gradients
Ling Pan
Qingpeng Cai
Longbo Huang
72
86
0
19 Oct 2020
GRAC: Self-Guided and Self-Regularized Actor-Critic
Lin Shao
Yifan You
Mengyuan Yan
Qingyun Sun
Jeannette Bohg
16
23
0
18 Sep 2020
Real-world Video Adaptation with Reinforcement Learning
Hongzi Mao
Shannon Chen
Drew Dimmery
Shaun Singh
Drew Blaisdell
Yuandong Tian
Mohammad Alizadeh
E. Bakshy
OffRL
8
76
0
28 Aug 2020
Learning Off-Policy with Online Planning
Harshit S. Sikchi
Wenxuan Zhou
David Held
OffRL
31
45
0
23 Aug 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang
50
134
0
25 Jun 2020
A Survey and Critique of Multiagent Deep Reinforcement Learning
Pablo Hernandez-Leal
Bilal Kartal
Matthew E. Taylor
OffRL
32
550
0
12 Oct 2018