A Theoretical Analysis of Deep Q-Learning


1 January 2019
Jianqing Fan
Zhuoran Yang
Yuchen Xie
Zhaoran Wang

Papers citing "A Theoretical Analysis of Deep Q-Learning"

50 / 79 papers shown
Embodied Intelligence: The Key to Unblocking Generalized Artificial Intelligence
Jinhao Jiang
Changlin Chen
Shile Feng
Wanru Geng
Zesheng Zhou
Ni Wang
Shuai Li
Feng-Qi Cui
Erbao Dong
AI4CE
31
0
0
11 May 2025
Universal Approximation Theorem of Deep Q-Networks
Qian Qi
37
1
0
04 May 2025
Approximation to Deep Q-Network by Stochastic Delay Differential Equations
Jianya Lu
Yingjun Mo
33
0
0
01 May 2025
Low-altitude Friendly-Jamming for Satellite-Maritime Communications via Generative AI-enabled Deep Reinforcement Learning
Jiawei Huang
Aimin Wang
Geng Sun
Jiahui Li
Jiacheng Wang
Dusit Niyato
Victor C. M. Leung
57
0
0
28 Jan 2025
Game Theory and Multi-Agent Reinforcement Learning: From Nash Equilibria to Evolutionary Dynamics
Neil De La Fuente
Miquel Noguer i Alonso
Guim Casadellà
36
0
0
31 Dec 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
49
3
0
24 Oct 2024
Process Reward Model with Q-Value Rankings
W. Li
Yixuan Li
LRM
59
15
0
15 Oct 2024
Deflated Dynamics Value Iteration
Jongmin Lee
Amin Rakhsha
Ernest K. Ryu
Amir-massoud Farahmand
40
2
0
15 Jul 2024
Simplifying Deep Temporal Difference Learning
Matteo Gallici
Mattie Fellows
Benjamin Ellis
B. Pou
Ivan Masmitja
Jakob Foerster
Mario Martin
OffRL
62
15
0
05 Jul 2024
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
37
0
0
07 May 2024
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo
Laixi Shi
Gauri Joshi
Yuejie Chi
OffRL
29
3
0
08 Feb 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
38
2
0
26 Jan 2024
BET: Explaining Deep Reinforcement Learning through The Error-Prone Decisions
Xiao Liu
Jie Zhao
Wubing Chen
Mao Tan
Yongxin Su
OffRL
FAtt
33
0
0
14 Jan 2024
Neural Network Approximation for Pessimistic Offline Reinforcement Learning
Di Wu
Yuling Jiao
Li Shen
Haizhao Yang
Xiliang Lu
OffRL
29
1
0
19 Dec 2023
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL
OOD
78
5
0
13 Dec 2023
Fitted Value Iteration Methods for Bicausal Optimal Transport
Erhan Bayraktar
Bingyan Han
OT
29
6
0
22 Jun 2023
Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach
Dong-hwan Lee
21
2
0
09 Jun 2023
High-probability sample complexities for policy evaluation with linear function approximation
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
30
6
0
30 May 2023
An Offline Time-aware Apprenticeship Learning Framework for Evolving Reward Functions
Xi Yang
Ge Gao
Min Chi
OffRL
29
2
0
15 May 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
Yulai Zhao
Zhuoran Yang
Zhaoran Wang
Jason D. Lee
43
3
0
08 May 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
24
0
0
25 Feb 2023
Why Target Networks Stabilise Temporal Difference Methods
Matt Fellows
Matthew Smith
Shimon Whiteson
OOD
AAML
21
7
0
24 Feb 2023
Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes
Di Wang
Yao Wang
Shaojie Tang
OffRL
21
1
0
21 Feb 2023
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
Jihao Long
Jiequn Han
27
5
0
20 Feb 2023
Distillation Policy Optimization
Jianfei Ma
OffRL
26
1
0
01 Feb 2023
Operator Splitting Value Iteration
Amin Rakhsha
Andrew Wang
Mohammad Ghavamzadeh
Amir-massoud Farahmand
OffRL
33
7
0
25 Nov 2022
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Andrea Zanette
OffRL
18
14
0
10 Nov 2022
Can maker-taker fees prevent algorithmic cooperation in market making?
Bingyan Han
40
1
0
01 Nov 2022
Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments
Mengxin Yu
Zhuoran Yang
Jianqing Fan
OffRL
21
8
0
23 Aug 2022
Robust Knowledge Adaptation for Dynamic Graph Neural Networks
Han Li
Changsheng Li
Kaituo Feng
Ye Yuan
Guoren Wang
H. Zha
34
13
0
22 Jul 2022
q-Learning in Continuous Time
Yanwei Jia
X. Zhou
OffRL
51
68
0
02 Jul 2022
Analysis of Stochastic Processes through Replay Buffers
Shirli Di-Castro Shashua
Shie Mannor
Dotan Di-Castro
36
6
0
26 Jun 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
38
5
0
01 Jun 2022
CoNSoLe: Convex Neural Symbolic Learning
Haoran Li
Yang Weng
Hanghang Tong
27
9
0
01 Jun 2022
Pervasive Machine Learning for Smart Radio Environments Enabled by Reconfigurable Intelligent Surfaces
G. C. Alexandropoulos
Kyriakos Stylianopoulos
Chongwen Huang
Chau Yuen
M. Bennis
Mérouane Debbah
25
87
0
08 May 2022
Chemoreception and chemotaxis of a three-sphere swimmer
S. Paz
R. Ausas
J. P. Carbajal
G. Buscaglia
13
4
0
05 May 2022
Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning
Jingqi Li
Donggun Lee
Somayeh Sojoudi
Claire Tomlin
13
11
0
18 Mar 2022
Target Network and Truncation Overcome The Deadly Triad in $Q$-Learning
Zaiwei Chen
John-Paul Clarke
S. T. Maguluri
18
19
0
05 Mar 2022
Testing Stationarity and Change Point Detection in Reinforcement Learning
Mengbing Li
C. Shi
Zhanghua Wu
Piotr Fryzlewicz
OffRL
42
9
0
03 Mar 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
C. Shi
Jin Zhu
Ye Shen
Shuang Luo
Hong Zhu
R. Song
OffRL
28
30
0
22 Feb 2022
Understanding Value Decomposition Algorithms in Deep Cooperative Multi-Agent Reinforcement Learning
Zehao Dou
J. Kuba
Yaodong Yang
FAtt
22
5
0
10 Feb 2022
Deep Q-learning: a robust control approach
B. Varga
Balázs Kulcsár
M. Chehreghani
OOD
30
9
0
21 Jan 2022
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?
Han Zhong
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
29
30
0
27 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
27
167
0
08 Dec 2021
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
27
4
0
29 Nov 2021
The Impact of Data Distribution on Q-learning with Function Approximation
Pedro P. Santos
Diogo S. Carvalho
A. Sardinha
Francisco S. Melo
OffRL
19
2
0
23 Nov 2021
Perturbational Complexity by Distribution Mismatch: A Systematic Analysis of Reinforcement Learning in Reproducing Kernel Hilbert Space
Jihao Long
Jiequn Han
29
6
0
05 Nov 2021
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
Wenzhuo Zhou
Ruoqing Zhu
Annie Qu
32
22
0
20 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
45
50
0
09 Oct 2021
Optimal policy evaluation using kernel-based temporal difference methods
Yaqi Duan
Mengdi Wang
Martin J. Wainwright
OffRL
28
26
0
24 Sep 2021