Convergent Tree Backup and Retrace with Function Approximation

25 May 2017
Ahmed Touati
Pierre-Luc Bacon
Doina Precup
Pascal Vincent

Papers citing "Convergent Tree Backup and Retrace with Function Approximation"

12 / 12 papers shown
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
44
0
0
07 May 2024
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning
Brett Daley
Martha White
Chris Amato
Marlos C. Machado
OffRL
32
3
0
26 Jan 2023
Importance Sampling Placement in Off-Policy Temporal-Difference Methods
Eric Graves
Sina Ghiassian
OffRL
49
2
0
18 Mar 2022
An Empirical Comparison of Off-policy Prediction Learning Algorithms in the Four Rooms Environment
Sina Ghiassian
R. Sutton
AAML
OffRL
26
6
0
10 Sep 2021
Convergent and Efficient Deep Q Network Algorithm
Zhikang T. Wang
Masahito Ueda
38
12
0
29 Jun 2021
An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task
Sina Ghiassian
R. Sutton
AAML
OffRL
29
5
0
02 Jun 2021
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Shuang Qiu
Zhuoran Yang
Xiaohan Wei
Jieping Ye
Zhaoran Wang
38
38
0
23 Aug 2020
Gradient Q(σ, λ): A Unified Algorithm with Function Approximation for Reinforcement Learning
Long Yang
Yu Zhang
Qian Zheng
Pengfei Li
Gang Pan
25
1
0
06 Sep 2019
Modified Actor-Critics
Erinc Merdivan
S. Hanke
Matthieu Geist
24
2
0
02 Jul 2019
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Qi Cai
Zhuoran Yang
Jason D. Lee
Zhaoran Wang
47
29
0
24 May 2019
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
48
471
0
14 Jun 2018
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
Jalaj Bhandari
Daniel Russo
Raghav Singal
58
335
0
06 Jun 2018