arXiv:1511.07471, v3 (latest)
Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize
23 November 2015
Huizhen Yu
Papers citing
"Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize"
11 / 11 papers shown
Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
Zixuan Xie
Xinyu Liu
Rohan Chandra
Shangtong Zhang
47
0
0
27 May 2025
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
Wenzhuo Zhou
Ruoqing Zhu
Annie Qu
79
22
0
20 Oct 2021
A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken
Colin Daniels
Rosie Zhao
Anna M. Brandenberger
Prakash Panangaden
Doina Precup
36
0
0
03 Nov 2020
Distributed Value Function Approximation for Collaborative Multi-Agent Reinforcement Learning
M. Stanković
M. Beko
S. Stanković
OffRL
10
16
0
18 Jun 2020
Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence
Abhishek Gupta
W. Haskell
13
5
0
25 Mar 2020
On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning
Huizhen Yu
OffRL
99
32
0
27 Dec 2017
On Generalized Bellman Equations and Temporal-Difference Learning
Huizhen Yu
A. R. Mahmood
R. Sutton
118
29
0
14 Apr 2017
Multi-step Off-policy Learning Without Importance Sampling Ratios
A. R. Mahmood
Huizhen Yu
R. Sutton
OffRL
143
54
0
09 Feb 2017
Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms
Huizhen Yu
44
2
0
06 May 2016
Two Timescale Stochastic Approximation with Controlled Markov noise and Off-policy temporal difference learning
Prasenjit Karmakar
S. Bhatnagar
89
27
0
31 Mar 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
103
272
0
14 Mar 2015