Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize

23 November 2015
Huizhen Yu
arXiv: 1511.07471

Papers citing "Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize"

11 / 11 papers shown
Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
Zixuan Xie
Xinyu Liu
Rohan Chandra
Shangtong Zhang
47
0
0
27 May 2025
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
Wenzhuo Zhou
Ruoqing Zhu
Annie Qu
79
22
0
20 Oct 2021
A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken
Colin Daniels
Rosie Zhao
Anna M. Brandenberger
Prakash Panangaden
Doina Precup
36
0
0
03 Nov 2020
Distributed Value Function Approximation for Collaborative Multi-Agent Reinforcement Learning
M. Stanković
M. Beko
S. Stankovic
OffRL
10
16
0
18 Jun 2020
Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence
Abhishek Gupta
W. Haskell
13
5
0
25 Mar 2020
On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning
Huizhen Yu
OffRL
99
32
0
27 Dec 2017
On Generalized Bellman Equations and Temporal-Difference Learning
Huizhen Yu
A. R. Mahmood
R. Sutton
118
29
0
14 Apr 2017
Multi-step Off-policy Learning Without Importance Sampling Ratios
A. R. Mahmood
Huizhen Yu
R. Sutton
OffRL
143
54
0
09 Feb 2017
Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms
Huizhen Yu
44
2
0
06 May 2016
Two Timescale Stochastic Approximation with Controlled Markov noise and Off-policy temporal difference learning
Prasenjit Karmakar
S. Bhatnagar
89
27
0
31 Mar 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
103
272
0
14 Mar 2015