2101.02808
Average-Reward Off-Policy Policy Evaluation with Function Approximation
International Conference on Machine Learning (ICML), 2021
8 January 2021
Shangtong Zhang
Yi Wan
R. Sutton
Shimon Whiteson
OffRL
Papers citing
"Average-Reward Off-Policy Policy Evaluation with Function Approximation"
23 / 23 papers shown
Hardware-Software Collaborative Computing of Photonic Spiking Reinforcement Learning for Robotic Continuous Control
Mengting Yu
Shuiying Xiang
Changjian Xie
Yonghang Chen
Haowen Zhao
Xingxing Guo
Yahui Zhang
Yanan Han
Yue Hao
82
0
0
29 Nov 2025
Towards Formalizing Reinforcement Learning Theory
Shangtong Zhang
120
0
0
05 Nov 2025
Non-iid hypothesis testing: from classical to quantum
Giacomo De Palma
Marco Fanizza
Connor Mowry
Ryan O'Donnell
96
0
0
07 Oct 2025
Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
Zixuan Xie
Xinyu Liu
Rohan Chandra
Shangtong Zhang
373
1
0
27 May 2025
Towards Optimal Offline Reinforcement Learning
Mengmeng Li
Daniel Kuhn
Tobias Sutter
OffRL
319
1
0
15 Mar 2025
Average Reward Reinforcement Learning for Wireless Radio Resource Management
Asilomar Conference on Signals, Systems and Computers (ACSSC), 2024
Kun Yang
Jing Yang
Cong Shen
219
2
0
12 Jan 2025
Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes
Juan Sebastian Rojas
Chi-Guhn Lee
279
2
0
14 Oct 2024
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
International Conference on Machine Learning (ICML), 2024
Yukinari Hisaki
Isao Ono
166
4
0
04 Aug 2024
e-COP : Episodic Constrained Optimization of Policies
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Sahil Singla
OffRL
232
1
0
13 Jun 2024
Transformable Gaussian Reward Function for Socially-Aware Navigation with Deep Reinforcement Learning
Jinyeob Kim
Sumin Kang
Sungwoo Yang
Beomjoon Kim
Jargalbaatar Yura
Donghan Kim
803
2
0
22 Feb 2024
Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation
Neural Information Processing Systems (NeurIPS), 2023
Efstathia Soufleri
Jian Li
237
17
0
03 Oct 2023
Infer and Adapt: Bipedal Locomotion Reward Learning from Demonstrations via Inverse Reinforcement Learning
IEEE International Conference on Robotics and Automation (ICRA), 2023
Chao Liu
Zhaoyuan Gu
Hanran Wu
Deniz Irem Erus
Ye Zhao
230
10
0
28 Sep 2023
A new Gradient TD Algorithm with only One Step-size: Convergence Rate Analysis using L-λ Smoothness
Hengshuai Yao
285
3
0
29 Jul 2023
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
International Conference on Machine Learning (ICML), 2023
Brahma S. Pavse
M. Zurek
Yudong Chen
Qiaomin Xie
Josiah P. Hanna
OffRL
359
2
0
02 Jun 2023
Model-Free Robust Average-Reward Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Yue Wang
Alvaro Velasquez
George Atia
Ashley Prater-Bennette
Shaofeng Zou
210
22
0
17 May 2023
Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms
Neural Information Processing Systems (NeurIPS), 2023
Yashaswini Murthy
Mehrdad Moharrami
R. Srikant
OffRL
241
7
0
02 Feb 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
International Conference on Machine Learning (ICML), 2023
Akhil Agnihotri
R. Jain
Haipeng Luo
560
2
0
02 Feb 2023
Markovian Interference in Experiments
Neural Information Processing Systems (NeurIPS), 2022
Vivek F. Farias
Andrew A. Li
Tianyi Peng
Andrew Zheng
OffRL
163
41
0
06 Jun 2022
Stochastic first-order methods for average-reward Markov decision processes
Mathematics of Operations Research (MOR), 2022
Tianjiao Li
Feiyang Wu
Guanghui Lan
493
23
0
11 May 2022
Average-Reward Learning and Planning with Options
Yi Wan
A. Naik
R. Sutton
90
10
0
26 Oct 2021
Average-Reward Reinforcement Learning with Trust Region Methods
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Xiaoteng Ma
Xiao-Jing Tang
Li Xia
Jun Yang
Qianchuan Zhao
207
23
0
07 Jun 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Neural Information Processing Systems (NeurIPS), 2021
Ming Yin
Yu Wang
OffRL
277
19
0
13 May 2021
Breaking the Deadly Triad with a Target Network
International Conference on Machine Learning (ICML), 2021
Shangtong Zhang
Hengshuai Yao
Shimon Whiteson
AAML
734
56
0
21 Jan 2021