A maximum-entropy approach to off-policy evaluation in average-reward MDPs

17 June 2020
N. Lazić
Dong Yin
Mehrdad Farajtabar
Nir Levine
Dilan Görür
Chris Harris
Dale Schuurmans
    OffRL

Papers citing "A maximum-entropy approach to off-policy evaluation in average-reward MDPs"

9 / 9 papers shown
Imitation Learning in Discounted Linear MDPs without exploration assumptions
International Conference on Machine Learning (ICML), 2024
Luca Viano
Stratis Skoulakis
Volkan Cevher
227
7
0
03 May 2024
What can online reinforcement learning with function approximation benefit from general coverage conditions?
International Conference on Machine Learning (ICML), 2023
Fanghui Liu
Luca Viano
Volkan Cevher
OffRL
223
4
0
25 Apr 2023
Proximal Point Imitation Learning
Neural Information Processing Systems (NeurIPS), 2022
Luca Viano
Angeliki Kamoutsi
Gergely Neu
Igor Krawczuk
Volkan Cevher
432
20
0
22 Sep 2022
Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan
Peter J. Ramadge
CML
FAtt
OffRL
148
2
0
06 Oct 2021
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm
Lin Chen
B. Scherrer
Peter L. Bartlett
OffRL
304
16
0
17 Mar 2021
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds
International Conference on Learning Representations (ICLR), 2021
Yihao Feng
Ziyang Tang
Na Zhang
Qiang Liu
OffRL
182
14
0
09 Mar 2021
Average-Reward Off-Policy Policy Evaluation with Function Approximation
International Conference on Machine Learning (ICML), 2021
Shangtong Zhang
Yi Wan
R. Sutton
Shimon Whiteson
OffRL
265
35
0
08 Jan 2021
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
Botao Hao
Yaqi Duan
Tor Lattimore
Csaba Szepesvári
Mengdi Wang
OffRL
289
28
0
08 Nov 2020
Online Sparse Reinforcement Learning
Botao Hao
Tor Lattimore
Csaba Szepesvári
Mengdi Wang
OffRL
591
30
0
08 Nov 2020