Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited

7 October 2020
O. D. Domingues
Pierre Ménard
E. Kaufmann
Michal Valko

Papers citing "Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited"

50 / 84 papers shown
Online Learning of Optimal Sequential Testing Policies
Qiyuan Chen
Raed Al Kontar
OffRL
92
0
0
03 Sep 2025
ORVIT: Near-Optimal Online Distributionally Robust Reinforcement Learning
Debamita Ghosh
George Atia
Yue Wang
OffRL, OOD
183
3
0
05 Aug 2025
Online Robust Multi-Agent Reinforcement Learning under Model Uncertainties
Zain Ulabedeen Farhat
Debamita Ghosh
George Atia
Yue Wang
106
1
0
04 Aug 2025
Statistical and Algorithmic Foundations of Reinforcement Learning
Yuejie Chi
Yuxin Chen
Yuting Wei
OffRL
161
2
0
19 Jul 2025
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
Jiachen Hu
Rui Ai
Han Zhong
Xiaoyu Chen
L. Wang
Zhaoran Wang
Zhuoran Yang
165
0
0
11 Jun 2025
When a Reinforcement Learning Agent Encounters Unknown Unknowns
Juntian Zhu
Miguel de Carvalho
Zhouwang Yang
Fengxiang He
227
0
0
19 May 2025
TW-CRL: Time-Weighted Contrastive Reward Learning for Efficient Inverse Reinforcement Learning
Yuxuan Li
Yicheng Gao
Ning Yang
Stephen Xia
OffRL
294
0
0
08 Apr 2025
Minimax Optimal Reinforcement Learning with Quasi-Optimism
International Conference on Learning Representations (ICLR), 2025
Harin Lee
Min-hwan Oh
OffRL
251
1
0
02 Mar 2025
A Refined Analysis of UCBVI
Simone Drago
Marco Mussi
Alberto Maria Metelli
287
0
0
24 Feb 2025
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL, OnRL
217
4
0
06 Nov 2024
Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
Neural Information Processing Systems (NeurIPS), 2024
Thanh Nguyen-Tang
Raman Arora
262
1
0
01 Nov 2024
Can we hop in general? A discussion of benchmark selection and design using the Hopper environment
C. Voelcker
Marcel Hussing
Eric Eaton
OffRL
244
6
0
11 Oct 2024
Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability
Neural Information Processing Systems (NeurIPS), 2024
Fan Chen
Dylan J. Foster
Yanjun Han
Jian Qian
Alexander Rakhlin
Yunbei Xu
204
3
0
07 Oct 2024
Finite-Sample Analysis of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning
Suei-Wen Chen
Keith Ross
Pierre Youssef
158
1
0
03 Oct 2024
State-free Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Mingyu Chen
Aldo Pacchiano
Xuezhou Zhang
191
0
0
27 Sep 2024
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
341
6
0
18 Jul 2024
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
D. Tiapkin
Evgenii Chzhen
Jean-Michel Poggi
264
1
0
08 Jul 2024
Oracle-Efficient Reinforcement Learning for Max Value Ensembles
Marcel Hussing
Michael Kearns
Aaron Roth
S. B. Sengupta
Jessica Sorrell
170
0
0
27 May 2024
Waypoint-Based Reinforcement Learning for Robot Manipulation Tasks
Shaunak A. Mehta
Soheil Habibian
Dylan P. Losey
SSL
197
6
0
20 Mar 2024
The Value of Reward Lookahead in Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Nadav Merlis
Dorian Baudry
Vianney Perchet
183
1
0
18 Mar 2024
Horizon-Free Regret for Linear Markov Decision Processes
Zihan Zhang
Jason D. Lee
Yuxin Chen
Simon S. Du
147
3
0
15 Mar 2024
Truly No-Regret Learning in Constrained MDPs
Adrian Müller
Pragnya Alatur
Volkan Cevher
Giorgia Ramponi
Niao He
281
15
0
24 Feb 2024
Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge
Meshal Alharbi
Mardavij Roozbehani
M. Dahleh
216
4
0
19 Dec 2023
The Effective Horizon Explains Deep RL Performance in Stochastic Environments
International Conference on Learning Representations (ICLR), 2023
Cassidy Laidlaw
Banghua Zhu
Stuart J. Russell
Anca Dragan
258
5
0
13 Dec 2023
Probabilistic Inference in Reinforcement Learning Done Right
Neural Information Processing Systems (NeurIPS), 2023
Jean Tarbouriech
Tor Lattimore
Brendan O'Donoghue
BDL, OffRL
198
9
0
22 Nov 2023
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
Neural Information Processing Systems (NeurIPS), 2023
Canzhe Zhao
Ruofeng Yang
Baoxiang Wang
Xuezhou Zhang
Shuai Li
181
4
0
14 Nov 2023
Towards Instance-Optimality in Online PAC Reinforcement Learning
Aymen Al Marjani
Andrea Tirinzoni
Emilie Kaufmann
OffRL
197
5
0
31 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Neural Information Processing Systems (NeurIPS), 2023
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
252
7
0
09 Oct 2023
Learning to Make Adherence-Aware Advice
International Conference on Learning Representations (ICLR), 2023
Guanting Chen
Xiaocheng Li
Chunlin Sun
Hanzhao Wang
116
15
0
01 Oct 2023
Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
Zihan Zhou
Honghao Wei
Lei Ying
OffRL
366
1
0
27 Sep 2023
Minimax Optimal Q Learning with Nearest Neighbors
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023
Puning Zhao
Lifeng Lai
OffRL
225
14
0
03 Aug 2023
Settling the Sample Complexity of Online Reinforcement Learning
Annual Conference Computational Learning Theory (COLT), 2023
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
618
34
0
25 Jul 2023
Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints
International Conference on Machine Learning (ICML), 2023
Donghao Li
Ruiquan Huang
Cong Shen
Jing Yang
198
4
0
09 Jun 2023
Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Hao Liang
Zhihui Luo
227
5
0
04 Jun 2023
Efficient Reinforcement Learning with Impaired Observability: Learning to Act with Delayed and Missing State Observations
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023
Minshuo Chen
Jie Meng
Yunru Bai
Yinyu Ye
H. Vincent Poor
Mengdi Wang
253
1
0
02 Jun 2023
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards
International Conference on Machine Learning (ICML), 2023
Yulian Wu
Xingyu Zhou
Sayak Ray Chowdhury
Haiyan Zhao
308
3
0
01 Jun 2023
Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Gen Li
Wenhao Zhan
Jason D. Lee
Yuejie Chi
Yuxin Chen
OffRL, OnRL
236
16
0
17 May 2023
Towards Theoretical Understanding of Inverse Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Alberto Maria Metelli
Filippo Lazzati
Marcello Restelli
151
19
0
25 Apr 2023
Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning
Annual Conference Computational Learning Theory (COLT), 2023
Gen Li
Yuling Yan
Yuxin Chen
Jianqing Fan
OffRL
236
15
0
14 Apr 2023
Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs
International Conference on Learning Representations (ICLR), 2023
Yuan Cheng
Ruiquan Huang
J. Yang
Yitao Liang
OffRL
190
9
0
20 Mar 2023
Fast Rates for Maximum Entropy Exploration
International Conference on Machine Learning (ICML), 2023
D. Tiapkin
Denis Belomestny
Daniele Calandriello
Eric Moulines
Rémi Munos
A. Naumov
Pierre Perrault
Yunhao Tang
Michal Valko
Pierre Menard
212
24
0
14 Mar 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
International Conference on Machine Learning (ICML), 2023
Runlong Zhou
Zihan Zhang
S. Du
240
16
0
31 Jan 2023
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents
International Conference on Machine Learning (ICML), 2023
Wenkun Xu
Ningyuan Chen
X. He
196
13
0
30 Jan 2023
Adversarial Online Multi-Task Reinforcement Learning
International Conference on Algorithmic Learning Theory (ALT), 2023
Quan Nguyen
Nishant A. Mehta
111
1
0
11 Jan 2023
Model-Free Reinforcement Learning with the Decision-Estimation Coefficient
Neural Information Processing Systems (NeurIPS), 2022
Dylan J. Foster
Noah Golowich
Jian Qian
Alexander Rakhlin
Ayush Sekhari
OffRL
181
12
0
25 Nov 2022
Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification
Neural Information Processing Systems (NeurIPS), 2022
Takumi Tanabe
Reimi Sato
Kazuto Fukuchi
Jun Sakuma
Youhei Akimoto
OffRL
214
14
0
07 Nov 2022
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
Journal of Machine Learning Research (JMLR), 2022
Hao Liang
Zhihui Luo
289
18
0
25 Oct 2022
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
International Conference on Machine Learning (ICML), 2022
Runlong Zhou
Ruosong Wang
S. Du
210
3
0
20 Oct 2022
Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration and Planning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Reda Ouhamma
D. Basu
Odalric-Ambrym Maillard
OffRL
158
12
0
05 Oct 2022
Square-root regret bounds for continuous-time episodic Markov decision processes
Mathematics of Operations Research (MOR), 2022
Ningyuan Chen
X. Zhou
252
6
0
03 Oct 2022