ResearchTrend.AI
arXiv: 1807.02373

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

6 July 2018
Ronan Fruit
Matteo Pirotta
A. Lazaric

Papers citing "Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes"

29 / 29 papers shown
Model Selection for Average Reward RL with Application to Utility Maximization in Repeated Games
Alireza Masoumian
James R. Wright
142
1
0
09 Nov 2024
Beyond Optimism: Exploration With Partially Observable Rewards
Simone Parisi
Alireza Kazemipour
Michael Bowling
OffRL
96
2
0
20 Jun 2024
Finding good policies in average-reward Markov Decision Processes without prior knowledge
Adrienne Tuynman
Rémy Degenne
Emilie Kaufmann
100
4
0
27 May 2024
Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs
M. Zurek
Yudong Chen
73
6
0
18 Mar 2024
Dealing with unbounded gradients in stochastic saddle-point optimization
Gergely Neu
Nneka Okolo
87
5
0
21 Feb 2024
A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
Mikael Henaff
Minqi Jiang
Roberta Raileanu
86
13
0
05 Jun 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
87
12
0
31 Jan 2023
Multi-Armed Bandits with Self-Information Rewards
Nir Weinberger
M. Yemini
22
4
0
06 Sep 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
Ian A. Kash
L. Reyzin
Zishun Yu
101
0
0
18 May 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
83
25
0
24 Mar 2022
Near-Optimal Randomized Exploration for Tabular Markov Decision Processes
Zhihan Xiong
Ruoqi Shen
Qiwen Cui
Maryam Fazel
S. Du
85
10
0
19 Feb 2021
Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation
Yue Wu
Dongruo Zhou
Quanquan Gu
62
21
0
15 Feb 2021
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
128
107
0
28 Sep 2020
Adaptive KL-UCB based Bandit Algorithms for Markovian and i.i.d. Settings
Arghyadip Roy
Sanjay Shakkottai
R. Srikant
46
2
0
14 Sep 2020
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
98
96
0
24 Jun 2020
A Survey of Reinforcement Learning Algorithms for Dynamically Varying Environments
Sindhu Padakandla
75
155
0
19 May 2020
Learning Algorithms for Minimizing Queue Length Regret
Thomas Stahlbuhk
B. Shrader
E. Modiano
17
2
0
11 May 2020
Tightening Exploration in Upper Confidence Reinforcement Learning
Hippolyte Bourel
Odalric-Ambrym Maillard
M. S. Talebi
71
31
0
20 Apr 2020
Conservative Exploration in Reinforcement Learning
Evrard Garcelon
Mohammad Ghavamzadeh
A. Lazaric
Matteo Pirotta
80
28
0
08 Feb 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning
Jean Tarbouriech
Evrard Garcelon
Michal Valko
Matteo Pirotta
A. Lazaric
105
46
0
07 Dec 2019
Performance Effectiveness of Multimedia Information Search Using the Epsilon-Greedy Algorithm
Nikki Lijing Kuang
C. Leung
25
8
0
22 Nov 2019
The Restless Hidden Markov Bandit with Linear Rewards and Side Information
M. Yemini
Amir Leshem
A. Somekh-Baruch
84
4
0
22 Oct 2019
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
168
108
0
15 Oct 2019
Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards
Falcon Z. Dai
Matthew R. Walter
32
6
0
03 Jul 2019
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
82
72
0
12 Jun 2019
Non-Stationary Reinforcement Learning: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
58
7
0
07 Jun 2019
Exploration-Exploitation Trade-off in Reinforcement Learning on Online Markov Decision Processes with Global Concave Rewards
Wang Chi Cheung
51
18
0
15 May 2019
Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes
Jian Qian
Ronan Fruit
Matteo Pirotta
A. Lazaric
47
10
0
11 Dec 2018
Regret Bounds for Reinforcement Learning via Markov Chain Concentration
R. Ortner
93
46
0
06 Aug 2018