2002.05138
Regret Bounds for Discounted MDPs
12 February 2020
Shuang Liu
H. Su
OffRL
Papers citing
"Regret Bounds for Discounted MDPs"
16 / 16 papers shown
Non-Stationary Restless Multi-Armed Bandits with Provable Guarantee
Yu-Heng Hung
Ping-Chun Hsieh
Kai Wang
113
0
0
14 Aug 2025
Reinforcement Learning from Multi-level and Episodic Human Feedback
Conference on Learning for Dynamics & Control (L4DC), 2025
Muhammad Qasim Elahi
Somtochukwu Oguchienti
Maheed H. Ahmed
Mahsa Ghasemi
OffRL
591
0
0
20 Apr 2025
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL
OnRL
307
6
0
06 Nov 2024
A Factored MDP Approach To Moving Target Defense With Dynamic Threat Modeling and Cost Efficiency
Megha Bose
P. Paruchuri
Akshat Kumar
AAML
194
0
0
16 Aug 2024
Reinforcement Learning for Infinite-Horizon Average-Reward Linear MDPs via Approximation by Discounted-Reward MDPs
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Kihyuk Hong
Yufan Zhang
Ambuj Tewari
Dabeen Lee
465
1
0
23 May 2024
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Neural Information Processing Systems (NeurIPS), 2023
Xiang Ji
Gen Li
OffRL
431
9
0
24 May 2023
Optimistic Planning by Regularized Dynamic Programming
International Conference on Machine Learning (ICML), 2023
Antoine Moulin
Gergely Neu
489
8
0
27 Feb 2023
No-regret Learning in Repeated First-Price Auctions with Budget Constraints
Rui Ai
Chang Wang
Chenchen Li
Jinshan Zhang
Wenhan Huang
Xiaotie Deng
277
14
0
29 May 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
International Conference on Algorithmic Learning Theory (ALT), 2022
Ian A. Kash
L. Reyzin
Zishun Yu
475
1
0
18 May 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
358
4
0
21 Apr 2022
Gap-Dependent Bounds for Two-Player Markov Games
Zehao Dou
Zhuoran Yang
Zhaoran Wang
S. Du
140
8
0
01 Jul 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Neural Information Processing Systems (NeurIPS), 2021
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
253
50
0
18 Jun 2021
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
Annual Conference on Computational Learning Theory (COLT), 2020
Dongruo Zhou
Quanquan Gu
Csaba Szepesvári
334
228
0
15 Dec 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Neural Information Processing Systems (NeurIPS), 2020
Jiafan He
Dongruo Zhou
Quanquan Gu
497
47
0
01 Oct 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
International Conference on Machine Learning (ICML), 2020
Dongruo Zhou
Jiafan He
Quanquan Gu
439
142
0
23 Jun 2020
Q-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
416
72
0
16 Jun 2020