Regret Bounds for Discounted MDPs

12 February 2020
Shuang Liu
H. Su
    OffRL

Papers citing "Regret Bounds for Discounted MDPs"

16 / 16 papers shown
Non-Stationary Restless Multi-Armed Bandits with Provable Guarantee
Yu-Heng Hung
Ping-Chun Hsieh
Kai Wang
113
0
0
14 Aug 2025
Reinforcement Learning from Multi-level and Episodic Human Feedback
Conference on Learning for Dynamics & Control (L4DC), 2025
Muhammad Qasim Elahi
Somtochukwu Oguchienti
Maheed H. Ahmed
Mahsa Ghasemi
OffRL
591
0
0
20 Apr 2025
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL, OnRL
307
6
0
06 Nov 2024
A Factored MDP Approach To Moving Target Defense With Dynamic Threat Modeling and Cost Efficiency
Megha Bose
P. Paruchuri
Akshat Kumar
AAML
194
0
0
16 Aug 2024
Reinforcement Learning for Infinite-Horizon Average-Reward Linear MDPs via Approximation by Discounted-Reward MDPs
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Kihyuk Hong
Yufan Zhang
Ambuj Tewari
Dabeen Lee
465
1
0
23 May 2024
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Neural Information Processing Systems (NeurIPS), 2023
Xiang Ji
Gen Li
OffRL
431
9
0
24 May 2023
Optimistic Planning by Regularized Dynamic Programming
International Conference on Machine Learning (ICML), 2023
Antoine Moulin
Gergely Neu
489
8
0
27 Feb 2023
No-regret Learning in Repeated First-Price Auctions with Budget Constraints
Rui Ai
Chang Wang
Chenchen Li
Jinshan Zhang
Wenhan Huang
Xiaotie Deng
277
14
0
29 May 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
International Conference on Algorithmic Learning Theory (ALT), 2022
Ian A. Kash
L. Reyzin
Zishun Yu
475
1
0
18 May 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
358
4
0
21 Apr 2022
Gap-Dependent Bounds for Two-Player Markov Games
Zehao Dou
Zhuoran Yang
Zhaoran Wang
S. Du
140
8
0
01 Jul 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Neural Information Processing Systems (NeurIPS), 2021
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
253
50
0
18 Jun 2021
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
Annual Conference Computational Learning Theory (COLT), 2020
Dongruo Zhou
Quanquan Gu
Csaba Szepesvári
334
228
0
15 Dec 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Neural Information Processing Systems (NeurIPS), 2020
Jiafan He
Dongruo Zhou
Quanquan Gu
497
47
0
01 Oct 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
International Conference on Machine Learning (ICML), 2020
Dongruo Zhou
Jiafan He
Quanquan Gu
439
142
0
23 Jun 2020
$Q$-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
416
72
0
16 Jun 2020