Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning

22 March 2017
Christoph Dann
Tor Lattimore
Emma Brunskill

Papers citing "Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning"

50 / 229 papers shown
What Fundamental Structure in Reward Functions Enables Efficient Sparse-Reward Learning?
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
OffRL
189
1
0
04 Sep 2025
ORVIT: Near-Optimal Online Distributionally Robust Reinforcement Learning
Debamita Ghosh
George Atia
Yue Wang
OffRL, OOD
440
3
0
05 Aug 2025
Probably Approximately Correct Causal Discovery
Mian Wei
S. Jha
David Page
CML
155
0
0
25 Jul 2025
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
Jiachen Hu
Rui Ai
Han Zhong
Xiaoyu Chen
L. Wang
Zhaoran Wang
Zhuoran Yang
249
0
0
11 Jun 2025
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
International Conference on Learning Representations (ICLR), 2025
Yu-Heng Hung
Kai-Jie Lin
Yu-Heng Lin
Chien-Yi Wang
Cheng Sun
Ping-Chun Hsieh
381
6
0
28 May 2025
An Optimistic Algorithm for online CMDPS with Anytime Adversarial Constraints
Jiahui Zhu
Kihyun Yu
Dabeen Lee
Xin Liu
Honghao Wei
263
1
0
28 May 2025
Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2025
Shijie Liu
Andrew C. Cullen
Paul Montague
S. Erfani
Benjamin I. P. Rubinstein
OffRL, AAML
293
4
0
27 May 2025
Deep Actor-Critics with Tight Risk Certificates
Bahareh Tasdighi
Manuel Haussmann
Yi-Shan Wu
A. Masegosa
M. Kandemir
UQCV
543
0
0
26 May 2025
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL, OnRL
580
4
0
16 May 2025
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
473
12
0
29 Apr 2025
Towards Optimal Differentially Private Regret Bounds in Linear MDPs
Sharan Sahu
506
0
0
12 Apr 2025
Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
Qian Zuo
Fengxiang He
384
0
0
07 Apr 2025
Near-Optimal Sample Complexity for Iterated CVaR Reinforcement Learning with a Generative Model
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Zilong Deng
Simon Khan
Shaofeng Zou
604
2
0
11 Mar 2025
Minimax Optimal Reinforcement Learning with Quasi-Optimism
International Conference on Learning Representations (ICLR), 2025
Harin Lee
Min-hwan Oh
OffRL
420
2
0
02 Mar 2025
Near-Optimal Reinforcement Learning with Shuffle Differential Privacy
Shaojie Bai
Mohammad Sadegh Talebi
Chengcheng Zhao
Peng Cheng
Jiming Chen
OffRL
517
0
0
18 Nov 2024
Individual Regret in Cooperative Stochastic Multi-Armed Bandits
Idan Barnea
Tal Lancewicki
Yishay Mansour
205
1
0
10 Nov 2024
Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
Neural Information Processing Systems (NeurIPS), 2024
Thanh Nguyen-Tang
Raman Arora
446
1
0
01 Nov 2024
Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification
Neural Information Processing Systems (NeurIPS), 2024
Haolin Liu
Artin Tajdini
Andrew Wagenmaker
Chen-Yu Wei
556
3
0
10 Oct 2024
State-free Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Mingyu Chen
Aldo Pacchiano
Xuezhou Zhang
368
0
0
27 Sep 2024
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
International Conference on Learning Representations (ICLR), 2024
Toshinori Kitamura
Tadashi Kozuno
Wataru Kumagai
Kenta Hoshino
Y. Hosoe
Kazumi Kasaura
Masashi Hamaya
Paavo Parmas
Yutaka Matsuo
747
8
0
29 Aug 2024
Satisficing Exploration for Deep Reinforcement Learning
Dilip Arumugam
Saurabh Kumar
Ramki Gummadi
Benjamin Van Roy
287
3
0
16 Jul 2024
Learning to Steer Markovian Agents under Model Uncertainty
Jiawei Huang
Vinzenz Thoma
Zebang Shen
H. Nax
Niao He
518
4
0
14 Jul 2024
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
D. Tiapkin
Evgenii Chzhen
Jean-Michel Poggi
389
1
0
08 Jul 2024
Fast Rates for Bandit PAC Multiclass Classification
Liad Erez
Alon Cohen
Tomer Koren
Yishay Mansour
Shay Moran
323
4
0
18 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
464
3
0
11 Jun 2024
Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes
Dan Qiao
Kaiqi Zhang
Esha Singh
Daniel Soudry
Yu-Xiang Wang
NoLa
351
9
0
10 Jun 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Xutong Liu
Siwei Wang
Jinhang Zuo
Han Zhong
Xuchuang Wang
Zhiyong Wang
Shuai Li
Mohammad Hajiesmaili
J. C. Lui
Wei Chen
543
9
0
03 Jun 2024
Differentially Private Reinforcement Learning with Self-Play
Dan Qiao
Yu Wang
286
0
0
11 Apr 2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
Neural Information Processing Systems (NeurIPS), 2024
Miao Lu
Han Zhong
Tong Zhang
Jose H. Blanchet
OffRL, OOD
289
22
0
04 Apr 2024
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
Srinjoy Roy
Swagatam Das
358
0
0
31 Mar 2024
Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks
Ziping Xu
Zifan Xu
Runxuan Jiang
Peter Stone
Ambuj Tewari
431
2
0
03 Mar 2024
Truly No-Regret Learning in Constrained MDPs
Adrian Müller
Pragnya Alatur
Volkan Cevher
Giorgia Ramponi
Niao He
446
17
0
24 Feb 2024
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
Zihao Li
Boyi Liu
Zhuoran Yang
Zhaoran Wang
Mengdi Wang
344
2
0
16 Feb 2024
TransAxx: Efficient Transformers with Approximate Computing
Dimitrios Danopoulos
Georgios Zervakis
Dimitrios Soudris
Jörg Henkel
ViT
377
7
0
12 Feb 2024
Sample Complexity Characterization for Linear Contextual MDPs
Junze Deng
Yuan Cheng
Shaofeng Zou
Yingbin Liang
245
6
0
05 Feb 2024
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
Dan Qiao
Yu Wang
OffRL
332
5
0
02 Feb 2024
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
Toshinori Kitamura
Tadashi Kozuno
Masahiro Kato
Yuki Ichihara
Soichiro Nishimori
Akiyoshi Sannai
Sho Sonoda
Wataru Kumagai
Yutaka Matsuo
568
5
0
31 Jan 2024
Behind the Myth of Exploration in Policy Gradients
Adrien Bolland
Gaspard Lambrechts
Damien Ernst
474
3
0
31 Jan 2024
Cascading Reinforcement Learning
International Conference on Learning Representations (ICLR), 2024
Yihan Du
R. Srikant
Wei Chen
315
2
0
17 Jan 2024
Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge
Meshal Alharbi
Mardavij Roozbehani
M. Dahleh
337
4
0
19 Dec 2023
Accelerating Exploration with Unlabeled Prior Data
Qiyang Li
Jason Zhang
Dibya Ghosh
Amy Zhang
Sergey Levine
OffRL, OnRL
472
18
0
09 Nov 2023
A Doubly Robust Approach to Sparse Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Wonyoung Hedge Kim
Garud Iyengar
A. Zeevi
246
5
0
23 Oct 2023
Learning to Make Adherence-Aware Advice
International Conference on Learning Representations (ICLR), 2023
Guanting Chen
Xiaocheng Li
Chunlin Sun
Hanzhao Wang
279
15
0
01 Oct 2023
Pure Exploration under Mediators' Feedback
Riccardo Poiani
Alberto Maria Metelli
Marcello Restelli
264
1
0
29 Aug 2023
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
International Conference on Machine Learning (ICML), 2023
Songtao Feng
Ming Yin
Yu Wang
J. Yang
Yitao Liang
190
1
0
17 Aug 2023
Dyadic Reinforcement Learning
Shuangning Li
L. Niell
S. Choi
Inbal Nahum-Shani
Guy Shani
Susan Murphy
OffRL
260
2
0
15 Aug 2023
Provably Efficient Algorithm for Nonstationary Low-Rank MDPs
Neural Information Processing Systems (NeurIPS), 2023
Yuan Cheng
J. Yang
Yitao Liang
OOD
256
1
0
10 Aug 2023
Settling the Sample Complexity of Online Reinforcement Learning
Annual Conference Computational Learning Theory (COLT), 2023
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
884
42
0
25 Jul 2023
Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty
Guanlin Liu
Zhihan Zhou
Han Liu
Lifeng Lai
388
5
0
15 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Neural Information Processing Systems (NeurIPS), 2023
Ruiqi Zhang
Andrea Zanette
OffRL, OnRL
343
11
0
10 Jul 2023