RL for Latent MDPs: Regret Guarantees and a Lower Bound

Neural Information Processing Systems (NeurIPS), 2021

9 February 2021

Jeongyeol Kwon

Yonathan Efroni

Constantine Caramanis

Shie Mannor

Papers citing "RL for Latent MDPs: Regret Guarantees and a Lower Bound"

Representative Action Selection for Large Action Space: From Bandits to MDPs

Quan Zhou

Shie Mannor

101

27 Nov 2025

SAC-MoE: Reinforcement Learning with Mixture-of-Experts for Control of Hybrid Dynamical Systems with Uncertainty

Leroy D'Souza

Akash Karthikeyan

Yash Vardhan Pant

Sebastian Fischmeister

158

15 Nov 2025

Multi-Environment POMDPs: Discrete Model Uncertainty Under Partial Observability

175

27 Oct 2025

To Distill or Decide? Understanding the Algorithmic Trade-off in Partially Observable Reinforcement Learning

205

03 Oct 2025

SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer

427

23 Sep 2025

Statistical Guarantees for Offline Domain Randomization

349

11 Jun 2025

Near-Optimal Clustering in Mixture of Markov Chains

342

02 Jun 2025

Situationally-Aware Dynamics Learning

Alejandro Murillo-Gonzalez

Lantao Liu

398

26 May 2025

Model-based controller assisted domain randomization for transient vibration suppression of nonlinear powertrain system with parametric uncertainty

Heisei Yonezawa

Ansei Yonezawa

Itsuro Kajiwara

399

28 Apr 2025

Improving Controller Generalization with Dimensionless Markov Decision Processes

V. Charvet

Sebastian Stein

R. Murray-Smith

359

14 Apr 2025

A Classification View on Meta Learning Bandits

304

06 Apr 2025

Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs

International Conference on Learning Representations (ICLR), 2025

Yuheng Zhang

Nan Jiang

OffRL

305

03 Mar 2025

Learning to Cooperate with Humans using Generative Agents

Neural Information Processing Systems (NeurIPS), 2024

332

21 Nov 2024

Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

312

06 Nov 2024

Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms

Neural Information Processing Systems (NeurIPS), 2024

Thanh Nguyen-Tang

Raman Arora

447

01 Nov 2024

Test-Time Regret Minimization in Meta Reinforcement Learning

Mirco Mutti

Aviv Tamar

335

04 Jun 2024

RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation

Jeongyeol Kwon

Shie Mannor

Constantine Caramanis

Yonathan Efroni

OffRL

452

03 Jun 2024

A CMDP-within-online framework for Meta-Safe Reinforcement Learning

Ming Jin

309

26 May 2024

Pausing Policy Learning in Non-stationary Reinforcement Learning

Hyunin Lee

Ming Jin

Javad Lavaei

Somayeh Sojoudi

OffRL

260

25 May 2024

Preparing for Black Swans: The Antifragility Imperative for Machine Learning

Ming Jin

359

18 May 2024

DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

284

25 Feb 2024

On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation

Yuheng Zhang

Nan Jiang

OffRL

328

22 Feb 2024

Weakly Coupled Deep Q-Networks

Neural Information Processing Systems (NeurIPS), 2023

Ibrahim El Shar

Daniel R. Jiang

263

28 Oct 2023

Prospective Side Information for Latent MDPs

International Conference on Machine Learning (ICML), 2023

Jeongyeol Kwon

Yonathan Efroni

Shie Mannor

Constantine Caramanis

386

11 Oct 2023

Tempo Adaptation in Non-stationary Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2023

Ming Jin

Somayeh Sojoudi

316

26 Sep 2023

JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning

445

21 Jul 2023

Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight

International Conference on Learning Representations (ICLR), 2023

Mengdi Wang

302

06 Jul 2023

Provably Efficient UCB-type Algorithms For Learning Predictive State Representations

International Conference on Learning Representations (ICLR), 2023

437

01 Jul 2023

Context-lumpable stochastic bandits

Neural Information Processing Systems (NeurIPS), 2023

385

22 Jun 2023

Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources

International Conference on Machine Learning (ICML), 2023

Chengshuai Shi

Wei Xiong

Cong Shen

Jing Yang

OffRL

280

14 Jun 2023

Near-Optimal Partially Observable Reinforcement Learning with Partial Online State Information

Ming Shi

Yingbin Liang

Ness B. Shroff

414

14 Jun 2023

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

International Conference on Machine Learning (ICML), 2023

360

01 May 2023

Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games

International Conference on Machine Learning (ICML), 2023

Dylan J. Foster

Noah Golowich

Sham Kakade

287

22 Mar 2023

POPGym: Benchmarking Partially Observable Reinforcement Learning

International Conference on Learning Representations (ICLR), 2023

Amanda Prorok

286

03 Mar 2023

Reinforcement Learning with History-Dependent Dynamic Contexts

International Conference on Machine Learning (ICML), 2023

297

04 Feb 2023

Learning in POMDPs is Sample-Efficient with Hindsight Observability

International Conference on Machine Learning (ICML), 2023

Jonathan Lee

Alekh Agarwal

Christoph Dann

Tong Zhang

374

31 Jan 2023

Adversarial Online Multi-Task Reinforcement Learning

International Conference on Algorithmic Learning Theory (ALT), 2023

Quan Nguyen

Nishant A. Mehta

211

11 Jan 2023

An Instrumental Variable Approach to Confounded Off-Policy Evaluation

International Conference on Machine Learning (ICML), 2022

365

29 Dec 2022

Offline Policy Evaluation and Optimization under Confounding

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

431

29 Nov 2022

Learning Mixtures of Markov Chains and MDPs

International Conference on Machine Learning (ICML), 2022

Chinmaya Kausik

Kevin Tan

Ambuj Tewari

338

17 Nov 2022

Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

...

Ding Zhao

285

21 Oct 2022

Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes

International Conference on Machine Learning (ICML), 2022

Runlong Zhou

Ruosong Wang

S. Du

429

20 Oct 2022

Tractable Optimality in Episodic Latent MABs

Neural Information Processing Systems (NeurIPS), 2022

Jeongyeol Kwon

Yonathan Efroni

Constantine Caramanis

Shie Mannor

323

05 Oct 2022

Reward-Mixing MDPs with a Few Latent Contexts are Learnable

Jeongyeol Kwon

Yonathan Efroni

Constantine Caramanis

Shie Mannor

213

05 Oct 2022

Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms

International Conference on Learning Representations (ICLR), 2022

Fan Chen

Yu Bai

Song Mei

352

29 Sep 2022

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs

Neural Information Processing Systems (NeurIPS), 2022

509

26 Jul 2022

PAC Reinforcement Learning for Predictive State Representations

International Conference on Learning Representations (ICLR), 2022

554

12 Jul 2022

On the Complexity of Adversarial Decision Making

Neural Information Processing Systems (NeurIPS), 2022

292

27 Jun 2022

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings

International Conference on Machine Learning (ICML), 2022

277

24 Jun 2022

Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems

Neural Information Processing Systems (NeurIPS), 2022

323

24 Jun 2022