RL for Latent MDPs: Regret Guarantees and a Lower Bound

Neural Information Processing Systems (NeurIPS), 2021
9 February 2021
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
arXiv: 2102.04939

Papers citing "RL for Latent MDPs: Regret Guarantees and a Lower Bound"

50 / 64 papers shown
Representative Action Selection for Large Action Space: From Bandits to MDPs
Quan Zhou
Shie Mannor
101
0
0
27 Nov 2025
SAC-MoE: Reinforcement Learning with Mixture-of-Experts for Control of Hybrid Dynamical Systems with Uncertainty
Leroy D'Souza
Akash Karthikeyan
Yash Vardhan Pant
Sebastian Fischmeister
145
1
0
15 Nov 2025
Multi-Environment POMDPs: Discrete Model Uncertainty Under Partial Observability
Eline M. Bovy
Caleb Probine
Marnix Suilen
Ufuk Topcu
Nils Jansen
172
0
0
27 Oct 2025
To Distill or Decide? Understanding the Algorithmic Trade-off in Partially Observable Reinforcement Learning
Yuda Song
Dhruv Rohatgi
Aarti Singh
J. Andrew Bagnell
195
2
0
03 Oct 2025
SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer
Yarden As
Chengrui Qu
Benjamin Unger
Dongho Kang
Max van der Hart
Laixi Shi
Stelian Coros
Adam Wierman
Andreas Krause
OffRL
420
2
0
23 Sep 2025
Statistical Guarantees for Offline Domain Randomization
Arnaud Fickinger
Abderrahim Bendahi
Stuart J. Russell
OffRL
342
0
0
11 Jun 2025
Near-Optimal Clustering in Mixture of Markov Chains
Junghyun Lee
Yassir Jedra
Alexandre Proutière
Se-Young Yun
336
2
0
02 Jun 2025
Situationally-Aware Dynamics Learning
Alejandro Murillo-Gonzalez
Lantao Liu
390
0
0
26 May 2025
Model-based controller assisted domain randomization for transient vibration suppression of nonlinear powertrain system with parametric uncertainty
Heisei Yonezawa
Ansei Yonezawa
Itsuro Kajiwara
393
0
0
28 Apr 2025
Improving Controller Generalization with Dimensionless Markov Decision Processes
V. Charvet
Sebastian Stein
R. Murray-Smith
357
2
0
14 Apr 2025
A Classification View on Meta Learning Bandits
Mirco Mutti
Jeongyeol Kwon
Shie Mannor
Aviv Tamar
301
0
0
06 Apr 2025
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
International Conference on Learning Representations (ICLR), 2025
Yuheng Zhang
Nan Jiang
OffRL
302
5
0
03 Mar 2025
Learning to Cooperate with Humans using Generative Agents
Neural Information Processing Systems (NeurIPS), 2024
Yancheng Liang
Daphne Chen
Abhishek Gupta
S. Du
Natasha Jaques
SyDa
327
18
0
21 Nov 2024
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL
OnRL
307
6
0
06 Nov 2024
Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
Neural Information Processing Systems (NeurIPS), 2024
Thanh Nguyen-Tang
Raman Arora
444
1
0
01 Nov 2024
Test-Time Regret Minimization in Meta Reinforcement Learning
Mirco Mutti
Aviv Tamar
327
4
0
04 Jun 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
Jeongyeol Kwon
Shie Mannor
Constantine Caramanis
Yonathan Efroni
OffRL
444
6
0
03 Jun 2024
A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Vanshaj Khattar
Yuhao Ding
Bilgehan Sel
Javad Lavaei
Ming Jin
OffRL
309
23
0
26 May 2024
Pausing Policy Learning in Non-stationary Reinforcement Learning
Hyunin Lee
Ming Jin
Javad Lavaei
Somayeh Sojoudi
OffRL
256
3
0
25 May 2024
Preparing for Black Swans: The Antifragility Imperative for Machine Learning
Ming Jin
353
6
0
18 May 2024
DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning
Anthony Liang
Guy Tennenholtz
Chih-Wei Hsu
Yinlam Chow
Erdem Biyik
Craig Boutilier
OffRL
279
3
0
25 Feb 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
325
7
0
22 Feb 2024
Weakly Coupled Deep Q-Networks
Neural Information Processing Systems (NeurIPS), 2023
Ibrahim El Shar
Daniel R. Jiang
258
8
0
28 Oct 2023
Prospective Side Information for Latent MDPs
International Conference on Machine Learning (ICML), 2023
Jeongyeol Kwon
Yonathan Efroni
Shie Mannor
Constantine Caramanis
383
7
0
11 Oct 2023
Tempo Adaptation in Non-stationary Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Hyunin Lee
Yuhao Ding
Jongmin Lee
Ming Jin
Javad Lavaei
Somayeh Sojoudi
299
5
0
26 Sep 2023
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning
Kaiwen Wang
Junxiong Wang
Yueying Li
Nathan Kallus
Immanuel Trummer
Wen Sun
GP
437
3
0
21 Jul 2023
Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight
International Conference on Learning Representations (ICLR), 2023
Jiacheng Guo
Minshuo Chen
Haiquan Wang
Caiming Xiong
Mengdi Wang
Yu Bai
298
6
0
06 Jul 2023
Provably Efficient UCB-type Algorithms For Learning Predictive State Representations
International Conference on Learning Representations (ICLR), 2023
Ruiquan Huang
Yitao Liang
J. Yang
OffRL
427
6
0
01 Jul 2023
Context-lumpable stochastic bandits
Neural Information Processing Systems (NeurIPS), 2023
Chung-Wei Lee
Qinghua Liu
Yasin Abbasi-Yadkori
Chi Jin
Tor Lattimore
Csaba Szepesvári
OffRL
374
2
0
22 Jun 2023
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
International Conference on Machine Learning (ICML), 2023
Chengshuai Shi
Wei Xiong
Cong Shen
Jing Yang
OffRL
274
5
0
14 Jun 2023
Near-Optimal Partially Observable Reinforcement Learning with Partial Online State Information
Ming Shi
Yingbin Liang
Ness B. Shroff
404
3
0
14 Jun 2023
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
International Conference on Machine Learning (ICML), 2023
Yash Chandak
S. Thakoor
Z. Guo
Yunhao Tang
Rémi Munos
Will Dabney
Diana Borsa
348
6
0
01 May 2023
Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games
International Conference on Machine Learning (ICML), 2023
Dylan J. Foster
Noah Golowich
Sham Kakade
281
13
0
22 Mar 2023
POPGym: Benchmarking Partially Observable Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Steven D. Morad
Ryan Kortvelesy
Matteo Bettini
Stephan Liwicki
Amanda Prorok
OffRL
279
57
0
03 Mar 2023
Reinforcement Learning with History-Dependent Dynamic Contexts
International Conference on Machine Learning (ICML), 2023
Guy Tennenholtz
Nadav Merlis
Lior Shani
Martin Mladenov
Craig Boutilier
AI4CE
294
13
0
04 Feb 2023
Learning in POMDPs is Sample-Efficient with Hindsight Observability
International Conference on Machine Learning (ICML), 2023
Jonathan Lee
Alekh Agarwal
Christoph Dann
Tong Zhang
362
25
0
31 Jan 2023
Adversarial Online Multi-Task Reinforcement Learning
International Conference on Algorithmic Learning Theory (ALT), 2023
Quan Nguyen
Nishant A. Mehta
207
1
0
11 Jan 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
International Conference on Machine Learning (ICML), 2022
Yang Xu
Jin Zhu
C. Shi
Shuang Luo
R. Song
OffRL
361
24
0
29 Dec 2022
Offline Policy Evaluation and Optimization under Confounding
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Chinmaya Kausik
Yangyi Lu
Kevin Tan
Maggie Makar
Yixin Wang
Ambuj Tewari
OffRL
426
15
0
29 Nov 2022
Learning Mixtures of Markov Chains and MDPs
International Conference on Machine Learning (ICML), 2022
Chinmaya Kausik
Kevin Tan
Ambuj Tewari
326
13
0
17 Nov 2022
Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Mengdi Xu
Peide Huang
Yaru Niu
Visak C. V. Kumar
Jielin Qiu
...
Kuan-Hui Lee
Xuewei Qi
Henry Lam
Yue Liu
Ding Zhao
OOD
273
10
0
21 Oct 2022
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
International Conference on Machine Learning (ICML), 2022
Runlong Zhou
Ruosong Wang
S. Du
407
3
0
20 Oct 2022
Tractable Optimality in Episodic Latent MABs
Neural Information Processing Systems (NeurIPS), 2022
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
321
3
0
05 Oct 2022
Reward-Mixing MDPs with a Few Latent Contexts are Learnable
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
213
5
0
05 Oct 2022
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms
International Conference on Learning Representations (ICLR), 2022
Fan Chen
Yu Bai
Song Mei
345
25
0
29 Sep 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Neural Information Processing Systems (NeurIPS), 2022
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
507
25
0
26 Jul 2022
PAC Reinforcement Learning for Predictive State Representations
International Conference on Learning Representations (ICLR), 2022
Wenhao Zhan
Masatoshi Uehara
Wen Sun
Jason D. Lee
542
46
0
12 Jul 2022
On the Complexity of Adversarial Decision Making
Neural Information Processing Systems (NeurIPS), 2022
Dylan J. Foster
Alexander Rakhlin
Ayush Sekhari
Karthik Sridharan
AAML
286
32
0
27 Jun 2022
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
International Conference on Machine Learning (ICML), 2022
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
268
9
0
24 Jun 2022
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems
Neural Information Processing Systems (NeurIPS), 2022
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
OffRL
318
44
0
24 Jun 2022