ResearchTrend.AI
RL for Latent MDPs: Regret Guarantees and a Lower Bound

9 February 2021
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor

Papers citing "RL for Latent MDPs: Regret Guarantees and a Lower Bound"

26 / 26 papers shown
Situationally-Aware Dynamics Learning
Alejandro Murillo-Gonzalez
Lantao Liu
48
0
0
26 May 2025
Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
Chi Jin
Sham Kakade
A. Krishnamurthy
Qinghua Liu
68
65
0
22 Jun 2020
On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression
Jeongyeol Kwon
Nhat Ho
Constantine Caramanis
21
39
0
04 Jun 2020
EM Converges for a Mixture of Many Linear Regressions
Jeongyeol Kwon
Constantine Caramanis
22
38
0
28 May 2019
Provably efficient RL with Rich Observations via Latent State Decoding
S. Du
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
Miroslav Dudík
John Langford
OffRL
33
230
0
25 Jan 2019
On Oracle-Efficient PAC RL with Rich Observations
Christoph Dann
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire
27
98
0
01 Mar 2018
Markov Decision Processes with Continuous Side Information
Aditya Modi
Nan Jiang
Satinder Singh
Ambuj Tewari
OffRL
32
61
0
15 Nov 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
60
768
0
16 Mar 2017
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire
71
417
0
29 Oct 2016
On Context-Dependent Clustering of Bandits
Claudio Gentile
Shuai Li
Purushottam Kar
Alexandros Karatzoglou
Evans Etrue
Giovanni Zappella
34
138
0
06 Aug 2016
A PAC RL Algorithm for Episodic POMDPs
Z. Guo
Shayan Doroudi
Emma Brunskill
63
56
0
25 May 2016
Reinforcement Learning of POMDPs using Spectral Methods
Kamyar Azizzadenesheli
A. Lazaric
Anima Anandkumar
22
127
0
25 Feb 2016
Explore First, Exploit Next: The True Shape of Regret in Bandit Problems
Aurélien Garivier
Pierre Ménard
Gilles Stoltz
40
211
0
23 Feb 2016
Contextual Markov Decision Processes
Assaf Hallak
Dotan Di Castro
Shie Mannor
54
243
0
08 Feb 2015
Online Clustering of Bandits
Claudio Gentile
Shuai Li
Giovanni Zappella
47
264
0
31 Jan 2014
Sample Complexity of Multi-task Reinforcement Learning
Emma Brunskill
Lihong Li
45
138
0
26 Sep 2013
PEGASUS: A Policy Search Method for Large MDPs and POMDPs
A. Ng
Michael I. Jordan
42
496
0
16 Jan 2013
Tensor decompositions for learning latent variable models
Anima Anandkumar
Rong Ge
Daniel J. Hsu
Sham Kakade
Matus Telgarsky
222
1,142
0
29 Oct 2012
Predictive State Representations: A New Theory for Modeling Dynamical Systems
Satinder Singh
Michael R. James
Matthew R. Rudary
AI4TS
AI4CE
48
288
0
11 Jul 2012
Heuristic Search Value Iteration for POMDPs
Trey Smith
R. Simmons
67
545
0
11 Jul 2012
A Method of Moments for Mixture Models and Hidden Markov Models
Anima Anandkumar
Daniel J. Hsu
Sham Kakade
93
341
0
03 Mar 2012
Anytime Point-Based Approximations for Large POMDPs
Joelle Pineau
Geoffrey J. Gordon
Sebastian Thrun
61
418
0
30 Sep 2011
Perseus: Randomized Point-based Value Iteration for POMDPs
M. Spaan
N. Vlassis
79
766
0
09 Sep 2011
Closing the Learning-Planning Loop with Predictive State Representations
Byron Boots
S. Siddiqi
Geoffrey J. Gordon
177
264
0
12 Dec 2009
A Spectral Algorithm for Learning Hidden Markov Models
Daniel J. Hsu
Sham Kakade
Tong Zhang
106
309
0
26 Nov 2008
Online EM Algorithm for Latent Data Models
Olivier Cappé
Eric Moulines
95
479
0
27 Dec 2007