2102.04939
RL for Latent MDPs: Regret Guarantees and a Lower Bound
9 February 2021
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
Papers citing
"RL for Latent MDPs: Regret Guarantees and a Lower Bound"
26 / 26 papers shown
| Title | Authors | Tags | | | | Date |
|---|---|---|---|---|---|---|
| Situationally-Aware Dynamics Learning | Alejandro Murillo-Gonzalez, Lantao Liu | | 48 | 0 | 0 | 26 May 2025 |
| Sample-Efficient Reinforcement Learning of Undercomplete POMDPs | Chi Jin, Sham Kakade, A. Krishnamurthy, Qinghua Liu | | 68 | 65 | 0 | 22 Jun 2020 |
| On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression | Jeongyeol Kwon, Nhat Ho, Constantine Caramanis | | 21 | 39 | 0 | 04 Jun 2020 |
| EM Converges for a Mixture of Many Linear Regressions | Jeongyeol Kwon, Constantine Caramanis | | 22 | 38 | 0 | 28 May 2019 |
| Provably efficient RL with Rich Observations via Latent State Decoding | S. Du, A. Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford | OffRL | 33 | 230 | 0 | 25 Jan 2019 |
| On Oracle-Efficient PAC RL with Rich Observations | Christoph Dann, Nan Jiang, A. Krishnamurthy, Alekh Agarwal, John Langford, Robert Schapire | | 27 | 98 | 0 | 01 Mar 2018 |
| Markov Decision Processes with Continuous Side Information | Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari | OffRL | 32 | 61 | 0 | 15 Nov 2017 |
| Minimax Regret Bounds for Reinforcement Learning | M. G. Azar, Ian Osband, Rémi Munos | | 60 | 768 | 0 | 16 Mar 2017 |
| Contextual Decision Processes with Low Bellman Rank are PAC-Learnable | Nan Jiang, A. Krishnamurthy, Alekh Agarwal, John Langford, Robert Schapire | | 71 | 417 | 0 | 29 Oct 2016 |
| On Context-Dependent Clustering of Bandits | Claudio Gentile, Shuai Li, Purushottam Kar, Alexandros Karatzoglou, Evans Etrue, Giovanni Zappella | | 34 | 138 | 0 | 06 Aug 2016 |
| A PAC RL Algorithm for Episodic POMDPs | Z. Guo, Shayan Doroudi, Emma Brunskill | | 63 | 56 | 0 | 25 May 2016 |
| Reinforcement Learning of POMDPs using Spectral Methods | Kamyar Azizzadenesheli, A. Lazaric, Anima Anandkumar | | 22 | 127 | 0 | 25 Feb 2016 |
| Explore First, Exploit Next: The True Shape of Regret in Bandit Problems | Aurélien Garivier, Pierre Ménard, Gilles Stoltz | | 40 | 211 | 0 | 23 Feb 2016 |
| Contextual Markov Decision Processes | Assaf Hallak, Dotan Di Castro, Shie Mannor | | 54 | 243 | 0 | 08 Feb 2015 |
| Online Clustering of Bandits | Claudio Gentile, Shuai Li, Giovanni Zappella | | 47 | 264 | 0 | 31 Jan 2014 |
| Sample Complexity of Multi-task Reinforcement Learning | Emma Brunskill, Lihong Li | | 45 | 138 | 0 | 26 Sep 2013 |
| PEGASUS: A Policy Search Method for Large MDPs and POMDPs | A. Ng, Michael I. Jordan | | 42 | 496 | 0 | 16 Jan 2013 |
| Tensor decompositions for learning latent variable models | Anima Anandkumar, Rong Ge, Daniel J. Hsu, Sham Kakade, Matus Telgarsky | | 222 | 1,142 | 0 | 29 Oct 2012 |
| Predictive State Representations: A New Theory for Modeling Dynamical Systems | Satinder Singh, Michael R. James, Matthew R. Rudary | AI4TS, AI4CE | 48 | 288 | 0 | 11 Jul 2012 |
| Heuristic Search Value Iteration for POMDPs | Trey Smith, R. Simmons | | 67 | 545 | 0 | 11 Jul 2012 |
| A Method of Moments for Mixture Models and Hidden Markov Models | Anima Anandkumar, Daniel J. Hsu, Sham Kakade | | 93 | 341 | 0 | 03 Mar 2012 |
| Anytime Point-Based Approximations for Large POMDPs | Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun | | 61 | 418 | 0 | 30 Sep 2011 |
| Perseus: Randomized Point-based Value Iteration for POMDPs | M. Spaan, N. Vlassis | | 79 | 766 | 0 | 09 Sep 2011 |
| Closing the Learning-Planning Loop with Predictive State Representations | Byron Boots, S. Siddiqi, Geoffrey J. Gordon | | 177 | 264 | 0 | 12 Dec 2009 |
| A Spectral Algorithm for Learning Hidden Markov Models | Daniel J. Hsu, Sham Kakade, Tong Zhang | | 106 | 309 | 0 | 26 Nov 2008 |
| Online EM Algorithm for Latent Data Models | Olivier Cappé, Eric Moulines | | 95 | 479 | 0 | 27 Dec 2007 |