2407.03065
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
3 July 2024
Asaf B. Cassel
Aviv A. Rosenberg
Papers citing "Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes"
6 / 6 papers shown
Title
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
D. Tiapkin
Evgenii Chzhen
Gilles Stoltz
74
0
0
08 Jul 2024
Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs
Dongruo Zhou
Quanquan Gu
73
43
0
23 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
71
36
0
07 Dec 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
90
23
0
17 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
57
36
0
29 Jan 2021