arXiv:2407.03065

Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes

3 July 2024
Asaf B. Cassel
Aviv A. Rosenberg

Papers citing "Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes"

Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
D. Tiapkin
Evgenii Chzhen
Gilles Stoltz
74
0
0
08 Jul 2024
Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs
Dongruo Zhou
Quanquan Gu
73
43
0
23 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
71
36
0
07 Dec 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
90
23
0
17 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
57
36
0
29 Jan 2021