ResearchTrend.AI


Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation


17 January 2025
Long-Fei Li
Yu-Jie Zhang
Peng Zhao
Zhi-Hua Zhou

Papers citing "Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation"

6 / 6 papers shown
Heavy-Tailed Linear Bandits: Huber Regression with One-Pass Update
Jing Wang
Yu-Jie Zhang
Peng Zhao
Zhi-Hua Zhou
41
0
0
01 Mar 2025
Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs
Long-Fei Li
Peng Zhao
Zhi-Hua Zhou
34
0
0
05 Nov 2024
Model-Based Reinforcement Learning with Multinomial Logistic Function Approximation
Taehyun Hwang
Min Hwan Oh
20
8
0
27 Dec 2022
A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
Zixiang Chen
C. J. Li
An Yuan
Quanquan Gu
Michael I. Jordan
OffRL
104
26
0
30 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits
Marc Abeille
Louis Faury
Clément Calauzènes
94
32
0
23 Oct 2020