Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

8 February 2021

Papers citing "Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature"

Title
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning | Subhojyoti Mukherjee, Josiah P. Hanna, Qiaomin Xie, Robert Nowak | 61 | 2 | 0 | 07 Jun 2024
Online Learning in Stackelberg Games with an Omniscient Follower | Geng Zhao, Banghua Zhu, Jiantao Jiao, Michael I. Jordan | 33 | 14 | 0 | 27 Jan 2023
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits | Gergely Neu, Julia Olkhovskaya, Matteo Papini, Ludovic Schwartz | 25 | 16 | 0 | 27 May 2022
Optimism in Reinforcement Learning with Generalized Linear Function Approximation | Yining Wang, Ruosong Wang, S. Du, A. Krishnamurthy | 127 | 135 | 0 | 09 Dec 2019
Input Convex Neural Networks | Brandon Amos, Lei Xu, J. Zico Kolter | 173 | 596 | 0 | 22 Sep 2016