ResearchTrend.AI
Learning to Optimize Via Posterior Sampling


11 January 2013
Daniel Russo
Benjamin Van Roy

Papers citing "Learning to Optimize Via Posterior Sampling"

26 / 26 papers shown
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments
Yun Qu
Wenjie Wang
Yixiu Mao
Yiqin Lv
Xiangyang Ji
TTA
112
0
0
27 Apr 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context
Jianyu Xu
Qiuzhuang Sun
Yang Yang
Huadong Mo
Daoyi Dong
149
0
0
24 Feb 2025
Improved Regret Analysis in Gaussian Process Bandits: Optimality for Noiseless Reward, RKHS norm, and Non-Stationary Variance
S. Iwazaki
Shion Takeno
111
2
0
10 Feb 2025
Causal Discovery via Bayesian Optimization
Bao Duong
Sunil Gupta
Thin Nguyen
81
0
0
28 Jan 2025
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem
Nima Akbarzadeh
Erick Delage
Yossiri Adulyasak
100
0
0
30 Oct 2024
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding
Taiwo A. Adebiyi
Bach Do
Ruda Zhang
152
2
0
29 Oct 2024
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit
Junyu Cao
Ruijiang Gao
Esmaeil Keyvanshokooh
124
1
0
18 Oct 2024
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
S. Samsonov
Eric Moulines
Qi-Man Shao
Zhuo-Song Zhang
Alexey Naumov
63
5
0
26 May 2024
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback
Ruitao Chen
Liwei Wang
91
1
0
18 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
90
5
0
22 Feb 2024
Multi-objective optimisation via the R2 utilities
Ben Tu
N. Kantas
Robert M. Lee
B. Shafei
323
3
0
19 May 2023
Truncated LinUCB for Stochastic Linear Bandits
Yanglei Song
Meng Zhou
140
0
0
23 Feb 2022
Safe Linear Thompson Sampling with Side Information
Ahmadreza Moradipari
Sanae Amani
M. Alizadeh
Christos Thrampoulidis
88
43
0
06 Nov 2019
Generalized Thompson Sampling for Contextual Bandits
Lihong Li
43
23
0
27 Oct 2013
Thompson Sampling for 1-Dimensional Exponential Family Bandits
N. Korda
E. Kaufmann
Rémi Munos
51
155
0
12 Jul 2013
Prior-free and prior-dependent regret bounds for Thompson Sampling
Sébastien Bubeck
Che-Yu Liu
71
94
0
21 Apr 2013
Linear Bandits in High Dimension and Recommendation Systems
Y. Deshpande
Andrea Montanari
OffRL
54
71
0
08 Jan 2013
Kullback-Leibler upper confidence bounds for optimal sequential allocation
Olivier Cappé
Aurélien Garivier
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz
86
394
0
03 Oct 2012
Further Optimal Regret Bounds for Thompson Sampling
Shipra Agrawal
Navin Goyal
92
443
0
15 Sep 2012
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
133
993
0
15 Sep 2012
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
102
585
0
18 May 2012
Contextual Bandit Algorithms with Supervised Learning Guarantees
A. Beygelzimer
John Langford
Lihong Li
L. Reyzin
Robert Schapire
OffRL
128
324
0
22 Feb 2010
X-Armed Bandits
Sébastien Bubeck
Rémi Munos
Gilles Stoltz
Csaba Szepesvari
123
383
0
25 Jan 2010
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
Niranjan Srinivas
Andreas Krause
Sham Kakade
Matthias Seeger
131
1,616
0
21 Dec 2009
Linearly Parameterized Bandits
Paat Rusmevichientong
J. Tsitsiklis
206
558
0
18 Dec 2008
Multi-Armed Bandits in Metric Spaces
Robert D. Kleinberg
Aleksandrs Slivkins
E. Upfal
212
468
0
29 Sep 2008