Learning to Optimize Via Posterior Sampling

11 January 2013

Papers citing "Learning to Optimize Via Posterior Sampling"

26 / 26 papers shown

Title | Authors | Tag | Counts | Date
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments | Yun Qu, Wenjie Wang, Yixiu Mao, Yiqin Lv, Xiangyang Ji | TTA | 112 0 0 | 27 Apr 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context | Jianyu Xu, Qiuzhuang Sun, Yang Yang, Huadong Mo, Daoyi Dong | | 149 0 0 | 24 Feb 2025
Improved Regret Analysis in Gaussian Process Bandits: Optimality for Noiseless Reward, RKHS norm, and Non-Stationary Variance | S. Iwazaki, Shion Takeno | | 111 2 0 | 10 Feb 2025
Causal Discovery via Bayesian Optimization | Bao Duong, Sunil Gupta, Thin Nguyen | | 81 0 0 | 28 Jan 2025
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem | Nima Akbarzadeh, Erick Delage, Yossiri Adulyasak | | 100 0 0 | 30 Oct 2024
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding | Taiwo A. Adebiyi, Bach Do, Ruda Zhang | | 152 2 0 | 29 Oct 2024
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit | Junyu Cao, Ruijiang Gao, Esmaeil Keyvanshokooh | | 124 1 0 | 18 Oct 2024
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning | S. Samsonov, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov | | 63 5 0 | 26 May 2024
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback | Ruitao Chen, Liwei Wang | | 91 1 0 | 18 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces | Imad Aouali, Victor-Emmanuel Brunel, David Rohde, Anna Korba | OffRL | 90 5 0 | 22 Feb 2024
Multi-objective optimisation via the R2 utilities | Ben Tu, N. Kantas, Robert M. Lee, B. Shafei | | 323 3 0 | 19 May 2023
Truncated LinUCB for Stochastic Linear Bandits | Yanglei Song, Meng Zhou | | 140 0 0 | 23 Feb 2022
Safe Linear Thompson Sampling with Side Information | Ahmadreza Moradipari, Sanae Amani, M. Alizadeh, Christos Thrampoulidis | | 88 43 0 | 06 Nov 2019
Generalized Thompson Sampling for Contextual Bandits | Lihong Li | | 43 23 0 | 27 Oct 2013
Thompson Sampling for 1-Dimensional Exponential Family Bandits | N. Korda, E. Kaufmann, Rémi Munos | | 51 155 0 | 12 Jul 2013
Prior-free and prior-dependent regret bounds for Thompson Sampling | Sébastien Bubeck, Che-Yu Liu | | 71 94 0 | 21 Apr 2013
Linear Bandits in High Dimension and Recommendation Systems | Y. Deshpande, Andrea Montanari | OffRL | 54 71 0 | 08 Jan 2013
Kullback-Leibler upper confidence bounds for optimal sequential allocation | Olivier Cappé, Aurélien Garivier, Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz | | 86 394 0 | 03 Oct 2012
Further Optimal Regret Bounds for Thompson Sampling | Shipra Agrawal, Navin Goyal | | 92 443 0 | 15 Sep 2012
Thompson Sampling for Contextual Bandits with Linear Payoffs | Shipra Agrawal, Navin Goyal | | 133 993 0 | 15 Sep 2012
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis | E. Kaufmann, N. Korda, Rémi Munos | | 102 585 0 | 18 May 2012
Contextual Bandit Algorithms with Supervised Learning Guarantees | A. Beygelzimer, John Langford, Lihong Li, L. Reyzin, Robert Schapire | OffRL | 128 324 0 | 22 Feb 2010
X-Armed Bandits | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvari | | 123 383 0 | 25 Jan 2010
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design | Niranjan Srinivas, Andreas Krause, Sham Kakade, Matthias Seeger | | 131 1,616 0 | 21 Dec 2009
Linearly Parameterized Bandits | Paat Rusmevichientong, J. Tsitsiklis | | 206 558 0 | 18 Dec 2008
Multi-Armed Bandits in Metric Spaces | Robert D. Kleinberg, Aleksandrs Slivkins, E. Upfal | | 212 468 0 | 29 Sep 2008