Mirror Descent and the Information Ratio

25 September 2020

Papers citing "Mirror Descent and the Information Ratio"

Non-stationary Bandit Convex Optimization: A Comprehensive Study Xiaoqi Liu Dorian Baudry Julian Zimmert Patrick Rebeschini Arya Akhavan 74 0 0 03 Jun 2025
On the Problem of Best Arm Retention Houshuang Chen Yuchen He Chihao Zhang 81 0 0 16 Apr 2025
One Set to Rule Them All: How to Obtain General Chemical Conditions via Bayesian Optimization over Curried Functions Stefan P. Schmid Ella M. Rajaonson C. Ser Mohammad Haddadnia Shi Xuan Leong Alán Aspuru-Guzik Agustinus Kristiadi Kjell Jorner Felix Strieth-Kalthoff 112 0 0 26 Feb 2025
Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability Fan Chen Dylan J. Foster Yanjun Han Jian Qian Alexander Rakhlin Yunbei Xu 87 2 0 07 Oct 2024
Understanding Memory-Regret Trade-Off for Streaming Stochastic Multi-Armed Bandits Yuchen He Zichun Ye Chihao Zhang 94 3 0 30 May 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off Itai Shufaro Nadav Merlis Nir Weinberger Shie Mannor 199 0 0 26 May 2024
Regret Minimization via Saddle Point Optimization Johannes Kirschner Seyed Alireza Bakhtiari Kushagra Chandak Volodymyr Tkachuk Csaba Szepesvári 65 1 0 15 Mar 2024
Optimistic Information Directed Sampling Gergely Neu Matteo Papini Ludovic Schwartz 123 2 0 23 Feb 2024
Exploration by Optimization with Hybrid Regularizers: Logarithmic Regret with Adversarial Robustness in Partial Monitoring Taira Tsuchiya Shinji Ito Junya Honda 52 1 0 13 Feb 2024
Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning Ahmadreza Moradipari M. Pedramfar Modjtaba Shokrian Zini Vaneet Aggarwal 65 5 0 30 Oct 2023
Optimal Exploration is no harder than Thompson Sampling Zhaoqi Li Kevin Jamieson Lalit P. Jain 69 3 0 09 Oct 2023
Bayesian Design Principles for Frequentist Sequential Learning Yunbei Xu A. Zeevi 117 13 0 01 Oct 2023
On the Minimax Regret in Online Ranking with Top-k Feedback Mingyuan Zhang Ambuj Tewari 59 0 0 05 Sep 2023
Incentivizing Exploration with Linear Contexts and Combinatorial Actions Mark Sellke 62 4 0 03 Jun 2023
Synaptic Weight Distributions Depend on the Geometry of Plasticity Roman Pogodin Jonathan H. Cornford Arna Ghosh Gauthier Gidel Guillaume Lajoie Blake A. Richards 61 5 0 30 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load Dilip Arumugam Mark K. Ho Noah D. Goodman Benjamin Van Roy OffRL 86 8 0 05 May 2023
Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits Nived Rajaraman Yanjun Han Jiantao Jiao Kannan Ramchandran 94 2 0 12 Feb 2023
An Information-Theoretic Analysis of Nonstationary Bandit Learning Seungki Min Daniel Russo 89 7 0 09 Feb 2023
Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications Johannes Kirschner Tor Lattimore Andreas Krause 95 8 0 07 Feb 2023
On the Complexity of Adversarial Decision Making Dylan J. Foster Alexander Rakhlin Ayush Sekhari Karthik Sridharan AAML 79 29 0 27 Jun 2022
Regret Bounds for Information-Directed Reinforcement Learning Botao Hao Tor Lattimore OffRL 106 19 0 09 Jun 2022
Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning Dilip Arumugam Benjamin Van Roy OffRL 78 15 0 04 Jun 2022
Improved Algorithms for Bandit with Graph Feedback via Regret Decomposition Yuchen He Chihao Zhang 24 1 0 30 May 2022
Contextual Information-Directed Sampling Botao Hao Tor Lattimore Chao Qin 95 14 0 22 May 2022
Worst-case Performance of Greedy Policies in Bandits with Imperfect Context Observations Hongju Park Mohamad Kazem Shirani Faradonbeh OffRL 66 2 0 10 Apr 2022
Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret Tor Lattimore 52 16 0 22 Feb 2022
A PDE-Based Analysis of the Symmetric Two-Armed Bernoulli Bandit Vladimir A. Kobzar R. Kohn 69 4 0 11 Feb 2022
Efficient Algorithms for Learning to Control Bandits with Unobserved Contexts Hongju Park Mohamad Kazem Shirani Faradonbeh 43 6 0 02 Feb 2022
Gaussian Imagination in Bandit Learning Yueyang Liu Adithya M. Devraj Benjamin Van Roy Kuang Xu 103 7 0 06 Jan 2022
The Value of Information When Deciding What to Learn Dilip Arumugam Benjamin Van Roy 70 12 0 26 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning Tong Zhang 87 65 0 02 Oct 2021
Minimax Regret for Bandit Convex Optimisation of Ridge Functions Tor Lattimore 52 3 0 01 Jun 2021
Information Directed Sampling for Sparse Linear Bandits Botao Hao Tor Lattimore Wei Deng 62 19 0 29 May 2021
Reinforcement Learning, Bit by Bit Xiuyuan Lu Benjamin Van Roy Vikranth Dwaracherla M. Ibrahimi Ian Osband Zheng Wen 126 70 0 06 Mar 2021
A Bit Better? Quantifying Information for Bandit Learning Adithya M. Devraj Benjamin Van Roy Kuang Xu 50 5 0 18 Feb 2021
First-Order Bayesian Regret Analysis of Thompson Sampling Sébastien Bubeck Mark Sellke 91 17 0 02 Feb 2019