arXiv: 1807.07623 (v6, latest)
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits
19 July 2018
Julian Zimmert
Yevgeny Seldin
AAML
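For context on the cited paper: its core update is online mirror descent over the probability simplex with a Tsallis-entropy regularizer at α = 1/2, using importance-weighted loss estimates. The sketch below is an illustrative reimplementation under stated assumptions, not the authors' code; in particular the learning-rate schedule is taken as η_t ∝ 1/√t (the constant here is a placeholder, the paper's analysis fixes its own), and the Newton solver for the normalization constant follows the standard derivation w_i = 4/(η(L̂_i − x))².

```python
import numpy as np

def tsallis_weights(eta, cum_loss, iters=50, tol=1e-12):
    """Sampling distribution for Tsallis-INF (alpha = 1/2): find the
    normalization x < min(cum_loss) such that w_i = 4 / (eta*(L_i - x))^2
    sums to one, via Newton's method."""
    # Start where the best arm alone would get weight 1, so sum(w) - 1 >= 0.
    x = cum_loss.min() - 2.0 / eta
    for _ in range(iters):
        diff = eta * (cum_loss - x)          # positive while x stays below the minimum
        w = 4.0 / diff ** 2
        f = w.sum() - 1.0
        if abs(f) < tol:
            break
        # f is increasing and convex in x, so Newton from f >= 0 converges
        # monotonically without crossing the singularity at x = min(cum_loss).
        x -= f / (8.0 * eta / diff ** 3).sum()
    return w / w.sum()

def tsallis_inf(sample_loss, n_arms, horizon, rng):
    """Play Tsallis-INF with unbiased importance-weighted loss estimates."""
    cum_loss = np.zeros(n_arms)
    for t in range(1, horizon + 1):
        eta = 1.0 / np.sqrt(t)               # eta_t ~ 1/sqrt(t); constant is a placeholder
        w = tsallis_weights(eta, cum_loss)
        arm = rng.choice(n_arms, p=w)
        cum_loss[arm] += sample_loss(arm) / w[arm]   # importance-weighted estimate
    return cum_loss
```

On a two-armed Bernoulli instance the estimated cumulative losses separate quickly and the sampling weights concentrate on the better arm, which is the "best of both worlds" behavior the paper proves holds simultaneously in stochastic and adversarial regimes.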
Papers citing "Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits" (50 / 56 papers shown)
Adversarial Bandit over Bandits: Hierarchical Bandits for Online Configuration Management
C. Avin
Zvi Lotker
Shie Mannor
G. Shabat
H. Shteingart
Roey Yadgar
151
0
0
25 May 2025
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries
Arnab Maiti
Zhiyuan Fan
Kevin Jamieson
Lillian J. Ratliff
Gabriele Farina
521
1
0
01 Apr 2025
Revisiting Online Learning Approach to Inverse Linear Optimization: A Fenchel–Young Loss Perspective and Gap-Dependent Regret Analysis
Shinsaku Sakaue
Han Bao
Taira Tsuchiya
201
2
0
23 Jan 2025
Offline-to-online hyperparameter transfer for stochastic bandits
Dravyansh Sharma
Arun Sai Suggala
OffRL
103
4
0
06 Jan 2025
A Model Selection Approach for Corruption Robust Reinforcement Learning
Chen-Yu Wei
Christoph Dann
Julian Zimmert
193
45
0
31 Dec 2024
Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity
Quan Nguyen
Nishant A. Mehta
Cristóbal Guzmán
236
2
0
01 Oct 2024
Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits
Mengmeng Li
Daniel Kuhn
Bahar Taşkesen
127
0
0
30 Sep 2024
Stochastic Bandits Robust to Adversarial Attacks
Xuchuang Wang
Jinhang Zuo
Xutong Liu
John C. S. Lui
Mohammad Hajiesmaili
AAML
28
0
0
16 Aug 2024
Bellman Diffusion Models
Liam Schramm
Abdeslam Boularias
DiffM
126
2
0
16 Jul 2024
Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning
Andreas Schlaginhaufen
Maryam Kamgarpour
OffRL
54
3
0
03 Jun 2024
LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits
Masahiro Kato
Shinji Ito
189
0
0
05 Mar 2024
Adaptive Experimental Design for Policy Learning
Masahiro Kato
Kyohei Okumura
Takuya Ishihara
Toru Kitagawa
OffRL
88
0
0
08 Jan 2024
Little Exploration is All You Need
Henry H.H. Chen
Jiaming Lu
21
0
0
26 Oct 2023
CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption
Shubhada Agrawal
Timothée Mathieu
D. Basu
Odalric-Ambrym Maillard
59
3
0
28 Sep 2023
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards
Hao Qin
Kwang-Sung Jun
Chicheng Zhang
88
1
0
28 Apr 2023
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Christoph Dann
Chen-Yu Wei
Julian Zimmert
73
24
0
20 Feb 2023
Learning in quantum games
Kyriakos Lotidis
P. Mertikopoulos
Nicholas Bambos
66
7
0
05 Feb 2023
Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds
Johan Ostman
Ather Gattami
D. Gillblad
74
1
0
27 Jan 2023
Adapting to game trees in zero-sum imperfect information games
Côme Fiegel
Pierre Ménard
Tadashi Kozuno
Rémi Munos
Vianney Perchet
Michal Valko
383
10
0
23 Dec 2022
Pareto Regret Analyses in Multi-objective Multi-armed Bandit
Mengfan Xu
Diego Klabjan
67
9
0
01 Dec 2022
On Regret-optimal Cooperative Nonstochastic Multi-armed Bandits
Jialin Yi
Milan Vojnović
65
3
0
30 Nov 2022
Anytime-valid off-policy inference for contextual bandits
Ian Waudby-Smith
Lili Wu
Aaditya Ramdas
Nikos Karampatziakis
Paul Mineiro
OffRL
119
30
0
19 Oct 2022
Learning in Stackelberg Games with Non-myopic Agents
Nika Haghtalab
Thodoris Lykouris
Sloan Nietert
Alexander Wei
180
32
0
19 Aug 2022
Best-of-Both-Worlds Algorithms for Partial Monitoring
Taira Tsuchiya
Shinji Ito
Junya Honda
52
16
0
29 Jul 2022
Best of Both Worlds Model Selection
Aldo Pacchiano
Christoph Dann
Claudio Gentile
81
10
0
29 Jun 2022
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
71
20
0
29 Jun 2022
Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback
Fang-yuan Kong
Yichi Zhou
Shuai Li
60
8
0
16 Jun 2022
Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds
Shinji Ito
Taira Tsuchiya
Junya Honda
AAML
43
17
0
14 Jun 2022
Better Best of Both Worlds Bounds for Bandits with Switching Costs
Idan Amir
Guy Azov
Tomer Koren
Roi Livni
63
16
0
07 Jun 2022
A Regret-Variance Trade-Off in Online Learning
Dirk van der Hoeven
Nikita Zhivotovskiy
Nicolò Cesa-Bianchi
50
7
0
06 Jun 2022
Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs
Shinji Ito
Taira Tsuchiya
Junya Honda
78
24
0
02 Jun 2022
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs
Chloé Rouyer
Dirk van der Hoeven
Nicolò Cesa-Bianchi
Yevgeny Seldin
90
17
0
01 Jun 2022
Between Stochastic and Adversarial Online Convex Optimization: Improved Regret Bounds via Smoothness
Sarah Sachs
Hédi Hadiji
T. van Erven
Cristóbal Guzmán
141
17
0
15 Feb 2022
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences
Aadirupa Saha
Pierre Gaillard
71
7
0
14 Feb 2022
Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination
Arpan Mukherjee
A. Tajer
Pin-Yu Chen
Payel Das
AAML
FedML
59
9
0
14 Nov 2021
When Are Linear Stochastic Bandits Attackable?
Huazheng Wang
Haifeng Xu
Hongning Wang
AAML
83
11
0
18 Oct 2021
On Optimal Robustness to Adversarial Corruption in Online Decision Problems
Shinji Ito
77
22
0
22 Sep 2021
Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits
Junpei Komiyama
Edouard Fouché
Junya Honda
81
6
0
23 Jul 2021
Bayesian decision-making under misspecified priors with applications to meta-learning
Max Simchowitz
Christopher Tosh
A. Krishnamurthy
Daniel J. Hsu
Thodoris Lykouris
Miroslav Dudík
Robert Schapire
104
50
0
03 Jul 2021
Cooperative Stochastic Multi-agent Multi-armed Bandits Robust to Adversarial Corruptions
Junyan Liu
Shuai Li
Dapeng Li
58
6
0
08 Jun 2021
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
Tiancheng Jin
Longbo Huang
Haipeng Luo
84
42
0
08 Jun 2021
Improved Analysis of the Tsallis-INF Algorithm in Stochastically Constrained Adversarial Bandits and Stochastic Bandits with Adversarial Corruptions
Saeed Masoudian
Yevgeny Seldin
46
15
0
23 Mar 2021
An Algorithm for Stochastic and Adversarial Bandits with Switching Costs
Chloé Rouyer
Yevgeny Seldin
Nicolò Cesa-Bianchi
AAML
46
25
0
19 Feb 2021
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
Mengxiao Zhang
Xiaojin Zhang
96
49
0
11 Feb 2021
Mirror Descent and the Information Ratio
Tor Lattimore
András Gyorgy
72
42
0
25 Sep 2020
Convex Regularization in Monte-Carlo Tree Search
Tuan Dam
Carlo D'Eramo
Jan Peters
Joni Pajarinen
OffRL
57
11
0
01 Jul 2020
Corralling Stochastic Bandit Algorithms
R. Arora
T. V. Marinov
M. Mohri
115
35
0
16 Jun 2020
Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition
Tiancheng Jin
Haipeng Luo
102
57
0
10 Jun 2020
Bandits with adversarial scaling
Thodoris Lykouris
Vahab Mirrokni
R. Leme
80
14
0
04 Mar 2020
Contextual Search in the Presence of Adversarial Corruptions
A. Krishnamurthy
Thodoris Lykouris
Chara Podimata
Robert Schapire
106
4
0
26 Feb 2020