ResearchTrend.AI
arXiv: 2106.04117

The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition

8 June 2021
Tiancheng Jin
Longbo Huang
Haipeng Luo

Papers citing "The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition"

8 / 8 papers shown
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
D. Tiapkin
Evgenii Chzhen
Gilles Stoltz
74
0
0
08 Jul 2024
A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $\Theta(T^{2/3})$ and its Application to Best-of-Both-Worlds
Taira Tsuchiya
Shinji Ito
26
0
0
30 May 2024
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Christoph Dann
Chen-Yu Wei
Julian Zimmert
17
22
0
20 Feb 2023
Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms
Osama A. Hanna
Lin F. Yang
Christina Fragouli
19
11
0
08 Nov 2022
Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds
Shinji Ito
Taira Tsuchiya
Junya Honda
AAML
13
16
0
14 Jun 2022
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs
Chloé Rouyer
Dirk van der Hoeven
Nicolò Cesa-Bianchi
Yevgeny Seldin
21
15
0
01 Jun 2022
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits
Jiatai Huang
Yan Dai
Longbo Huang
12
14
0
28 Jan 2022
On Optimal Robustness to Adversarial Corruption in Online Decision Problems
Shinji Ito
42
22
0
22 Sep 2021