The on-line shortest path problem under partial monitoring

8 April 2007
András György
Tamás Linder
Gábor Lugosi
György Ottucsák

Papers citing "The on-line shortest path problem under partial monitoring"

49 / 49 papers shown
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries
Arnab Maiti
Zhiyuan Fan
Kevin Jamieson
Lillian J. Ratliff
Gabriele Farina
511
1
0
01 Apr 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure
Aleksandrs Slivkins
Yunzong Xu
Shiliang Zuo
537
1
0
06 Mar 2025
Adversarial Combinatorial Semi-bandits with Graph Feedback
Yuxiao Wen
136
0
0
26 Feb 2025
Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm
Ryotaro Mitsuboshi
Kohei Hatano
Eiji Takimoto
21
0
0
19 Oct 2023
Taming the Exponential Action Set: Sublinear Regret and Fast Convergence to Nash Equilibrium in Online Congestion Games
Jing Dong
Jingyu Wu
Si-Yi Wang
Baoxiang Wang
Wei Chen
76
4
0
19 Jun 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
Mihailo R. Jovanović
OffRL
76
11
0
31 May 2023
Learning and Collusion in Multi-unit Auctions
Simina Brânzei
Mahsa Derakhshan
Negin Golrezaei
Yanjun Han
50
3
0
27 May 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
83
6
0
15 May 2023
Fully Dynamic Online Selection through Online Contention Resolution Schemes
Vashist Avadhanula
A. Celli
Riccardo Colini-Baldeschi
S. Leonardi
M. Russo
132
1
0
08 Jan 2023
Online Submodular Coordination with Bounded Tracking Regret: Theory, Algorithm, and Applications to Multi-Robot Coordination
Zirui Xu
Hongyu Zhou
Vasileios Tzoumas
80
9
0
26 Sep 2022
Nested bandits
Matthieu Martin
P. Mertikopoulos
Thibaud Rahier
Houssam Zenati
26
2
0
19 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Ming Yin
Wenjing Chen
Mengdi Wang
Yu Wang
OffRL
48
4
0
10 Jun 2022
Individually Fair Learning with One-Sided Feedback
Yahav Bechavod
Aaron Roth
FaML
53
3
0
09 Jun 2022
Incentivizing Combinatorial Bandit Exploration
Xinyan Hu
Dung Daniel Ngo
Aleksandrs Slivkins
Zhiwei Steven Wu
44
12
0
01 Jun 2022
Fast online inference for nonlinear contextual bandit based on Generative Adversarial Network
Yun-Da Tsai
Shou-De Lin
75
5
0
17 Feb 2022
Adversarial Online Learning with Variable Plays in the Pursuit-Evasion Game: Theoretical Foundations and Application in Connected and Automated Vehicle Cybersecurity
Yiyang Wang
Neda Masoud
AAML
20
2
0
26 Oct 2021
Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes
Jai Moondra
Hassan Mortagy
Swati Gupta
87
4
0
22 Jun 2021
Contextual Recommendations and Low-Regret Cutting-Plane Algorithms
Sreenivas Gollapudi
Guru Guruganesh
Kostas Kollias
Pasin Manurangsi
R. Leme
Jon Schneider
44
5
0
09 Jun 2021
Bandit Linear Optimization for Sequential Decision Making and Extensive-Form Games
Gabriele Farina
Robin Schmucker
Tuomas Sandholm
165
21
0
08 Mar 2021
Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games
Gabriele Farina
Tuomas Sandholm
OffRL
75
18
0
08 Mar 2021
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
Mengxiao Zhang
184
53
0
14 Jun 2020
Contextual Blocking Bandits
Soumya Basu
Orestis Papadigenopoulos
Constantine Caramanis
Sanjay Shakkottai
83
21
0
06 Mar 2020
Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss
Shuang Qiu
Xiaohan Wei
Zhuoran Yang
Jieping Ye
Zhaoran Wang
172
50
0
02 Mar 2020
Combinatorial Semi-Bandit in the Non-Stationary Environment
Wei Chen
Liwei Wang
Haoyu Zhao
Kai Zheng
86
18
0
10 Feb 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning
Jean Tarbouriech
Evrard Garcelon
Michal Valko
Matteo Pirotta
A. Lazaric
99
46
0
07 Dec 2019
Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays
Nuri Mert Vural
Hakan Gokcesu
Kaan Gokcesu
Suleyman S. Kozat
21
19
0
25 Nov 2019
Blocking Bandits
Soumya Basu
Rajat Sen
Sujay Sanghavi
Sanjay Shakkottai
53
34
0
27 Jul 2019
Exploration by Optimisation in Partial Monitoring
Tor Lattimore
Csaba Szepesvári
72
38
0
12 Jul 2019
Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek
Dong Quan Vu
Patrick Loiseau
Alonso Silva
Long Tran-Thanh
12
5
0
27 May 2019
Introduction to Multi-Armed Bandits
Aleksandrs Slivkins
677
1,024
0
15 Apr 2019
Adversarial Bandits with Knapsacks
Nicole Immorlica
Karthik Abinav Sankararaman
Robert Schapire
Aleksandrs Slivkins
186
116
0
28 Nov 2018
Online Non-Additive Path Learning under Full and Partial Information
Corinna Cortes
Vitaly Kuznetsov
M. Mohri
Holakou Rahmanian
Manfred K. Warmuth
OffRL
16
0
0
18 Apr 2018
Minimal Exploration in Structured Stochastic Bandits
Richard Combes
Stefan Magureanu
Alexandre Proutiere
449
119
0
01 Nov 2017
Online Dynamic Programming
Holakou Rahmanian
Manfred K. Warmuth
16
14
0
02 Jun 2017
Combinatorial Semi-Bandits with Knapsacks
Karthik Abinav Sankararaman
Aleksandrs Slivkins
80
50
0
23 May 2017
Tight Bounds for Bandit Combinatorial Optimization
Alon Cohen
Tamir Hazan
Tomer Koren
166
22
0
24 Feb 2017
Data Driven SMART Intercontinental Overlay Networks
O. Brun
Lan Wang
E. Gelenbe
24
3
0
28 Dec 2015
Importance weighting without importance weights: An efficient algorithm for combinatorial semi-bandits
Gergely Neu
Gábor Bartók
117
37
0
17 Mar 2015
First-order regret bounds for combinatorial semi-bandits
Gergely Neu
218
59
0
23 Feb 2015
Contextual Semibandits via Supervised Learning Oracles
A. Krishnamurthy
Alekh Agarwal
Miroslav Dudík
OffRL
170
21
0
20 Feb 2015
Combinatorial Bandits Revisited
Richard Combes
M. Sadegh Talebi
Marc Lelarge
37
5
0
11 Feb 2015
Online learning in MDPs with side information
Yasin Abbasi-Yadkori
Gergely Neu
OffRL
88
18
0
26 Jun 2014
Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation
Ohad Shamir
127
108
0
14 Nov 2013
Stochastic Online Shortest Path Routing: The Value of Feedback
M. Sadegh Talebi
Zhenhua Zou
Richard Combes
Alexandre Proutiere
M. Johansson
57
14
0
27 Sep 2013
An efficient algorithm for learning with semi-bandit feedback
Gergely Neu
Gábor Bartók
145
80
0
13 May 2013
Deterministic MDPs with Adversarial Rewards and Bandit Feedback
R. Arora
O. Dekel
Ambuj Tewari
102
32
0
16 Oct 2012
Regret in Online Combinatorial Optimization
Jean-Yves Audibert
Sébastien Bubeck
Gábor Lugosi
OffRL
119
258
0
20 Apr 2012
Minimax Policies for Combinatorial Prediction Games
Jean-Yves Audibert
Sébastien Bubeck
Gábor Lugosi
OffRL
204
82
0
24 May 2011
Online Multi-task Learning with Hard Constraints
Gábor Lugosi
O. Papaspiliopoulos
Gilles Stoltz
VOT
66
49
0
20 Feb 2009