The on-line shortest path problem under partial monitoring

Journal of machine learning research (JMLR), 2007
8 April 2007
András György
Tamás Linder
Gábor Lugosi
György Ottucsák

Papers citing "The on-line shortest path problem under partial monitoring"

50 / 51 papers shown
Optimal Arm Elimination Algorithms for Combinatorial Bandits
Yuxiao Wen
Yanjun Han
Zhengyuan Zhou
4
0
0
28 Oct 2025
Efficient Kernelized Learning in Polyhedral Games Beyond Full-Information: From Colonel Blotto to Congestion Games
Andreas Kontogiannis
Vasilis Pollatos
Gabriele Farina
Panayotis Mertikopoulos
Ioannis Panageas
76
0
0
25 Sep 2025
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries
Annual Conference Computational Learning Theory (COLT), 2025
Arnab Maiti
Zhiyuan Fan
Kevin Jamieson
Lillian J. Ratliff
Gabriele Farina
611
4
0
01 Apr 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure
Aleksandrs Slivkins
Yunzong Xu
Shiliang Zuo
673
1
0
06 Mar 2025
Adversarial Combinatorial Semi-bandits with Graph Feedback
Yuxiao Wen
354
1
0
26 Feb 2025
Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm
Ryotaro Mitsuboshi
Kohei Hatano
Eiji Takimoto
149
0
0
19 Oct 2023
Taming the Exponential Action Set: Sublinear Regret and Fast Convergence to Nash Equilibrium in Online Congestion Games
Jing Dong
Jingyu Wu
Si-Yi Wang
Baoxiang Wang
Wei Chen
141
4
0
19 Jun 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning
Conference on Learning for Dynamics & Control (L4DC), 2023
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
Mihailo R. Jovanović
OffRL
193
12
0
31 May 2023
Learning and Collusion in Multi-unit Auctions
Neural Information Processing Systems (NeurIPS), 2023
Simina Brânzei
Mahsa Derakhshan
Negin Golrezaei
Yanjun Han
144
7
0
27 May 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Annual Conference Computational Learning Theory (COLT), 2023
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
150
8
0
15 May 2023
Fully Dynamic Online Selection through Online Contention Resolution Schemes
AAAI Conference on Artificial Intelligence (AAAI), 2023
Vashist Avadhanula
A. Celli
Riccardo Colini-Baldeschi
S. Leonardi
M. Russo
190
1
0
08 Jan 2023
Online Submodular Coordination with Bounded Tracking Regret: Theory, Algorithm, and Applications to Multi-Robot Coordination
IEEE Robotics and Automation Letters (RA-L), 2022
Zirui Xu
Hongyu Zhou
Vasileios Tzoumas
168
11
0
26 Sep 2022
Nested bandits
International Conference on Machine Learning (ICML), 2022
Matthieu Martin
P. Mertikopoulos
Thibaud Rahier
Houssam Zenati
99
3
0
19 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Ming Yin
Wenjing Chen
Mengdi Wang
Yu Wang
OffRL
121
6
0
10 Jun 2022
Individually Fair Learning with One-Sided Feedback
International Conference on Machine Learning (ICML), 2022
Yahav Bechavod
Aaron Roth
FaML
117
4
0
09 Jun 2022
Incentivizing Combinatorial Bandit Exploration
Neural Information Processing Systems (NeurIPS), 2022
Xinyan Hu
Dung Daniel Ngo
Aleksandrs Slivkins
Zhiwei Steven Wu
98
13
0
01 Jun 2022
Fast online inference for nonlinear contextual bandit based on Generative Adversarial Network
Yun-Da Tsai
Shou-De Lin
136
6
0
17 Feb 2022
Adversarial Online Learning with Variable Plays in the Pursuit-Evasion Game: Theoretical Foundations and Application in Connected and Automated Vehicle Cybersecurity
Yiyang Wang
Neda Masoud
AAML
87
4
0
26 Oct 2021
Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes
Neural Information Processing Systems (NeurIPS), 2021
Jai Moondra
Hassan Mortagy
Swati Gupta
197
4
0
22 Jun 2021
Contextual Recommendations and Low-Regret Cutting-Plane Algorithms
Neural Information Processing Systems (NeurIPS), 2021
Sreenivas Gollapudi
Guru Guruganesh
Kostas Kollias
Pasin Manurangsi
R. Leme
Jon Schneider
107
6
0
09 Jun 2021
Bandit Linear Optimization for Sequential Decision Making and Extensive-Form Games
AAAI Conference on Artificial Intelligence (AAAI), 2021
Gabriele Farina
Robin Schmucker
Tuomas Sandholm
224
22
0
08 Mar 2021
Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games
AAAI Conference on Artificial Intelligence (AAAI), 2021
Gabriele Farina
Tuomas Sandholm
OffRL
137
22
0
08 Mar 2021
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs
Neural Information Processing Systems (NeurIPS), 2020
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
Mengxiao Zhang
290
57
0
14 Jun 2020
Contextual Blocking Bandits
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Soumya Basu
Orestis Papadigenopoulos
Constantine Caramanis
Sanjay Shakkottai
146
22
0
06 Mar 2020
Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss
Neural Information Processing Systems (NeurIPS), 2020
Delin Qu
Xiaohan Wei
Zhuoran Yang
Jieping Ye
Zhaoran Wang
311
55
0
02 Mar 2020
Combinatorial Semi-Bandit in the Non-Stationary Environment
Conference on Uncertainty in Artificial Intelligence (UAI), 2020
Wei Chen
Liwei Wang
Haoyu Zhao
Kai Zheng
158
23
0
10 Feb 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning
International Conference on Machine Learning (ICML), 2019
Jean Tarbouriech
Evrard Garcelon
Michal Valko
Matteo Pirotta
A. Lazaric
212
48
0
07 Dec 2019
Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays
IEEE Transactions on Signal Processing (IEEE Trans. Signal Process.), 2019
Nuri Mert Vural
Hakan Gokcesu
Kaan Gokcesu
Suleyman S. Kozat
108
19
0
25 Nov 2019
Blocking Bandits
Neural Information Processing Systems (NeurIPS), 2019
Soumya Basu
Rajat Sen
Sujay Sanghavi
Sanjay Shakkottai
110
39
0
27 Jul 2019
Exploration by Optimisation in Partial Monitoring
Annual Conference Computational Learning Theory (COLT), 2019
Tor Lattimore
Csaba Szepesvári
152
38
0
12 Jul 2019
Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek
AAAI Conference on Artificial Intelligence (AAAI), 2019
Dong Quan Vu
Patrick Loiseau
Alonso Silva
Long Tran-Thanh
196
6
0
27 May 2019
Introduction to Multi-Armed Bandits
Aleksandrs Slivkins
1.0K
1,124
0
15 Apr 2019
Adversarial Bandits with Knapsacks
Nicole Immorlica
Karthik Abinav Sankararaman
Robert Schapire
Aleksandrs Slivkins
334
125
0
28 Nov 2018
Online Non-Additive Path Learning under Full and Partial Information
Corinna Cortes
Vitaly Kuznetsov
M. Mohri
Holakou Rahmanian
Manfred K. Warmuth
OffRL
152
1
0
18 Apr 2018
Minimal Exploration in Structured Stochastic Bandits
Richard Combes
Stefan Magureanu
Alexandre Proutiere
543
123
0
01 Nov 2017
Online Dynamic Programming
Neural Information Processing Systems (NeurIPS), 2017
Holakou Rahmanian
Manfred K. Warmuth
S.V.N. Vishwanathan
153
15
0
02 Jun 2017
Combinatorial Semi-Bandits with Knapsacks
Karthik Abinav Sankararaman
Aleksandrs Slivkins
173
54
0
23 May 2017
Tight Bounds for Bandit Combinatorial Optimization
Annual Conference Computational Learning Theory (COLT), 2017
Alon Cohen
Tamir Hazan
Tomer Koren
242
24
0
24 Feb 2017
Data Driven SMART Intercontinental Overlay Networks
O. Brun
Lan Wang
E. Gelenbe
78
3
0
28 Dec 2015
Importance weighting without importance weights: An efficient algorithm for combinatorial semi-bandits
Gergely Neu
Gábor Bartók
169
41
0
17 Mar 2015
First-order regret bounds for combinatorial semi-bandits
Gergely Neu
283
62
0
23 Feb 2015
Contextual Semibandits via Supervised Learning Oracles
A. Krishnamurthy
Alekh Agarwal
Miroslav Dudík
OffRL
347
21
0
20 Feb 2015
Combinatorial Bandits Revisited
Richard Combes
M. Sadegh
Marc Lelarge
161
5
0
11 Feb 2015
Online learning in MDPs with side information
Yasin Abbasi-Yadkori
Gergely Neu
OffRL
155
18
0
26 Jun 2014
Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation
Neural Information Processing Systems (NeurIPS), 2013
Ohad Shamir
351
109
0
14 Nov 2013
Stochastic Online Shortest Path Routing: The Value of Feedback
American Control Conference (ACC), 2013
M. Sadegh Talebi
Zhenhua Zou
Richard Combes
Alexandre Proutiere
M. Johansson
243
14
0
27 Sep 2013
An efficient algorithm for learning with semi-bandit feedback
International Conference on Algorithmic Learning Theory (ALT), 2013
Gergely Neu
Gábor Bartók
184
83
0
13 May 2013
Deterministic MDPs with Adversarial Rewards and Bandit Feedback
Conference on Uncertainty in Artificial Intelligence (UAI), 2012
R. Arora
O. Dekel
Ambuj Tewari
179
32
0
16 Oct 2012
Regret in Online Combinatorial Optimization
Mathematics of Operations Research (MOR), 2012
Jean-Yves Audibert
Sébastien Bubeck
Gábor Lugosi
OffRL
193
268
0
20 Apr 2012
Minimax Policies for Combinatorial Prediction Games
Annual Conference Computational Learning Theory (COLT), 2011
Jean-Yves Audibert
Sébastien Bubeck
Gabor Lugosi
OffRL
285
84
0
24 May 2011