The on-line shortest path problem under partial monitoring

Journal of Machine Learning Research (JMLR), 2007

8 April 2007

Papers citing "The on-line shortest path problem under partial monitoring"

Optimal Arm Elimination Algorithms for Combinatorial Bandits Yuxiao Wen Yanjun Han Zhengyuan Zhou 4 0 0 28 Oct 2025
Efficient Kernelized Learning in Polyhedral Games Beyond Full-Information: From Colonel Blotto to Congestion Games Andreas Kontogiannis Vasilis Pollatos Gabriele Farina Panayotis Mertikopoulos Ioannis Panageas 76 0 0 25 Sep 2025
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries. Annual Conference Computational Learning Theory (COLT), 2025 Arnab Maiti Zhiyuan Fan Kevin Jamieson Lillian J. Ratliff Gabriele Farina 611 4 0 01 Apr 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure Aleksandrs Slivkins Yunzong Xu Shiliang Zuo 673 1 0 06 Mar 2025
Adversarial Combinatorial Semi-bandits with Graph Feedback Yuxiao Wen 354 1 0 26 Feb 2025
Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm Ryotaro Mitsuboshi Kohei Hatano Eiji Takimoto 149 0 0 19 Oct 2023
Taming the Exponential Action Set: Sublinear Regret and Fast Convergence to Nash Equilibrium in Online Congestion Games Jing Dong Jingyu Wu Si-Yi Wang Baoxiang Wang Wei Chen 141 4 0 19 Jun 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning. Conference on Learning for Dynamics & Control (L4DC), 2023 Dongsheng Ding Xiaohan Wei Zhuoran Yang Zhaoran Wang Mihailo R. Jovanović OffRL 193 12 0 31 May 2023
Learning and Collusion in Multi-unit Auctions. Neural Information Processing Systems (NeurIPS), 2023 Simina Brânzei Mahsa Derakhshan Negin Golrezaei Yanjun Han 144 7 0 27 May 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs. Annual Conference Computational Learning Theory (COLT), 2023 Dirk van der Hoeven Lukas Zierahn Tal Lancewicki Aviv A. Rosenberg Nicolò Cesa-Bianchi 150 8 0 15 May 2023
Fully Dynamic Online Selection through Online Contention Resolution Schemes. AAAI Conference on Artificial Intelligence (AAAI), 2023 Vashist Avadhanula A. Celli Riccardo Colini-Baldeschi S. Leonardi M. Russo 190 1 0 08 Jan 2023
Online Submodular Coordination with Bounded Tracking Regret: Theory, Algorithm, and Applications to Multi-Robot Coordination. IEEE Robotics and Automation Letters (RA-L), 2022 Zirui Xu Hongyu Zhou Vasileios Tzoumas 168 11 0 26 Sep 2022
Nested bandits. International Conference on Machine Learning (ICML), 2022 Matthieu Martin P. Mertikopoulos Thibaud Rahier Houssam Zenati 99 3 0 19 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality. Conference on Uncertainty in Artificial Intelligence (UAI), 2022 Ming Yin Wenjing Chen Mengdi Wang Yu Wang OffRL 121 6 0 10 Jun 2022
Individually Fair Learning with One-Sided Feedback. International Conference on Machine Learning (ICML), 2022 Yahav Bechavod Aaron Roth FaML 117 4 0 09 Jun 2022
Incentivizing Combinatorial Bandit Exploration. Neural Information Processing Systems (NeurIPS), 2022 Xinyan Hu Dung Daniel Ngo Aleksandrs Slivkins Zhiwei Steven Wu 98 13 0 01 Jun 2022
Fast online inference for nonlinear contextual bandit based on Generative Adversarial Network Yun-Da Tsai Shou-De Lin 136 6 0 17 Feb 2022
Adversarial Online Learning with Variable Plays in the Pursuit-Evasion Game: Theoretical Foundations and Application in Connected and Automated Vehicle Cybersecurity Yiyang Wang Neda Masoud AAML 87 4 0 26 Oct 2021
Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes. Neural Information Processing Systems (NeurIPS), 2021 Jai Moondra Hassan Mortagy Swati Gupta 197 4 0 22 Jun 2021
Contextual Recommendations and Low-Regret Cutting-Plane Algorithms. Neural Information Processing Systems (NeurIPS), 2021 Sreenivas Gollapudi Guru Guruganesh Kostas Kollias Pasin Manurangsi R. Leme Jon Schneider 107 6 0 09 Jun 2021
Bandit Linear Optimization for Sequential Decision Making and Extensive-Form Games. AAAI Conference on Artificial Intelligence (AAAI), 2021 Gabriele Farina Robin Schmucker Tuomas Sandholm 224 22 0 08 Mar 2021
Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games. AAAI Conference on Artificial Intelligence (AAAI), 2021 Gabriele Farina Tuomas Sandholm OffRL 137 22 0 08 Mar 2021
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs. Neural Information Processing Systems (NeurIPS), 2020 Chung-Wei Lee Haipeng Luo Chen-Yu Wei Mengxiao Zhang 290 57 0 14 Jun 2020
Contextual Blocking Bandits. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 Soumya Basu Orestis Papadigenopoulos Constantine Caramanis Sanjay Shakkottai 146 22 0 06 Mar 2020
Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss. Neural Information Processing Systems (NeurIPS), 2020 Delin Qu Xiaohan Wei Zhuoran Yang Jieping Ye Zhaoran Wang 311 55 0 02 Mar 2020
Combinatorial Semi-Bandit in the Non-Stationary Environment. Conference on Uncertainty in Artificial Intelligence (UAI), 2020 Wei Chen Liwei Wang Haoyu Zhao Kai Zheng 158 23 0 10 Feb 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning. International Conference on Machine Learning (ICML), 2019 Jean Tarbouriech Evrard Garcelon Michal Valko Matteo Pirotta A. Lazaric 212 48 0 07 Dec 2019
Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays. IEEE Transactions on Signal Processing (IEEE Trans. Signal Process.), 2019 Nuri Mert Vural Hakan Gokcesu Kaan Gokcesu Suleyman S. Kozat 108 19 0 25 Nov 2019
Blocking Bandits. Neural Information Processing Systems (NeurIPS), 2019 Soumya Basu Rajat Sen Sujay Sanghavi Sanjay Shakkottai 110 39 0 27 Jul 2019
Exploration by Optimisation in Partial Monitoring. Annual Conference Computational Learning Theory (COLT), 2019 Tor Lattimore Csaba Szepesvári 152 38 0 12 Jul 2019
Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek. AAAI Conference on Artificial Intelligence (AAAI), 2019 Dong Quan Vu Patrick Loiseau Alonso Silva Long Tran-Thanh 196 6 0 27 May 2019
Introduction to Multi-Armed Bandits Aleksandrs Slivkins 1.0K 1,124 0 15 Apr 2019
Adversarial Bandits with Knapsacks Nicole Immorlica Karthik Abinav Sankararaman Robert Schapire Aleksandrs Slivkins 334 125 0 28 Nov 2018
Online Non-Additive Path Learning under Full and Partial Information Corinna Cortes Vitaly Kuznetsov M. Mohri Holakou Rahmanian Manfred K. Warmuth OffRL 152 1 0 18 Apr 2018
Minimal Exploration in Structured Stochastic Bandits Richard Combes Stefan Magureanu Alexandre Proutiere 543 123 0 01 Nov 2017
Online Dynamic Programming. Neural Information Processing Systems (NeurIPS), 2017 Holakou Rahmanian Manfred K. Warmuth S.V.N. Vishwanathan 153 15 0 02 Jun 2017
Combinatorial Semi-Bandits with Knapsacks Karthik Abinav Sankararaman Aleksandrs Slivkins 173 54 0 23 May 2017
Tight Bounds for Bandit Combinatorial Optimization. Annual Conference Computational Learning Theory (COLT), 2017 Alon Cohen Tamir Hazan Tomer Koren 242 24 0 24 Feb 2017
Data Driven SMART Intercontinental Overlay Networks O. Brun Lan Wang E. Gelenbe 78 3 0 28 Dec 2015
Importance weighting without importance weights: An efficient algorithm for combinatorial semi-bandits Gergely Neu Gábor Bartók 169 41 0 17 Mar 2015
First-order regret bounds for combinatorial semi-bandits Gergely Neu 283 62 0 23 Feb 2015
Contextual Semibandits via Supervised Learning Oracles A. Krishnamurthy Alekh Agarwal Miroslav Dudík OffRL 347 21 0 20 Feb 2015
Combinatorial Bandits Revisited Richard Combes M. Sadegh Marc Lelarge 161 5 0 11 Feb 2015
Online learning in MDPs with side information Yasin Abbasi-Yadkori Gergely Neu OffRL 155 18 0 26 Jun 2014
Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation. Neural Information Processing Systems (NeurIPS), 2013 Ohad Shamir 351 109 0 14 Nov 2013
Stochastic Online Shortest Path Routing: The Value of Feedback. American Control Conference (ACC), 2013 M. Sadegh Talebi Zhenhua Zou Richard Combes Alexandre Proutiere M. Johansson 243 14 0 27 Sep 2013
An efficient algorithm for learning with semi-bandit feedback. International Conference on Algorithmic Learning Theory (ALT), 2013 Gergely Neu Gábor Bartók 184 83 0 13 May 2013
Deterministic MDPs with Adversarial Rewards and Bandit Feedback. Conference on Uncertainty in Artificial Intelligence (UAI), 2012 R. Arora O. Dekel Ambuj Tewari 179 32 0 16 Oct 2012
Regret in Online Combinatorial Optimization. Mathematics of Operations Research (MOR), 2012 Jean-Yves Audibert Sébastien Bubeck Gábor Lugosi OffRL 193 268 0 20 Apr 2012
Minimax Policies for Combinatorial Prediction Games. Annual Conference Computational Learning Theory (COLT), 2011 Jean-Yves Audibert Sébastien Bubeck Gabor Lugosi OffRL 285 84 0 24 May 2011