ResearchTrend.AI
arXiv:2306.04498

Optimal Fair Multi-Agent Bandits

7 June 2023
Amir Leshem
    FedML, FaML
Abstract

In this paper, we study the problem of fair multi-agent multi-armed bandit learning when agents do not communicate with each other, except for collision information provided to agents that access the same arm simultaneously. We provide an algorithm with regret $O(N^3 \log N \log T)$ (assuming bounded rewards, with an unknown bound). This significantly improves on previous results, which had regret of order $O(\log T \log\log T)$ and exponential dependence on the number of agents. The result is attained by using a distributed auction algorithm to learn the sample-optimal matching, a new type of exploitation phase whose length is derived from the observed samples, and a novel order-statistics-based regret analysis. Simulation results present the dependence of the regret on $\log T$.
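
To give a feel for what "using an auction algorithm to learn a matching" involves, the following is a minimal sketch of a classical centralized Bertsekas-style auction on a matrix of estimated rewards. It is not the paper's algorithm: the paper's version is distributed, relies only on collision information, and targets a fair matching, none of which is modelled here. The function name `auction_matching`, the parameter `eps`, and the numbers in `values` are all hypothetical and chosen only for illustration.

```python
def auction_matching(values, eps=1e-3):
    """Assign each agent a distinct arm with a Bertsekas-style auction.

    values[i][j] is agent i's estimated reward for arm j (square matrix).
    The total value of the result is within n * eps of the best assignment.
    This is a centralized illustration, not the paper's distributed method.
    """
    n = len(values)
    prices = [0.0] * n        # current price of each arm
    owner = [None] * n        # owner[arm] = agent currently holding that arm
    assigned = [None] * n     # assigned[agent] = arm held by that agent
    unassigned = list(range(n))

    while unassigned:
        agent = unassigned.pop()
        # Net value of every arm for this agent at current prices.
        net = [values[agent][arm] - prices[arm] for arm in range(n)]
        best = max(range(n), key=net.__getitem__)
        second = max((net[a] for a in range(n) if a != best), default=net[best])
        # Raise the price by the bidding increment (always at least eps).
        prices[best] += net[best] - second + eps
        # The previous holder, if any, is outbid and must bid again.
        previous = owner[best]
        if previous is not None:
            assigned[previous] = None
            unassigned.append(previous)
        owner[best] = agent
        assigned[agent] = best
    return assigned


# Made-up estimated rewards for 3 agents and 3 arms.
values = [
    [10, 5, 1],
    [9, 8, 2],
    [3, 7, 6],
]
matching = auction_matching(values, eps=0.01)
print(matching)
```

Because every bid raises a price by at least `eps`, the loop terminates, and a smaller `eps` trades more bidding rounds for a matching closer to the optimum.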
