ResearchTrend.AI


arXiv:1909.11628

$\alpha^\alpha$-Rank: Practically Scaling $\alpha$-Rank through Stochastic Optimisation

25 September 2019
Yaodong Yang
Rasul Tutunov
Phu Sakulwongtana
Haitham Bou-Ammar
Abstract

Recently, $\alpha$-Rank, a graph-based algorithm, has been proposed as a solution to ranking joint policy profiles in large-scale multi-agent systems. $\alpha$-Rank claimed tractability through a polynomial-time implementation with respect to the total number of pure strategy profiles. Here, we note that the inputs to the algorithm were not clearly specified in the original presentation; as such, we deem the complexity claims not grounded, and conjecture that solving $\alpha$-Rank is NP-hard. The authors of $\alpha$-Rank suggested that the input to $\alpha$-Rank can be an exponentially sized payoff matrix, a claim promised to be clarified in subsequent manuscripts. Even though $\alpha$-Rank exhibits a polynomial-time solution with respect to such an input, we point out additional critical problems. We demonstrate that, due to the need to construct an exponentially large Markov chain, $\alpha$-Rank is infeasible beyond a small finite number of agents. We ground these claims by adopting the amount of dollars spent as a non-refutable evaluation metric. Realising this scalability issue, we present a stochastic implementation of $\alpha$-Rank with a double oracle mechanism allowing for reductions in joint strategy spaces. Our method, $\alpha^\alpha$-Rank, does not need to store an exponentially large transition matrix, and can terminate early once a required precision is reached. Although theoretically our method exhibits similar worst-case complexity guarantees compared to $\alpha$-Rank, it allows us, for the first time, to practically conduct large-scale multi-agent evaluations. On $10^4 \times 10^4$ random matrices, we achieve a $1000\times$ speed-up. Furthermore, we also show successful results on large joint strategy profiles with a maximum size in the order of $\mathcal{O}(2^{25})$ ($\approx 33$ million joint strategies), a setting not evaluable using $\alpha$-Rank with a reasonable computational budget.
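The two practical properties named in the abstract, avoiding storage of the full transition matrix and stopping early at a required precision, can be illustrated with a minimal sketch. It assumes the ranking is read off the stationary distribution of the Markov chain. It uses plain power iteration on a toy birth-death chain whose rows are generated on demand. This is not the stochastic optimisation or double oracle procedure of the paper, and the names `birth_death_row` and `stationary_matrix_free`, the toy chain, and the tolerance are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Tuple

Row = List[Tuple[int, float]]  # sparse row: (next_state, probability) pairs


def birth_death_row(i: int, n: int, up: float = 0.2, down: float = 0.4) -> Row:
    """Toy chain (hypothetical, not from the paper): row i is computed on demand."""
    moves = ((min(i + 1, n - 1), up), (max(i - 1, 0), down), (i, 1.0 - up - down))
    row = {}
    for j, p in moves:
        row[j] = row.get(j, 0.0) + p  # merge moves that a boundary turns into "stay"
    return list(row.items())


def stationary_matrix_free(
    n: int,
    transition_row: Callable[[int], Row],
    tol: float = 1e-12,
    max_iters: int = 100_000,
) -> Tuple[List[float], int]:
    """Power iteration pi <- pi P that never materialises the n x n matrix P."""
    pi = [1.0 / n] * n  # start from the uniform distribution
    for it in range(1, max_iters + 1):
        nxt = [0.0] * n
        for i in range(n):
            for j, p in transition_row(i):  # only the non-zero entries of row i
                nxt[j] += pi[i] * p
        change = sum(abs(a - b) for a, b in zip(nxt, pi))
        pi = nxt
        if change < tol:  # early termination at the requested precision
            return pi, it
    return pi, max_iters


if __name__ == "__main__":
    n = 6
    pi, iters = stationary_matrix_free(n, lambda i: birth_death_row(i, n))
    print(iters, [round(x, 4) for x in pi])
```

Note that the sketch still keeps one probability per state, so memory grows linearly with the number of joint strategy profiles; only the quadratic cost of the dense transition matrix is avoided.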
