Unraveling the Rainbow: can value-based methods schedule?

6 May 2025
Arthur Corrêa
Alexandre Jesus
Cristóvão Silva
Samuel Moniz
    OffRL
Abstract

Recently, deep reinforcement learning has emerged as a promising approach for solving complex combinatorial optimization problems. Broadly, deep reinforcement learning methods fall into two categories: policy-based and value-based. While value-based approaches have achieved notable success in domains such as the Arcade Learning Environment, the combinatorial optimization community has predominantly favored policy-based methods, often overlooking the potential of value-based algorithms. In this work, we conduct a comprehensive empirical evaluation of value-based algorithms, including the deep Q-network and several of its advanced extensions, within the context of two complex combinatorial problems: the job-shop and the flexible job-shop scheduling problems, two fundamental challenges with multiple industrial applications. Our results challenge the assumption that policy-based methods are inherently superior for combinatorial optimization. We show that several value-based approaches can match or even outperform the widely adopted proximal policy optimization algorithm, suggesting that value-based strategies deserve greater attention from the combinatorial optimization community. Our code is openly available at: this https URL.
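The value-based methods discussed above choose scheduling actions by scoring feasible operations with a learned Q-function and picking the highest-valued one. A minimal sketch of that action-selection step, assuming a toy state encoding and a stand-in heuristic in place of a trained Q-network (the paper's actual architecture and features are not shown here):

```python
import random

def q_values(state, actions):
    # Stand-in for a trained Q-network: here a toy heuristic that
    # prefers shorter processing times (shortest-processing-time rule).
    # A real DQN would map a state/action encoding to learned values.
    return [-state["proc_time"][a] for a in actions]

def select_action(state, actions, epsilon=0.1, rng=None):
    """Epsilon-greedy choice over the feasible operations of a job-shop state."""
    rng = rng or random.Random(0)
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    qs = q_values(state, actions)
    # exploit: operation with the highest estimated value
    return actions[max(range(len(actions)), key=qs.__getitem__)]

# Hypothetical state: three schedulable operations with processing times.
state = {"proc_time": {0: 5, 1: 2, 2: 7}}
print(select_action(state, [0, 1, 2], epsilon=0.0))  # greedy pick: operation 1
```

In a full DQN-style dispatcher, `q_values` would be a neural network trained from transition replay, and the epsilon schedule would decay over training; policy-based methods such as PPO instead sample actions from a learned distribution rather than maximizing a value estimate.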

@article{corrêa2025_2505.03323,
  title={Unraveling the Rainbow: can value-based methods schedule?},
  author={Arthur Corrêa and Alexandre Jesus and Cristóvão Silva and Samuel Moniz},
  journal={arXiv preprint arXiv:2505.03323},
  year={2025}
}