
Scalable Reinforcement Learning for Virtual Machine Scheduling

Abstract

Recent advances in reinforcement learning (RL) have shown promise for optimizing virtual machine scheduling (VMS) in small-scale clusters, yet the application of RL to large-scale cloud computing scenarios remains notably constrained. This paper introduces a scalable RL framework, Cluster Value Decomposition Reinforcement Learning (CVD-RL), to surmount the scalability hurdles inherent in large-scale VMS. The CVD-RL framework combines a decomposition operator with a look-ahead operator to manage representation complexity, complemented by a Top-k filter operator that refines exploration efficiency. Unlike existing approaches limited to clusters of 10 or fewer physical machines (PMs), CVD-RL extends its applicability to environments encompassing up to 50 PMs. Furthermore, empirical studies show that CVD-RL generalizes better than contemporary state-of-the-art (SOTA) methods across a variety of scenarios. This not only demonstrates the framework's scalability and performance but also represents a significant step in applying RL to VMS within complex, large-scale cloud infrastructures. The code is available at this https URL.
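To make the two named mechanisms concrete, the sketch below illustrates (under our own assumptions, not the paper's actual implementation) how a decomposed per-PM value function and a Top-k filter might interact in a VM-placement step: each PM receives one scalar score, and the policy samples only among the k best-scoring placements. The functions `per_pm_values` and `top_k_policy`, the tight-packing heuristic standing in for a learned value head, and all shapes and constants are hypothetical.

```python
import numpy as np

def per_pm_values(free_capacity: np.ndarray, vm_demand: np.ndarray) -> np.ndarray:
    """Score every PM for hosting the incoming VM, one scalar per PM.

    free_capacity: (num_pms, num_resources) remaining capacity of each PM.
    vm_demand:     (num_resources,) resource request of the VM.
    Stand-in for a learned per-PM value head: prefer feasible PMs whose
    leftover capacity after placement is smallest (tight packing);
    infeasible PMs are masked with -inf.
    """
    residual = free_capacity - vm_demand
    feasible = (residual >= 0).all(axis=1)
    packing_score = -residual.clip(min=0).sum(axis=1)
    return np.where(feasible, packing_score, -np.inf)

def top_k_policy(scores: np.ndarray, k: int, rng: np.random.Generator) -> int:
    """Sample a PM from a softmax restricted to the k best-scoring actions."""
    k = min(k, scores.size)
    top_idx = np.argpartition(scores, -k)[-k:]   # indices of the k best scores
    masked = np.full_like(scores, -np.inf)
    masked[top_idx] = scores[top_idx]
    # Softmax over surviving actions; assumes at least one PM is feasible.
    shifted = masked - masked[np.isfinite(masked)].max()
    probs = np.exp(shifted)
    probs /= probs.sum()
    return int(rng.choice(scores.size, p=probs))

rng = np.random.default_rng(0)
capacity = rng.uniform(2.0, 10.0, size=(50, 2))  # 50 PMs, CPU and memory headroom
demand = np.array([1.0, 2.0])                    # incoming VM request
scores = per_pm_values(capacity, demand)
pm = top_k_policy(scores, k=5, rng=rng)          # place the VM on the sampled PM
```

Restricting sampling to the top-k candidates, as in this toy policy, is one way such a filter can prune the action space from 50 PMs to a handful, which is the kind of exploration-efficiency gain the abstract attributes to the Top-k filter operator.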

Main: 19 pages · 13 figures · 4 tables · Bibliography: 3 pages · Appendix: 1 page