
Learning-based Scheduling for Information Accuracy and Freshness in Wireless Networks

International Conference on Signal Processing and Communications (ICSPC), 2023
Main:8 Pages
6 Figures
Bibliography:2 Pages
2 Tables
Appendix:11 Pages
Abstract

We consider a system of multiple sources, a single communication channel, and a single monitoring station. Each source measures a time-varying quantity with its own level of accuracy, and in each time slot one source is scheduled to send its update to the monitoring station over the channel. The probability that an attempted transmission succeeds depends on which source is scheduled. Neither the probability of correct measurement nor the probability of successful transmission of any source is known to the scheduler. The metric of interest is the reward received by the system, which depends on the accuracy of the last update received by the destination and on the Age-of-Information (AoI) of the system. We model the scheduling problem as a variant of the multi-armed bandit problem, with the sources as the arms. Via simulations, we compare the performance of four standard bandit policies, namely ETC, ϵ-greedy, UCB, and TS, suitably adjusted to our system model. In addition, we provide analytical guarantees for two of these policies, ETC and ϵ-greedy. Finally, we characterize a lower bound on the cumulative regret achievable by any policy.
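To make the bandit formulation concrete, the following is a minimal sketch of an ϵ-greedy scheduler for this kind of system. It is not the paper's algorithm: the abstract does not state the reward function or what feedback the scheduler observes, so the per-slot reward used here (correctness of the last delivered update divided by the AoI), the assumption that the scheduler learns whether a delivered update was correct, and all names such as `simulate_epsilon_greedy`, `p_correct`, and `p_success` are hypothetical choices made for illustration.

```python
import random


def simulate_epsilon_greedy(p_correct, p_success, horizon, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy source scheduling over an unreliable channel.

    ASSUMPTIONS (not taken from the paper):
      * per-slot reward = (1 if the last delivered update was correct else 0) / AoI
      * the scheduler observes whether each update was delivered and correct
      * each arm is scored by the empirical rate of "delivered and correct"
    """
    rng = random.Random(seed)
    n = len(p_correct)
    pulls = [0] * n          # number of times each source was scheduled
    est = [0.0] * n          # empirical P(delivered and correct) per source
    aoi = 1                  # age of the freshest update at the monitor
    last_correct = 0         # correctness of the last delivered update
    total_reward = 0.0

    for t in range(horizon):
        if t < n:
            arm = t                              # schedule every source once
        elif rng.random() < epsilon:
            arm = rng.randrange(n)               # explore uniformly at random
        else:
            arm = max(range(n), key=lambda i: est[i])  # exploit best estimate

        # The true probabilities are used only to generate outcomes;
        # the scheduler never reads them directly.
        delivered = rng.random() < p_success[arm]
        correct = delivered and rng.random() < p_correct[arm]

        # Incremental update of the sample mean for the chosen source.
        pulls[arm] += 1
        est[arm] += (float(correct) - est[arm]) / pulls[arm]

        if delivered:
            aoi = 1                              # a fresh update resets the age
            last_correct = int(correct)
        else:
            aoi += 1                             # a failed attempt ages the information

        total_reward += last_correct / aoi

    return pulls, est, total_reward


if __name__ == "__main__":
    pulls, est, total = simulate_epsilon_greedy(
        p_correct=[0.9, 0.5, 0.3], p_success=[0.9, 0.5, 0.3], horizon=5000
    )
    print("pulls per source:", pulls)
    print("estimated success rates:", [round(e, 3) for e in est])
    print("cumulative reward:", round(total, 1))
```

Scoring each source by a single "delivered and correct" rate keeps the sketch short; a policy closer to the setting described above might estimate the two unknown probabilities separately, and ETC, UCB, or TS could be substituted by changing only the arm-selection step.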
