The development of mobility-on-demand services, rich transportation data sources, and autonomous vehicles (AVs) creates significant opportunities for shared-use AV mobility services (SAMSs) to provide accessible and demand-responsive personal mobility. SAMS fleet operation involves multiple interrelated decisions, with a primary focus on efficiently fulfilling passenger ride requests with a high level of service quality. This paper focuses on improving the efficiency and service quality of a SAMS vehicle fleet via anticipatory repositioning of idle vehicles. The rebalancing problem is formulated as a Markov decision process, which we propose to solve using an advantage actor-critic (A2C) reinforcement learning method. The proposed approach learns a rebalancing policy that anticipates future demand and cooperates with an optimization-based assignment strategy. The approach allows for centralized repositioning decisions and can handle large vehicle fleets, since the problem size does not change with the fleet size. Using New York City taxi data and an agent-based simulation tool, two versions of the A2C AV repositioning approach are tested. The first version, A2C-AVR(A), learns to anticipate future demand based on past observations, while the second, A2C-AVR(B), uses demand forecasts. Compared to an optimization-based rebalancing approach, the models achieve a significant reduction in mean passenger waiting times at the cost of a slightly increased percentage of empty fleet miles travelled. The experiments demonstrate the models' ability to anticipate future demand and their transferability to cases unseen during training.
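The abstract does not include code, but the following minimal sketch illustrates the kind of one-step A2C update that underlies such a repositioning policy. It assumes a zone-based state vector (per-zone idle supply, open requests, and a demand signal) and a per-idle-vehicle action that selects a target zone, which is one common way to keep the problem size independent of fleet size. All names (N_ZONES, ActorCritic, a2c_update), dimensions, and the PyTorch implementation are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

N_ZONES = 64              # hypothetical number of repositioning zones
STATE_DIM = 3 * N_ZONES   # e.g. per-zone idle supply, open requests, demand signal


class ActorCritic(nn.Module):
    """Shared-body network: the actor head scores target zones,
    the critic head estimates the state value used for the advantage."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.actor = nn.Linear(hidden, n_actions)  # logits over target zones
        self.critic = nn.Linear(hidden, 1)         # state value V(s)

    def forward(self, state):
        h = self.body(state)
        return self.actor(h), self.critic(h)


def a2c_update(model, optimizer, state, action, reward, next_state,
               gamma: float = 0.99, entropy_coef: float = 0.01):
    """One-step A2C update with advantage = r + gamma * V(s') - V(s)."""
    logits, value = model(state)
    with torch.no_grad():
        _, next_value = model(next_state)      # bootstrap target, no gradient
    advantage = reward + gamma * next_value - value
    dist = Categorical(logits=logits)
    actor_loss = -dist.log_prob(action) * advantage.detach()
    critic_loss = advantage.pow(2)
    loss = (actor_loss + 0.5 * critic_loss - entropy_coef * dist.entropy()).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


if __name__ == "__main__":
    model = ActorCritic(STATE_DIM, N_ZONES)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    s = torch.randn(STATE_DIM)                   # placeholder state observation
    logits, _ = model(s)
    a = Categorical(logits=logits).sample()      # target zone for one idle vehicle
    s_next = torch.randn(STATE_DIM)
    r = torch.tensor(-1.0)                       # e.g. negative expected waiting time
    a2c_update(model, opt, s, a, r, s_next)
```

In this per-vehicle formulation the same policy network is applied to each idle vehicle in turn, so the network's input and output sizes stay fixed as the fleet grows, consistent with the abstract's claim that the problem size does not change with the fleet size.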