Value Iteration for Learning Concurrently Executable Robotic Control Tasks

Many modern robotic systems, such as multi-robot systems and manipulators, exhibit redundancy, a property that allows them to execute multiple tasks. This work proposes a novel method, based on the Reinforcement Learning (RL) paradigm, for training redundant robots to execute multiple tasks concurrently. Our approach differs from typical multi-objective RL methods in that the learned tasks can be combined and executed in possibly time-varying prioritized stacks. We do so by first defining a notion of task independence between learned value functions. We then use this definition to propose a cost functional that encourages a policy, based on an approximated value function, to accomplish its control objective while minimally interfering with the execution of higher-priority tasks. This allows us to train a set of control policies that can be executed simultaneously. We also introduce a version of fitted value iteration that efficiently learns to approximate our proposed cost functional. We demonstrate our approach on several scenarios and robotic systems.
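
For readers unfamiliar with fitted value iteration, the following is a minimal generic sketch of the textbook technique on a toy one-dimensional problem. It is not the authors' algorithm: it does not include the task-independence term, the proposed cost functional, or any task prioritization. The dynamics, stage cost, discount factor, action set, and all function names are assumptions chosen for illustration. The value function is approximated by a piecewise-linear interpolator over a fixed grid, so the "fit" step reduces to storing the Bellman targets at the grid nodes.

```python
import numpy as np

GAMMA = 0.9                            # discount factor (assumed)
ACTIONS = np.array([-0.1, 0.0, 0.1])   # hypothetical discrete action set


def step(x, a):
    """Toy dynamics (assumed): move along a line, clipped to [-1, 1]."""
    return np.clip(x + a, -1.0, 1.0)


def stage_cost(x):
    """Toy control objective (assumed): drive the state to the origin."""
    return x ** 2


def backup(x, grid, values):
    """One-step lookahead cost of each action at states x.

    Returns an array of shape (len(x), len(ACTIONS)).
    """
    next_x = step(x[:, None], ACTIONS[None, :])
    # Evaluate the interpolated value function at the successor states.
    next_v = np.interp(next_x.ravel(), grid, values).reshape(next_x.shape)
    return stage_cost(x)[:, None] + GAMMA * next_v


def fitted_value_iteration(n_nodes=21, n_iters=200):
    """Alternate Bellman backups at sampled states with a refit of the approximator."""
    grid = np.linspace(-1.0, 1.0, n_nodes)   # sampled states = interpolation nodes
    values = np.zeros(n_nodes)               # initial value estimate
    for _ in range(n_iters):
        targets = backup(grid, grid, values).min(axis=1)  # Bellman targets
        values = targets                     # "fit": interpolator stores the targets
    return grid, values


def greedy_action(x, grid, values):
    """Action minimizing the one-step lookahead cost under the learned values."""
    q = backup(np.array([x]), grid, values)[0]
    return ACTIONS[np.argmin(q)]
```

Calling `grid, values = fitted_value_iteration()` and then `greedy_action(0.5, grid, values)` yields the action that moves the state toward the origin. Swapping the interpolator for a regressor such as a neural network changes only the line marked "fit", although convergence is then no longer guaranteed in general.
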
@article{tahmid2025_2504.01174,
  title={Value Iteration for Learning Concurrently Executable Robotic Control Tasks},
  author={Sheikh A. Tahmid and Gennaro Notomista},
  journal={arXiv preprint arXiv:2504.01174},
  year={2025}
}