MDPs with a State Sensing Cost
Main: 8 pages; Bibliography: 2 pages; Appendix: 11 pages; 5 figures; 3 tables
Abstract
In many practical sequential decision-making problems, tracking the state of the environment incurs a sensing/communication/computation cost. In these settings, the agent must additionally decide when to sense the state, balancing the value of optimal (state-specific) actions against the cost of sensing. We formulate this as an expected discounted cost Markov Decision Process (MDP), wherein the agent incurs an additional cost for sensing its next state, but has the option to take actions while remaining 'blind' to the system state.
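To make the trade-off concrete, the sense-or-act-blind decision can be sketched as a belief-space dynamic program: when the agent does not sense, its action cannot depend on the current state, so it plans against a belief over states. The snippet below is a minimal illustrative sketch, not the paper's formulation: all numbers (transition matrices `P`, costs `C`, discount `gamma`, sensing cost `c_sense`) are hypothetical, the problem is restricted to two states and two actions, and the timing (pay to observe the current state before acting) is an assumption that may differ from the paper's "sense the next state" convention.

```python
import numpy as np

# Hypothetical 2-state, 2-action discounted-cost MDP (all numbers illustrative).
# P[a, s, s'] = probability of moving from state s to s' under action a.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # transitions under action 0
              [[0.5, 0.5], [0.6, 0.4]]])  # transitions under action 1
# C[a, s] = immediate cost of taking action a in state s.
C = np.array([[0.0, 2.0],
              [1.0, 0.5]])
gamma, c_sense = 0.95, 0.3  # discount factor and per-step sensing cost

# Discretize the belief b = P(state = 0) on a grid; interpolate the value
# function between grid points.
grid = np.linspace(0.0, 1.0, 101)

def interp(V, b):
    return np.interp(b, grid, V)

V = np.zeros_like(grid)
for _ in range(500):
    V_new = np.empty_like(V)
    for i, b in enumerate(grid):
        belief = np.array([b, 1.0 - b])
        # Option 1: act 'blind' with action a; the belief evolves by the chain.
        blind = []
        for a in range(2):
            next_b = belief @ P[a]  # predicted next-state distribution
            blind.append(belief @ C[a] + gamma * interp(V, next_b[0]))
        # Option 2: pay c_sense, observe the state, then act optimally per state.
        sensed = 0.0
        for s in range(2):
            per_state = min(C[a, s] + gamma * interp(V, P[a, s, 0])
                            for a in range(2))
            sensed += belief[s] * per_state
        V_new[i] = min(min(blind), c_sense + sensed)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new
```

With these toy numbers, `V[i]` approximates the optimal discounted cost starting from belief `grid[i]`; the `min(...)` at the end is exactly the sense-versus-blind trade-off the abstract describes, and a cheaper `c_sense` shifts the optimum toward sensing more often.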
