A Role of Environmental Complexity on Representation Learning in Deep Reinforcement Learning Agents

We developed a simulated environment to train deep reinforcement learning agents on a shortcut usage navigation task, motivated by the Dual Solutions Paradigm test used for human navigators. We manipulated the frequency with which agents were exposed to a shortcut and a navigation cue, to investigate how these factors influence shortcut usage development. We find that all agents rapidly achieve optimal performance in closed shortcut trials once initial learning starts. However, their navigation speed and shortcut usage when it is open happen faster in agents with higher shortcut exposure. Analysis of the agents' artificial neural networks activity revealed that frequent presentation of a cue initially resulted in better encoding of the cue in the activity of individual nodes, compared to agents who encountered the cue less often. However, stronger cue representations were ultimately formed through the use of the cue in the context of navigation planning, rather than simply through exposure. We found that in all agents, spatial representations develop early in training and subsequently stabilize before navigation strategies fully develop, suggesting that having spatially consistent activations is necessary for basic navigation, but insufficient for advanced strategies. Further, using new analysis techniques, we found that the planned trajectory rather than the agent's immediate location is encoded in the agent's networks. Moreover, the encoding is represented at the population rather than the individual node level. These techniques could have broader applications in studying neural activity across populations of neurons or network nodes beyond individual activity patterns.
View on arXiv@article{liu2025_2407.03436, title={ A Role of Environmental Complexity on Representation Learning in Deep Reinforcement Learning Agents }, author={ Andrew Liu and Alla Borisyuk }, journal={arXiv preprint arXiv:2407.03436}, year={ 2025 } }