Recurrent Off-Policy Deep Reinforcement Learning Doesn't Have to be Slow
Tyler Clark
Christine Evers
Jonathon Hare
Main: 11 pages · Appendix: 21 pages · Bibliography: 4 pages · 39 figures · 8 tables
Abstract
Recurrent off-policy deep reinforcement learning models achieve state-of-the-art performance but are often sidelined due to their high computational demands. In response, we introduce RISE (Recurrent Integration via Simplified Encodings), a novel approach that leverages recurrent networks in any image-based off-policy RL setting without significant computational overhead by combining learnable and non-learnable encoder layers. Integrating RISE into leading non-recurrent off-policy RL algorithms yields a 35.6% human-normalized interquartile mean (IQM) performance improvement across the Atari benchmark. We analyze various implementation strategies to highlight the versatility and potential of our proposed framework.
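The abstract describes an encoder that mixes learnable and non-learnable layers before a recurrent network. Below is a minimal, hedged sketch of that idea in PyTorch: a frozen (randomly initialized, gradient-free) convolutional stage feeds a small learnable stage, whose per-frame embeddings are summarized by a GRU. All names (e.g. `RiseEncoder`), layer sizes, and the choice of GRU are illustrative assumptions, not the paper's actual RISE architecture.

```python
# Hypothetical sketch of a mixed learnable/non-learnable recurrent encoder.
# This is an assumption-based illustration, not the paper's RISE implementation.
import torch
import torch.nn as nn


class RiseEncoder(nn.Module):
    def __init__(self, in_channels: int = 4, hidden_dim: int = 256):
        super().__init__()
        # Non-learnable stage: randomly initialized convolutions with
        # gradients disabled, providing cheap fixed features.
        self.frozen = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        for p in self.frozen.parameters():
            p.requires_grad = False
        # Learnable stage: trained end-to-end with the RL loss.
        self.learnable = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(hidden_dim), nn.ReLU(),
        )
        # Recurrent core over the per-frame embeddings.
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)

    def forward(self, obs, h0=None):
        # obs: (batch, time, channels, height, width)
        b, t = obs.shape[:2]
        x = obs.flatten(0, 1)                # (b*t, c, h, w)
        x = self.learnable(self.frozen(x))   # (b*t, hidden_dim)
        x = x.view(b, t, -1)                 # (b, t, hidden_dim)
        out, hn = self.rnn(x, h0)            # recurrent summary of history
        return out, hn


if __name__ == "__main__":
    enc = RiseEncoder()
    frames = torch.randn(2, 8, 4, 84, 84)    # Atari-style stacked frames
    feats, state = enc(frames)
    print(feats.shape)                       # torch.Size([2, 8, 256])
```

One plausible reading of the design, consistent with the abstract's emphasis on low overhead: freezing part of the encoder shrinks the set of parameters that must be backpropagated through time, so the recurrent network can be added to an existing off-policy agent without a large increase in training cost.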
