Yes, Q-learning Helps Offline In-Context RL

24 February 2025
Denis Tarasov
Alexander Nikulin
Ilya Zisman
Albina Klepach
Andrei Polubarov
Nikita Lyubaykin
Alexander Derevyagin
Igor Kiselev
Vladislav Kurenkov
Communities: OffRL, OnRL
Abstract

In this work, we explore the integration of Reinforcement Learning (RL) approaches within a scalable offline In-Context RL (ICRL) framework. Through experiments across more than 150 datasets derived from GridWorld and MuJoCo environments, we demonstrate that optimizing RL objectives improves performance by approximately 40% on average compared to the well-established Algorithm Distillation (AD) baseline across various dataset coverages, structures, expertise levels, and environmental complexities. Our results also reveal that offline RL-based methods outperform online approaches, which are not specifically designed for offline scenarios. These findings underscore the importance of aligning the learning objectives with RL's reward-maximization goal and demonstrate that offline RL is a promising direction for ICRL settings.
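For intuition, the sketch below is not the authors' implementation; the module names, dimensions, token layout, and loss weighting are illustrative assumptions. It shows one way a Q-learning (TD) term can be added on top of an AD-style next-action prediction loss for a causal transformer trained on offline in-context trajectories.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch: a causal transformer consumes cross-episode context of
# (state, action one-hot, reward) tokens, as in Algorithm Distillation (AD),
# and an auxiliary Q-head is trained with a one-step TD objective.
# Token/target alignment is simplified for brevity.

class ICRLTransformer(nn.Module):
    def __init__(self, state_dim, num_actions, d_model=64, n_layers=2,
                 n_heads=4, max_len=256):
        super().__init__()
        # Each context token concatenates state, one-hot action, and reward.
        self.embed = nn.Linear(state_dim + num_actions + 1, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.policy_head = nn.Linear(d_model, num_actions)  # AD action logits
        self.q_head = nn.Linear(d_model, num_actions)       # per-action Q-values

    def forward(self, tokens):
        B, T, _ = tokens.shape
        h = self.embed(tokens) + self.pos(torch.arange(T, device=tokens.device))
        # Causal mask so each position only attends to earlier context.
        mask = torch.triu(torch.full((T, T), float("-inf"),
                                     device=tokens.device), diagonal=1)
        h = self.backbone(h, mask=mask)
        return self.policy_head(h), self.q_head(h)

def icrl_loss(model, tokens, actions, rewards, dones, gamma=0.99, td_weight=1.0):
    """AD cross-entropy on dataset actions plus a one-step TD error on the Q-head."""
    logits, q = model(tokens)
    # AD objective: imitate the dataset action at every context position.
    ad_loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                              actions.reshape(-1))
    # Q-learning objective: bootstrap from the max Q-value at the next position.
    q_sa = q[:, :-1].gather(-1, actions[:, :-1, None]).squeeze(-1)
    with torch.no_grad():
        target = rewards[:, :-1] + gamma * (1 - dones[:, :-1]) * q[:, 1:].max(-1).values
    td_loss = F.mse_loss(q_sa, target)
    return ad_loss + td_weight * td_loss
```

In this toy form the TD term is a plain max-backup; the paper's point is that adding such reward-maximizing objectives, rather than pure imitation, is what drives the reported gains over AD.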

@article{tarasov2025_2502.17666,
  title={Yes, Q-learning Helps Offline In-Context RL},
  author={Denis Tarasov and Alexander Nikulin and Ilya Zisman and Albina Klepach and Andrei Polubarov and Nikita Lyubaykin and Alexander Derevyagin and Igor Kiselev and Vladislav Kurenkov},
  journal={arXiv preprint arXiv:2502.17666},
  year={2025}
}