Symmetric Behavior Regularization via Taylor Expansion of Symmetry

6 August 2025
Lingwei Zhu
Zheng Chen
Yukie Nagai
    OffRL
arXiv:2508.04225
Main: 10 pages · Appendix: 8 pages · Bibliography: 4 pages · 12 figures · 10 tables
Abstract

This paper introduces symmetric divergences to behavior regularization policy optimization (BRPO) to establish a novel offline RL framework. Existing methods focus on asymmetric divergences such as the KL divergence to obtain an analytic regularized policy and a practical minimization objective. We show that symmetric divergences do not admit an analytic policy when used as the regularizer and can incur numerical issues when used as the loss. We tackle these challenges with the Taylor series of the f-divergence. Specifically, we prove that an analytic policy can be obtained with a finite series. For the loss, we observe that symmetric divergences can be decomposed into an asymmetric term and a conditional symmetry term; Taylor-expanding the latter alleviates the numerical issues. Putting these together, we propose Symmetric f Actor-Critic (Sf-AC), the first practical BRPO algorithm with symmetric divergences. Experimental results on distribution approximation and MuJoCo verify that Sf-AC performs competitively.
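To make the two ideas in the abstract concrete, here is a minimal, illustrative sketch — not the paper's Sf-AC algorithm, and not taken from it. Under the assumption of a Jeffreys-style symmetric regularizer (forward KL plus reverse KL) on a discrete toy problem, it contrasts the closed-form KL-regularized policy with a truncated Taylor expansion of the reverse-KL half around the behavior policy. All names (kl_regularized_policy, reverse_kl_taylor, alpha, the toy arrays) are hypothetical placeholders.

```python
import numpy as np

# Illustrative sketch only -- NOT the paper's Sf-AC algorithm. It shows, on a
# 5-action toy problem, the two ingredients the abstract refers to:
#   (1) KL behavior regularization admits an analytic (closed-form) policy;
#   (2) the "reverse" half of a symmetric (Jeffreys-style) divergence can be
#       Taylor-expanded around the behavior policy, so the loss stays
#       polynomial in pi and avoids log(pi) terms that blow up as pi -> 0.
# All names (q_values, behavior, alpha) are hypothetical placeholders.

def kl_regularized_policy(q_values, behavior, alpha):
    """Closed-form maximizer of E_pi[Q] - alpha * KL(pi || behavior)."""
    logits = np.log(behavior) + q_values / alpha
    w = np.exp(logits - logits.max())   # stabilized exponentiation
    return w / w.sum()                  # pi(a) proportional to b(a) * exp(Q(a)/alpha)

def reverse_kl_taylor(pi, behavior, order=2):
    """Truncated Taylor expansion of KL(behavior || pi) around pi = behavior.

    With t = (pi - b) / b:
        KL(b || pi) = -sum_a b(a) * log(1 + t(a))
                    = sum_{k>=1} (-1)^k * sum_a b(a) * t(a)^k / k,
    and the k = 1 term vanishes for normalized distributions. Accurate when
    pi stays close to the behavior policy (the behavior-regularized regime).
    """
    t = (pi - behavior) / behavior
    return sum((-1) ** k * np.sum(behavior * t ** k) / k
               for k in range(1, order + 1))

if __name__ == "__main__":
    behavior = np.array([0.30, 0.25, 0.20, 0.15, 0.10])    # toy behavior policy
    q_values = np.array([0.30, -0.10, 0.20, 0.00, -0.40])  # toy critic values
    pi = kl_regularized_policy(q_values, behavior, alpha=1.0)

    exact = np.sum(behavior * np.log(behavior / pi))
    for order in (2, 4, 8):
        approx = reverse_kl_taylor(pi, behavior, order)
        print(f"order={order}: Taylor KL(b||pi) = {approx:.5f}   exact = {exact:.5f}")
```

Running the sketch shows the truncated series approaching the exact reverse KL as the order grows, while the expansion itself never evaluates log(pi) — the kind of numerical benefit the abstract attributes to Taylor-expanding the symmetric part.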
