Co-Reinforcement Learning for Unified Multimodal Understanding and Generation

23 May 2025
Jingjing Jiang, Chongjie Si, Jun Luo, Hanwang Zhang, Chao Ma
Abstract

This paper presents a pioneering exploration of reinforcement learning (RL) via group relative policy optimization for unified multimodal large language models (ULMs), aimed at simultaneously reinforcing generation and understanding capabilities. Through systematic pilot studies, we uncover the significant potential of ULMs to enable the synergistic co-evolution of dual capabilities within a shared policy optimization framework. Building on this insight, we introduce CoRL, a co-reinforcement learning framework comprising a unified RL stage for joint optimization and a refined RL stage for task-specific enhancement. With the proposed CoRL, our resulting model, ULM-R1, achieves average improvements of 7% on three text-to-image generation datasets and 23% on nine multimodal understanding benchmarks. These results demonstrate the effectiveness of CoRL and highlight the substantial benefit of reinforcement learning in facilitating cross-task synergy and optimization for ULMs. Code is available at this https URL.
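
Below is a minimal sketch of the group-relative advantage normalization and clipped policy objective at the core of group relative policy optimization (GRPO), which CoRL applies to jointly reinforce generation and understanding. The shapes, the 0.5/0.5 reward mix, and the hyperparameters are illustrative assumptions, not the paper's implementation.

import torch

def group_relative_advantages(rewards, eps=1e-6):
    # rewards: (num_prompts, G) scalar rewards, one per sampled rollout.
    # GRPO normalizes each reward against its own group of G rollouts:
    # advantage = (reward - group mean) / group std.
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

def grpo_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    # logp_new / logp_old: (num_prompts, G) summed log-probs of each rollout
    # under the current and rollout-time policies; PPO-style ratio clipping.
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Toy usage: 2 prompts, 4 rollouts each. In a unified RL stage, each rollout's
# scalar reward could mix a generation signal (e.g., image-text alignment)
# with an understanding signal (e.g., answer correctness); the equal weighting
# here is a hypothetical choice.
gen_reward = torch.rand(2, 4)
und_reward = torch.rand(2, 4)
rewards = 0.5 * gen_reward + 0.5 * und_reward
adv = group_relative_advantages(rewards)
logp_old = torch.randn(2, 4)
logp_new = logp_old + 0.01 * torch.randn(2, 4)
print(grpo_loss(logp_new, logp_old, adv).item())

GRPO as originally published also adds a KL penalty against a frozen reference model; it is omitted here for brevity.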

@article{jiang2025_2505.17534,
  title={Co-Reinforcement Learning for Unified Multimodal Understanding and Generation},
  author={Jingjing Jiang and Chongjie Si and Jun Luo and Hanwang Zhang and Chao Ma},
  journal={arXiv preprint arXiv:2505.17534},
  year={2025}
}