SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning

27 February 2025
Zexiong Ma, Chao Peng, Pengfei Gao, Xiangxin Meng, Yanzhen Zou, Bing Xie
Communities: MoMe · OffRL · LRM
Abstract

Mainstream issue-resolving frameworks predominantly rely on commercial models, leading to high costs and privacy concerns. Existing training approaches for issue resolving struggle with poor generalization and fail to fully leverage open-source development resources. We propose Subtask-oriented Reinforced Fine-Tuning (SoRFT), a novel training approach that enhances the issue-resolving capability of LLMs. SoRFT decomposes issue resolving into structured subtasks: file localization, function localization, line localization, and code edit generation. SoRFT consists of two training stages: (1) rejection-sampled supervised fine-tuning, in which Chain-of-Thought (CoT) data is filtered against the ground truth before fine-tuning the LLM, and (2) rule-based reinforcement learning, which applies PPO with ground-truth-based rewards. We evaluate the SoRFT-trained model on SWE-Bench Verified and SWE-Bench Lite, achieving state-of-the-art (SOTA) performance among open-source models (e.g., resolving 21.4% of issues on SWE-Bench Verified with SoRFT-Qwen-7B). The experimental results demonstrate that SoRFT significantly enhances issue-resolving performance, improves model generalization, and provides a cost-efficient alternative to commercial models.
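
A minimal Python sketch of the two ground-truth-driven ingredients the abstract describes: rejection sampling of CoT data for the SFT stage, and a rule-based reward for the RL (PPO) stage. The Sample data shape, the exact-match rejection criterion, and the F1-style file-localization reward are illustrative assumptions, not the paper's exact recipe.

from dataclasses import dataclass

@dataclass
class Sample:
    issue: str                  # issue description shown to the model
    cot: str                    # sampled chain-of-thought ending in a final answer
    predicted_files: frozenset  # files the model localized in its final answer
    gold_files: frozenset       # files touched by the ground-truth patch

def keep_for_sft(sample: Sample) -> bool:
    # Stage 1 (rejection-sampled SFT): keep a CoT sample only if its final
    # answer agrees with the ground truth (here: exact match on the file set).
    return sample.predicted_files == sample.gold_files

def file_localization_reward(predicted: frozenset, gold: frozenset) -> float:
    # Stage 2 (rule-based RL): a ground-truth-based reward for the file
    # localization subtask, sketched as the F1 overlap between the predicted
    # and ground-truth file sets.
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Example: a sample that recovers one of two ground-truth files earns a
# partial RL reward but is rejected from the SFT set.
s = Sample("fix crash in parser", "...", frozenset({"parser.py"}),
           frozenset({"parser.py", "lexer.py"}))
print(keep_for_sft(s))                                            # False
print(file_localization_reward(s.predicted_files, s.gold_files))  # ~0.667

Analogous rule-based rewards could be defined for the function- and line-localization subtasks, and a patch-similarity score for code edit generation; the paper's actual reward rules may differ from this sketch.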

@article{ma2025_2502.20127,
  title={SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning},
  author={Zexiong Ma and Chao Peng and Pengfei Gao and Xiangxin Meng and Yanzhen Zou and Bing Xie},
  journal={arXiv preprint arXiv:2502.20127},
  year={2025}
}