SweRank+: Multilingual, Multi-Turn Code Ranking for Software Issue Localization

Revanth Gangi Reddy
Ye Liu
Wenting Zhao
JaeHyeok Doo
Tarun Suresh
Daniel Lee
Caiming Xiong
Yingbo Zhou
Semih Yavuz
Shafiq Joty
Abstract

Maintaining large-scale, multilingual codebases hinges on accurately localizing issues, which requires mapping natural-language error descriptions to the relevant functions that need to be modified. However, existing ranking approaches are often Python-centric and perform a single-pass search over the codebase. This work introduces SweRank+, a framework that couples SweRankMulti, a cross-lingual code ranking tool, with SweRankAgent, an agentic search setup, for iterative, multi-turn reasoning over the code repository. SweRankMulti comprises a code embedding retriever and a listwise LLM reranker, and is trained using a carefully curated large-scale issue localization dataset spanning multiple popular programming languages. SweRankAgent adopts an agentic search loop that moves beyond single-shot localization with a memory buffer to reason and accumulate relevant localization candidates over multiple turns. Our experiments on issue localization benchmarks spanning various languages demonstrate new state-of-the-art performance with SweRankMulti, while SweRankAgent further improves localization over single-pass ranking.
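The retrieve-then-rerank pipeline and the multi-turn loop with a memory buffer described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: bag-of-words cosine similarity stands in for the code embedding retriever, a token-coverage sort stands in for the listwise LLM reranker, and the query-refinement step (appending the code of newly found functions to the query) is a hypothetical choice. The names `retrieve`, `rerank`, `localize`, and the sample `repo` are likewise hypothetical.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (this also splits snake_case)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, functions, k):
    """Toy stand-in for an embedding retriever: top-k functions by cosine similarity."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(name + " " + body))), name)
        for name, body in functions.items()
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for score, name in scored[:k] if score > 0]


def rerank(issue, candidates, functions):
    """Toy stand-in for a listwise reranker: reorder the whole candidate list
    by how many distinct issue tokens each function covers."""
    q = set(tokenize(issue))

    def coverage(name):
        return len(q & set(tokenize(name + " " + functions[name])))

    return sorted(candidates, key=lambda n: (-coverage(n), n))


def localize(issue, functions, turns=2, k=1):
    """Multi-turn search: a memory buffer accumulates candidates across turns."""
    memory = []
    query = issue
    for _ in range(turns):
        # Search only functions that are not already in the memory buffer.
        pool = {n: b for n, b in functions.items() if n not in memory}
        ranked = rerank(issue, retrieve(query, pool, k), pool)
        if not ranked:
            break
        memory.extend(ranked)
        # Hypothetical refinement: expand the query with the code just found.
        query = issue + " " + " ".join(functions[n] for n in ranked)
    return memory


# Hypothetical three-function repository used only for illustration.
repo = {
    "parse_config": "def parse_config(path): return load_yaml(path)",
    "load_yaml": "def load_yaml(path): return yaml.safe_load(open(path))",
    "render_page": "def render_page(template): return template.format()",
}
candidates = localize("crash when parse config file is missing", repo, turns=2, k=1)
```

In this sketch the first turn can only match functions that share vocabulary with the issue text; the second turn reaches a callee through the code of the first hit, which is the kind of gain a single-pass search cannot obtain.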

Main: 8 pages
5 figures
Bibliography: 2 pages
4 tables
Appendix: 1 page