Let the Trial Begin: A Mock-Court Approach to Vulnerability Detection using LLM-Based Agents

16 May 2025
Ratnadira Widyasari
Martin Weyssow
Ivana Clairine Irsan
Han Wei Ang
Frank Liauw
Eng Lieh Ouh
Lwin Khin Shar
Hong Jin Kang
David Lo
    LLMAG
Abstract

Detecting vulnerabilities in source code remains a critical yet challenging task, especially when benign and vulnerable functions share significant similarities. In this work, we introduce VulTrial, a courtroom-inspired multi-agent framework designed to enhance automated vulnerability detection. It employs four role-specific agents: a security researcher, a code author, a moderator, and a review board. Through extensive experiments using GPT-3.5 and GPT-4o, we demonstrate that VulTrial outperforms single-agent and multi-agent baselines. Using GPT-4o, VulTrial improves performance by 102.39% and 84.17% over the respective baselines. Additionally, we show that role-specific instruction tuning in the multi-agent setting with a small amount of data (50 pair samples) further improves the performance of VulTrial, by 139.89% and 118.30%. Furthermore, we analyze the impact of increasing the number of agent interactions on VulTrial's overall performance. While multi-agent setups inherently incur higher costs due to increased token usage, our findings reveal that applying VulTrial to a cost-effective model such as GPT-3.5 can improve performance by 69.89% compared to GPT-4o in a single-agent setting, at a lower overall cost.
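
As a rough illustration of how a four-role mock-court exchange could be wired together, the following is a minimal sketch and not the authors' implementation. The abstract names only the four roles, so the ordering used here (researcher, then author, then moderator, then review board), the function names `run_trial` and `stub_llm`, the `rounds` parameter, and the "vulnerable"/"benign" verdict format are all assumptions made for this example. The model call is replaced by a canned stub so the sketch runs without any API.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function mapping (role, prompt) to a text reply.
LLM = Callable[[str, str], str]


def stub_llm(role: str, prompt: str) -> str:
    """Canned replies standing in for a real model call (assumption)."""
    canned = {
        "security_researcher": "Possible buffer overflow in the copy call.",
        "code_author": "The caller guarantees the source fits the buffer.",
        "moderator": "Researcher claims overflow; author cites a caller guarantee.",
        "review_board": "vulnerable: the guarantee is not enforced in this function.",
    }
    return canned[role]


def run_trial(code: str, llm: LLM, rounds: int = 1) -> Dict[str, object]:
    """Run an assumed researcher/author debate, then summarize and decide."""
    transcript: List[str] = []
    for _ in range(rounds):
        # The researcher argues that the code is vulnerable.
        claim = llm("security_researcher", code + "\n" + "\n".join(transcript))
        transcript.append("researcher: " + claim)
        # The code author responds to the accusation.
        rebuttal = llm("code_author", code + "\n" + "\n".join(transcript))
        transcript.append("author: " + rebuttal)
    # The moderator condenses the exchange for the review board.
    summary = llm("moderator", "\n".join(transcript))
    # The review board issues the final verdict (assumed text format).
    verdict = llm("review_board", code + "\n" + summary)
    return {
        "vulnerable": verdict.strip().lower().startswith("vulnerable"),
        "summary": summary,
        "transcript": transcript,
    }


result = run_trial("strcpy(buf, src);", stub_llm, rounds=2)
```

The `rounds` argument loosely mirrors the abstract's mention of varying the number of agent interactions; swapping `stub_llm` for a real model client is what would make multi-agent token costs grow with each additional round.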

@article{widyasari2025_2505.10961,
  title={Let the Trial Begin: A Mock-Court Approach to Vulnerability Detection using LLM-Based Agents},
  author={Ratnadira Widyasari and Martin Weyssow and Ivana Clairine Irsan and Han Wei Ang and Frank Liauw and Eng Lieh Ouh and Lwin Khin Shar and Hong Jin Kang and David Lo},
  journal={arXiv preprint arXiv:2505.10961},
  year={2025}
}