2408.14354
Cited By
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java
26 August 2024
Daoguang Zan
Zhirong Huang
Ailun Yu
Shaoxin Lin
Yifan Shi
Wei Liu
Dong Chen
Zongshuai Qi
Hao Yu
Lei Yu
Dezhi Ran
Muhan Zeng
Bo Shen
Pan Bian
Guangtai Liang
Bei Guan
Pengjie Huang
Tao Xie
Yongji Wang
Qianxiang Wang
HuggingFace (42 upvotes)
Papers citing
"SWE-bench-java: A GitHub Issue Resolving Benchmark for Java"
8 / 8 papers shown
Title
From CVE Entries to Verifiable Exploits: An Automated Multi-Agent Framework for Reproducing CVEs
Saad Ullah
Praneeth Balasubramanian
Wenbo Guo
Amanda Burnett
Hammond Pearce
Christopher Kruegel
Giovanni Vigna
Gianluca Stringhini
121
3
0
01 Sep 2025
SAEL: Leveraging Large Language Models with Adaptive Mixture-of-Experts for Smart Contract Vulnerability Detection
Lei Yu
Shiqi Cheng
Zhirong Huang
Jingyuan Zhang
Chenjie Shen
Junyi Lu
Li Yang
Fengjun Zhang
Jiajia Ma
AAML
85
2
0
30 Jul 2025
Evolutionary Perspectives on the Evaluation of LLM-Based AI Agents: A Comprehensive Survey
Jiachen Zhu
Menghui Zhu
Renting Rui
Rong Shan
Congmin Zheng
...
Jianghao Lin
Weiwen Liu
Ruiming Tang
Yong Yu
Weinan Zhang
LLMAG
ELM
210
6
0
06 Jun 2025
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Wendong Xu
Jing Xiong
Chenyang Zhao
Qiujiang Chen
Haoran Wang
...
Hongxia Yang
Bei Yu
Lingpeng Kong
Q. Gu
Ngai Wong
LRM
121
2
0
29 May 2025
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks
Hongyuan Tao
Ying Zhang
Zhenhao Tang
Hongen Peng
Xukun Zhu
...
Linchao Zhu
Rui Wang
Hang Yu
Jianguo Li
Peng Di
273
10
0
22 May 2025
Frontier AI's Impact on the Cybersecurity Landscape
Wenbo Guo
Tianneng Shi
Yu Yang
Andy Zhang
Patrick Gage Kelley
Kurt Thomas
Dawn Song
375
18
0
07 Apr 2025
Survey on Evaluation of LLM-based Agents
Asaf Yehudai
Lilach Eden
Alan Li
Guy Uziel
Yilun Zhao
Roy Bar-Haim
Arman Cohan
Michal Shmueli-Scheuer
LLMAG
ELM
413
62
0
20 Mar 2025
Programming with Pixels: Can Computer-Use Agents do Software Engineering?
Pranjal Aggarwal
Sean Welleck
294
1
0
24 Feb 2025