SoK: DARPA's AI Cyber Challenge (AIxCC): Competition Design, Architectures, and Lessons Learned

Cen Zhang
Younggi Park
Fabian Fleischer
Yu-Fu Fu
Jiho Kim
Dongkwan Kim
Youngjoon Kim
Qingxiao Xu
Andrew Chin
Ze Sheng
Hanqing Zhao
Brian J. Lee
Joshua Wang
Michael Pelican
David J. Musliner
Jeff Huang
Jon Silliman
Mikel Mcdaniel
Jefferson Casavant
Isaac Goldthwaite
Nicholas Vidovich
Matthew Lehman
Taesoo Kim
Main: 12 pages, Bibliography: 3 pages, Appendix: 11 pages, 10 figures, 23 tables
Abstract

DARPA's AI Cyber Challenge (AIxCC, 2023--2025) is the largest competition to date for building fully autonomous cyber reasoning systems (CRSs) that leverage recent advances in AI -- particularly large language models (LLMs) -- to discover and remediate vulnerabilities in real-world open-source software. This paper presents the first systematic analysis of AIxCC. Drawing on design documents, source code, execution traces, and discussions with organizers and competing teams, we examine the competition's structure and key design decisions, characterize the architectural approaches of finalist CRSs, and analyze competition results beyond the final scoreboard. Our analysis reveals the factors that truly drove CRS performance, identifies genuine technical advances achieved by teams, and exposes limitations that remain open for future research. We conclude with lessons for organizing future competitions and broader insights toward deploying autonomous CRSs in practice.
