Autoregressive, Yet Revisable: In-Decoding Revision for Secure Code Generation

Chengran Yang
Zichao Wei
Heminghao Deng
Jinfeng Jiang
Zhensu Sun
Ting Zhang
Tianyi Wu
Ming Wen
David Lo
Main: 8 pages · 6 figures · 5 tables
Bibliography: 3 pages · Appendix: 10 pages
Abstract

Large Language Model (LLM)-based code generation is predominantly formulated as a strictly monotonic process that appends tokens linearly to an immutable prefix. This formulation contrasts with the cognitive process of programming, which inherently interleaves forward generation with on-the-fly revision. Prior works attempt to introduce revision via post-hoc agents or external static tools, but they either suffer from high latency or fail to leverage the model's intrinsic semantic reasoning. In this paper, we propose Stream of Revision, a paradigm shift that elevates code generation from a monotonic stream to a dynamic, self-correcting trajectory by leveraging the model's intrinsic capabilities. We introduce dedicated action tokens that enable the model to seamlessly backtrack and edit its own history within a single forward pass. By internalizing the revision loop, Stream of Revision allows the model to activate its latent self-correction capabilities just-in-time, without external dependencies. Empirical results on secure code generation show that Stream of Revision significantly reduces vulnerabilities with minimal inference overhead.
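The abstract's core mechanism, action tokens that let the decoder edit its own history mid-stream, can be sketched in a few lines. Note this is a minimal illustration, not the paper's implementation: the action-token format `<BACK:k>` and the fold logic below are assumptions for demonstration only.

```python
# Illustrative sketch of in-decoding revision: the decoder emits ordinary
# tokens interleaved with hypothetical action tokens. "<BACK:k>" (an assumed
# format, not from the paper) backtracks by deleting the last k tokens from
# the generated history; decoding then continues from the revised prefix.

def apply_revision_stream(stream):
    """Fold a raw decoding stream into the final token sequence."""
    out = []
    for tok in stream:
        if tok.startswith("<BACK:") and tok.endswith(">"):
            k = int(tok[len("<BACK:"):-1])
            # Revise: drop the last k tokens (clamped to the history length).
            del out[max(len(out) - k, 0):]
        else:
            out.append(tok)
    return out

# Example: the model emits an unsafe SQL-building token, backtracks over it,
# and re-generates a safer variant -- all within one linear stream.
raw = ["query", "=", "fstring_sql", "<BACK:1>", "parameterized_sql"]
print(apply_revision_stream(raw))  # ['query', '=', 'parameterized_sql']
```

In a real decoder this fold would run inside the sampling loop, so the revised prefix (not the raw stream) conditions subsequent token predictions.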
