arXiv:2306.09896
Is Self-Repair a Silver Bullet for Code Generation?
16 June 2023
Theo X. Olausson
J. Inala
Chenglong Wang
Jianfeng Gao
Armando Solar-Lezama
LRM
Papers citing "Is Self-Repair a Silver Bullet for Code Generation?"
50 / 75 papers shown
Title
CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation
Anirudh Khatry
Robert Zhang
Jia Pan
Ziteng Wang
Qiaochu Chen
Greg Durrett
Isil Dillig
32
0
0
21 Apr 2025
EvalAgent: Discovering Implicit Evaluation Criteria from the Web
Manya Wadhwa
Zayne Sprague
Chaitanya Malaviya
Philippe Laban
Junyi Jessy Li
Greg Durrett
25
0
0
21 Apr 2025
Weight Ensembling Improves Reasoning in Language Models
Xingyu Dang
Christina Baek
Kaiyue Wen
Zico Kolter
Aditi Raghunathan
MoMe
LRM
60
1
0
14 Apr 2025
Imperative vs. Declarative Programming Paradigms for Open-Universe Scene Generation
Maxim Gumin
Do Heon Han
Seung Jean Yoo
Aditya Ganeshan
R. K. Jones
Rio Aguina-Kang
Stewart Morris
Daniel E. Ritchie
23
0
0
07 Apr 2025
Beyond Accuracy: The Role of Calibration in Self-Improving Large Language Models
Liangjie Huang
Dawei Li
Huan Liu
Lu Cheng
LRM
34
0
0
03 Apr 2025
Attention-Aware Multi-View Pedestrian Tracking
Reef Alturki
Adrian Hilton
Jean-Yves Guillemaut
28
0
0
03 Apr 2025
debug-gym: A Text-Based Environment for Interactive Debugging
Xingdi Yuan
Morgane M Moss
Charbel El Feghali
Chinmay Singh
Darya Moldavskaya
...
Lucas Page-Caccia
Matheus Pereira
Minseon Kim
Alessandro Sordoni
Marc-Alexandre Côté
LLMAG
68
1
0
27 Mar 2025
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration
David Wan
Justin Chih-Yao Chen
Elias Stengel-Eskin
Mohit Bansal
LLMAG
LRM
60
1
0
19 Mar 2025
MetaScale: Test-Time Scaling with Evolving Meta-Thoughts
Qin Liu
Wenxuan Zhou
Nan Xu
James Y. Huang
Fei-Yue Wang
Sheng Zhang
Hoifung Poon
M. Chen
LLMAG
ReLM
AI4Cl
LRM
87
1
0
17 Mar 2025
From Voice to Safety: Language AI Powered Pilot-ATC Communication Understanding for Airport Surface Movement Collision Risk Assessment
Yutian Pang
Andrew Paul Kendall
Alex Porcayo
Mariah Barsotti
Anahita Jain
John-Paul Clarke
46
0
0
06 Mar 2025
SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair
Zaoyu Chen
Haoran Qin
Nuo Chen
Xiangyu Zhao
Lei Xue
Xiapu Luo
Xiao-Ming Wu
41
0
0
03 Mar 2025
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
Hojae Han
Seung-won Hwang
Rajhans Samdani
Yuxiong He
ALM
65
2
0
27 Feb 2025
Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation
Shiven Sinha
Shashwat Goel
Ponnurangam Kumaraguru
Jonas Geiping
Matthias Bethge
Ameya Prabhu
ReLM
ELM
LRM
124
0
0
26 Feb 2025
Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way
Chenglong Wang
Bongshin Lee
Steven Drucker
Dan Marshall
Jianfeng Gao
38
1
0
24 Feb 2025
Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection
Boyu Mi
Hanqing Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
67
0
0
21 Feb 2025
CoCoEvo: Co-Evolution of Programs and Test Cases to Enhance Code Generation
Kefan Li
Hongyue Yu
Tingyu Guo
Shijie Cao
Yuan Yuan
35
0
0
15 Feb 2025
LLMs can implicitly learn from mistakes in-context
Lisa Alazraki
Maximilian Mozes
Jon Ander Campos
Yi Chern Tan
Marek Rei
Max Bartolo
ReLM
LRM
88
0
0
12 Feb 2025
AuPair: Golden Example Pairs for Code Repair
Aditi Mavalankar
Hassan Mansoor
Zita Marinho
Masha Samsikova
Tom Schaul
KELM
LRM
58
0
0
12 Feb 2025
Learning to Generate Unit Tests for Automated Debugging
Archiki Prasad
Elias Stengel-Eskin
Justin Chih-Yao Chen
Zaid Khan
Mohit Bansal
ELM
76
1
0
03 Feb 2025
Enhancing Code LLMs with Reinforcement Learning in Code Generation: A Survey
Junqiao Wang
Zeng Zhang
Yangfan He
Yuyang Song
Tianyu Shi
...
Hengyuan Xu
Kunyu Wu
Guangwu Qian
Qiuwu Chen
Lewei He
38
8
0
03 Jan 2025
PromptV: Leveraging LLM-powered Multi-Agent Prompting for High-quality Verilog Generation
Zhendong Mi
Renming Zheng
Haowen Zhong
Yue Sun
Shaoyi Huang
74
0
0
15 Dec 2024
Self-Explained Keywords Empower Large Language Models for Code Generation
Lishui Fan
Mouxiang Chen
Zhongxin Liu
38
1
0
21 Oct 2024
Automated Proof Generation for Rust Code via Self-Evolution
Tianyu Chen
Shuai Lu
Shan Lu
Y. Gong
Chenyuan Yang
...
Peng Cheng
Fan Yang
Shuvendu Lahiri
Tao Xie
Lidong Zhou
37
6
0
21 Oct 2024
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du
Jinyi Han
Yizhou Ying
Aili Chen
Qianyu He
...
Haoran Guo
Jiaqing Liang
Zulong Chen
Liangyue Li
Yanghua Xiao
KELM
CLL
LRM
25
1
0
17 Oct 2024
Divide-Verify-Refine: Can LLMs Self-Align with Complex Instructions?
Xianren Zhang
Xianfeng Tang
Hui Liu
Zongyu Wu
Qi He
Dongwon Lee
Suhang Wang
ALM
38
0
0
16 Oct 2024
Mastering the Craft of Data Synthesis for CodeLLMs
Meng Chen
Philip Arthur
Qianyu Feng
Cong Duy Vu Hoang
Yu-Heng Hong
...
Mark Johnson
K. K.
Don Dharmasiri
Long Duong
Yuan-Fang Li
SyDa
46
1
0
16 Oct 2024
RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning
Jonas Gehring
Kunhao Zheng
Jade Copet
Vegard Mella
Taco Cohen
Gabriel Synnaeve
LLMAG
27
20
0
02 Oct 2024
RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance
Haolin Jin
Zechao Sun
Huaming Chen
LLMAG
43
2
0
02 Oct 2024
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
Yuling Shi
Songsong Wang
Chengcheng Wan
Xiaodong Gu
ELM
19
6
0
02 Oct 2024
Data Analysis in the Era of Generative AI
J. Inala
Chenglong Wang
Steven Drucker
Gonzalo Ramos
Victor C. Dibia
N. Riche
Dave Brown
Dan Marshall
Jianfeng Gao
20
6
0
27 Sep 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
93
79
0
18 Sep 2024
USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding
Shuai Wang
Liang Ding
Li Shen
Yong Luo
Zheng He
Wei Yu
Dacheng Tao
30
0
0
09 Sep 2024
A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement
Huan Zhang
Wei Cheng
Yuhan Wu
Wei Hu
LLMAG
31
1
0
08 Sep 2024
Large Language Model-Based Agents for Software Engineering: A Survey
Junwei Liu
Kaixin Wang
Yixuan Chen
Xin Peng
Zhenpeng Chen
Lingming Zhang
Yiling Lou
AI4CE
LLMAG
LM&Ro
42
36
0
04 Sep 2024
COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis
Weiqing Yang
Hanbin Wang
Zhenghao Liu
Xinze Li
Yukun Yan
Shuo Wang
Yu Gu
Minghe Yu
Zhiyuan Liu
Ge Yu
38
2
0
09 Aug 2024
WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization
Liwenhan Xie
Chengbo Zheng
Haijun Xia
Huamin Qu
Zhu-Tian Chen
LLMAG
26
10
0
03 Aug 2024
Effective Large Language Model Debugging with Best-first Tree Search
Jialin Song
Jonathan Raiman
Bryan Catanzaro
LRM
33
0
0
26 Jul 2024
Learning to Refine with Fine-Grained Natural Language Feedback
Manya Wadhwa
Xinyu Zhao
Junyi Jessy Li
Greg Durrett
18
11
0
02 Jul 2024
Agentless: Demystifying LLM-based Software Engineering Agents
Chunqiu Steven Xia
Yinlin Deng
Soren Dunn
Lingming Zhang
LLMAG
32
78
0
01 Jul 2024
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models
Sean Welleck
Amanda Bertsch
Matthew Finlayson
Hailey Schoelkopf
Alex Xie
Graham Neubig
Ilia Kulikov
Zaid Harchaoui
33
45
0
24 Jun 2024
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
Hung Le
Yingbo Zhou
Caiming Xiong
Silvio Savarese
Doyen Sahoo
43
2
0
23 Jun 2024
WebCanvas: Benchmarking Web Agents in Online Environments
Yichen Pan
Dehan Kong
Sida Zhou
Cheng Cui
Yifei Leng
...
Hangyu Liu
Yanyi Shang
Shuyan Zhou
Tongshuang Wu
Zhengyang Wu
21
26
0
18 Jun 2024
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
Joykirat Singh
A. Nambi
Vibhav Vineet
LRM
27
5
0
16 Jun 2024
Is Programming by Example solved by LLMs?
Wen-Ding Li
Kevin Ellis
24
9
0
12 Jun 2024
Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation
Nachiket Kotalwar
Alkis Gotovos
Adish Singla
ALM
42
4
0
07 Jun 2024
Synthetic Programming Elicitation and Repair for Text-to-Code in Very Low-Resource Programming Languages
Federico Mora
Justin Wong
Haley Lepe
Sahil Bhatia
Karim Elmaaroufi
George Varghese
Joseph E. Gonzalez
Elizabeth Polgreen
S. Seshia
SyDa
19
3
0
05 Jun 2024
Learning to Edit Visual Programs with Self-Supervision
R. K. Jones
Renhao Zhang
Aditya Ganeshan
Daniel E. Ritchie
SSL
24
3
0
04 Jun 2024
On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept
Guangliang Liu
Haitao Mao
Bochuan Cao
Zhiyu Xue
K. Johnson
Jiliang Tang
Rongrong Wang
LRM
24
9
0
04 Jun 2024
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
Ryo Kamoi
Yusen Zhang
Nan Zhang
Jiawei Han
Rui Zhang
LRM
40
57
0
03 Jun 2024
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
Hao Tang
Keya Hu
Jin Peng Zhou
Sicheng Zhong
Wei-Long Zheng
Xujie Si
Kevin Ellis
16
13
0
26 May 2024