InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback

26 June 2023

Akshara Prabhakar

Karthik Narasimhan

Papers citing "InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback"

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks Vishnu Sarukkai Zhiqiang Xie Kayvon Fatahalian LLMAG 68 0 0 01 May 2025
CodeFlowBench: A Multi-turn, Iterative Benchmark for Complex Code Generation Sizhe Wang Z. Wang Dongsheng Ma Yongan Yu Rui Ling Z. Li Feiyu Xiong W. Zhang LRM 50 0 0 30 Apr 2025
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay Akshara Prabhakar Z. Liu Weiran Yao Jianguo Zhang Ming Zhu ... Juan Carlos Niebles Shelby Heinecke H. Wang S. Caiming Xiong VGen 74 1 0 04 Apr 2025
A Framework for Evaluating Emerging Cyberattack Capabilities of AI Mikel Rodriguez Raluca Ada Popa Four Flynn Lihao Liang Allan Dafoe Anna Wang ELM 53 2 0 14 Mar 2025
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration Minghang Deng Ashwin Ramachandran Canwen Xu Lanxiang Hu Zhewei Yao Anupam Datta Hao Zhang LMTD 110 1 0 02 Feb 2025
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows Fangyu Lei Jixuan Chen Yuxiao Ye Ruisheng Cao Dongchan Shin ... Caiming Xiong Ruoxi Sun Qian Liu Sida I. Wang Tao Yu LMTD 71 20 0 12 Nov 2024
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories Yifan Song Weimin Xiong Xiutian Zhao Dawei Zhu Wenhao Wu Ke Wang Cheng Li Wei Peng Sujian Li LLMAG 11 9 0 10 Oct 2024
ScriptSmith: A Unified LLM Framework for Enhancing IT Operations via Automated Bash Script Generation, Assessment, and Refinement Oishik Chatterjee Pooja Aggarwal Suranjana Samanta Ting Dai P. Mohapatra ... Ruchi Mahindru Steve Barbieri Eugen Postea Brad Blancett Arthur De Magalhaes 13 1 0 12 Sep 2024
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents Peter Alexander Jansen Marc-Alexandre Côté Tushar Khot Erin Bransom Bhavana Dalvi Mishra Bodhisattwa Prasad Majumder Oyvind Tafjord Peter Clark LLMAG 27 21 0 10 Jun 2024
Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance Yewei Song Cedric Lothritz Daniel Tang Tegawende F. Bissyande Jacques Klein 27 9 0 12 Apr 2024
AI capabilities can be significantly improved without expensive retraining Tom Davidson Jean-Stanislas Denain Pablo Villalobos Guillem Bas OffRL VLM 8 26 0 12 Dec 2023
ADaPT: As-Needed Decomposition and Planning with Language Models Archiki Prasad Alexander Koller Mareike Hartmann Peter Clark Ashish Sabharwal Mohit Bansal Tushar Khot LM&Ro 10 74 0 08 Nov 2023
Cognitive Architectures for Language Agents T. Sumers Shunyu Yao Karthik Narasimhan Thomas L. Griffiths LLMAG LM&Ro 25 150 0 05 Sep 2023
ReAct: Synergizing Reasoning and Acting in Language Models Shunyu Yao Jeffrey Zhao Dian Yu Nan Du Izhak Shafran Karthik Narasimhan Yuan Cao LLMAG ReLM LRM 208 2,413 0 06 Oct 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning Hung Le Yue Wang Akhilesh Deepak Gotmare Silvio Savarese S. Hoi SyDa ALM 118 232 0 05 Jul 2022
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation Yue Wang Weishi Wang Shafiq R. Joty S. Hoi 196 1,451 0 02 Sep 2021
Measuring Coding Challenge Competence With APPS Dan Hendrycks Steven Basart Saurav Kadavath Mantas Mazeika Akul Arora ... Collin Burns Samir Puranik Horace He D. Song Jacob Steinhardt ELM AIMat ALM 189 614 0 20 May 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation Shuai Lu Daya Guo Shuo Ren Junjie Huang Alexey Svyatkovskiy ... Nan Duan Neel Sundaresan Shao Kun Deng Shengyu Fu Shujie Liu ELM 183 1,098 0 09 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe ... Horace He Anish Thite Noa Nabeshima Shawn Presser Connor Leahy AIMat 236 1,508 0 31 Dec 2020