ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.14898
  4. Cited By
InterCode: Standardizing and Benchmarking Interactive Coding with
  Execution Feedback

InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback

26 June 2023
John Yang
Akshara Prabhakar
Karthik Narasimhan
Shunyu Yao
ArXivPDFHTML

Papers citing "InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback"

19 / 19 papers shown
Title
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
Vishnu Sarukkai
Zhiqiang Xie
Kayvon Fatahalian
LLMAG
68
0
0
01 May 2025
CodeFlowBench: A Multi-turn, Iterative Benchmark for Complex Code Generation
CodeFlowBench: A Multi-turn, Iterative Benchmark for Complex Code Generation
Sizhe Wang
Z. Wang
Dongsheng Ma
Yongan Yu
Rui Ling
Z. Li
Feiyu Xiong
W. Zhang
LRM
50
0
0
30 Apr 2025
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
Akshara Prabhakar
Z. Liu
Weiran Yao
Jianguo Zhang
Ming Zhu
...
Juan Carlos Niebles
Shelby Heinecke
H. Wang
S.
Caiming Xiong
VGen
74
1
0
04 Apr 2025
A Framework for Evaluating Emerging Cyberattack Capabilities of AI
A Framework for Evaluating Emerging Cyberattack Capabilities of AI
Mikel Rodriguez
Raluca Ada Popa
Four Flynn
Lihao Liang
Allan Dafoe
Anna Wang
ELM
53
2
0
14 Mar 2025
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration
Minghang Deng
Ashwin Ramachandran
Canwen Xu
Lanxiang Hu
Zhewei Yao
Anupam Datta
Hao Zhang
LMTD
110
1
0
02 Feb 2025
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
Fangyu Lei
Jixuan Chen
Yuxiao Ye
Ruisheng Cao
Dongchan Shin
...
Caiming Xiong
Ruoxi Sun
Qian Liu
Sida I. Wang
Tao Yu
LMTD
71
20
0
12 Nov 2024
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+
  Interaction Trajectories
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
Yifan Song
Weimin Xiong
Xiutian Zhao
Dawei Zhu
Wenhao Wu
Ke Wang
Cheng Li
Wei Peng
Sujian Li
LLMAG
11
9
0
10 Oct 2024
ScriptSmith: A Unified LLM Framework for Enhancing IT Operations via
  Automated Bash Script Generation, Assessment, and Refinement
ScriptSmith: A Unified LLM Framework for Enhancing IT Operations via Automated Bash Script Generation, Assessment, and Refinement
Oishik Chatterjee
Pooja Aggarwal
Suranjana Samanta
Ting Dai
P. Mohapatra
...
Ruchi Mahindru
Steve Barbieri
Eugen Postea
Brad Blancett
Arthur De Magalhaes
13
1
0
12 Sep 2024
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating
  Automated Scientific Discovery Agents
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents
Peter Alexander Jansen
Marc-Alexandre Côté
Tushar Khot
Erin Bransom
Bhavana Dalvi Mishra
Bodhisattwa Prasad Majumder
Oyvind Tafjord
Peter Clark
LLMAG
27
21
0
10 Jun 2024
Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit
  Distance
Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance
Yewei Song
Cedric Lothritz
Daniel Tang
Tegawende F. Bissyande
Jacques Klein
27
9
0
12 Apr 2024
AI capabilities can be significantly improved without expensive
  retraining
AI capabilities can be significantly improved without expensive retraining
Tom Davidson
Jean-Stanislas Denain
Pablo Villalobos
Guillem Bas
OffRL
VLM
8
26
0
12 Dec 2023
ADaPT: As-Needed Decomposition and Planning with Language Models
ADaPT: As-Needed Decomposition and Planning with Language Models
Archiki Prasad
Alexander Koller
Mareike Hartmann
Peter Clark
Ashish Sabharwal
Mohit Bansal
Tushar Khot
LM&Ro
10
74
0
08 Nov 2023
Cognitive Architectures for Language Agents
Cognitive Architectures for Language Agents
T. Sumers
Shunyu Yao
Karthik Narasimhan
Thomas L. Griffiths
LLMAG
LM&Ro
25
150
0
05 Sep 2023
ReAct: Synergizing Reasoning and Acting in Language Models
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
208
2,413
0
06 Oct 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep
  Reinforcement Learning
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
S. Hoi
SyDa
ALM
118
232
0
05 Jul 2022
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for
  Code Understanding and Generation
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
196
1,451
0
02 Sep 2021
Measuring Coding Challenge Competence With APPS
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
189
614
0
20 May 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding
  and Generation
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
...
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu
ELM
183
1,098
0
09 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020
1