CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks

31 March 2024
Yiqing Xie
Alex Xie
Divyanshu Sheth
Pengfei Liu
Daniel Fried
Carolyn Rose

Papers citing "CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks"

12 / 12 papers shown
Beyond Benchmark: LLMs Evaluation with an Anthropomorphic and Value-oriented Roadmap
Jun Wang
Ninglun Gu
Kailai Zhang
Zijiao Zhang
Yelun Bao
...
Liwei Liu
Yihuan Liu
Pengyong Li
Gary G. Yen
Junchi Yan
ALM, ELM
216
0
0
26 Aug 2025
CrossPL: Evaluating Large Language Models on Cross Programming Language Code Generation
Zhanhang Xiong
Dongxia Wang
Yuekang Li
Xinyuan An
Wenhai Wang
141
0
0
26 Jul 2025
CodeSense: a Real-World Benchmark and Dataset for Code Semantic Reasoning
Monoshi Kumar Roy
Simin Chen
Benjamin Steenhoek
Jinjun Peng
Gail E. Kaiser
Baishakhi Ray
Wei Le
LRM
242
4
0
31 May 2025
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
Manish Shetty
Naman Jain
Jinjian Liu
Vijay Kethanaboyina
Koushik Sen
Ion Stoica
ELM
269
10
0
29 May 2025
Large Language Models for IT Automation Tasks: Are We There Yet?
Md Mahadi Hassan
John Salvador
Akond Rahman
S. Karmaker
189
1
0
26 May 2025
Towards an Understanding of Context Utilization in Code Intelligence
Yanlin Wang
Kefeng Duan
Dewu Zheng
Ensheng Shi
F. Zhang
...
Xilin Liu
Yuchi Ma
Hongyu Zhang
Qianxiang Wang
Zibin Zheng
256
3
0
11 Apr 2025
ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Kaiyuan Liu
Youcheng Pan
Junlin Li
Daojing He
Yang Xiang
Yexing Du
Tianrun Gao
ELM, LLMAG
304
9
0
10 Mar 2025
LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks
Xin Zhou
Martin Weyssow
Ratnadira Widyasari
Ting Zhang
Junda He
Yunbo Lyu
Jianming Chang
Beiqi Zhang
Dan Huang
David Lo
PILM
994
26
0
10 Feb 2025
DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation
AAAI Conference on Artificial Intelligence (AAAI), 2024
Qiming Zhu
Jialun Cao
Yaojie Lu
Hongyu Lin
Xianpei Han
Le Sun
Shing-Chi Cheung
ALM
145
18
0
23 Aug 2024
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates
Zeyu Leo Liu
Shrey Pandit
Xi Ye
Eunsol Choi
Greg Durrett
KELM, ALM
395
13
0
08 Jul 2024
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo
Minh Chien Vu
Jenny Chim
Han Hu
Wenhao Yu
...
David Lo
Daniel Fried
Xiaoning Du
H. D. Vries
Leandro von Werra
603
371
0
22 Jun 2024
Benchmarks and Metrics for Evaluations of Code Generation: A Critical Review
International Conference on Artificial Intelligence Testing (ICAIT), 2024
Debalina Ghosh Paul
Hong Zhu
Ian Bayley
ALM, ELM
186
31
0
18 Jun 2024