2406.12655
Cited By
Benchmarks and Metrics for Evaluations of Code Generation: A Critical Review
18 June 2024
Debalina Ghosh Paul
Hong Zhu
Ian Bayley
ALM
ELM
Papers citing "Benchmarks and Metrics for Evaluations of Code Generation: A Critical Review"
2 / 2 papers shown
Title
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Jiawei Liu
Chun Xia
Yuyao Wang
Lingming Zhang
ELM
ALM
163
388
0
02 May 2023
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
189
614
0
20 May 2021