arXiv: 2404.02806
The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers
3 April 2024
Hussein Mozannar, Valerie Chen, Mohammed Alsobay, Subhro Das, Sebastian Zhao, Dennis L. Wei, Manish Nagireddy, P. Sattigeri, Ameet Talwalkar, David Sontag
ELM
Papers citing "The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers" (8 papers)
How Accurately Do Large Language Models Understand Code?
Sabaat Haroon, Ahmad Faraz Khan, Ahmad Humayun, Waris Gill, Abdul Haddi Amjad, A. R. Butt, Mohammad Taha Khan, Muhammad Ali Gulzar
ELM LRM · 06 Apr 2025

SPHERE: An Evaluation Card for Human-AI Systems
Qianou Ma, Dora Zhao, Xinran Zhao, Chenglei Si, Chenyang Yang, Ryan Louie, Ehud Reiter, Diyi Yang, Tongshuang Wu
ALM · 24 Mar 2025

How Do Analysts Understand and Verify AI-Assisted Data Analyses?
Ken Gu, Ruoxi Shang, Tim Althoff, Chenglong Wang, Steven Drucker
AAML · 19 Sep 2023

Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Jiawei Liu, Chun Xia, Yuyao Wang, Lingming Zhang
ELM ALM · 02 May 2023

Aligning Offline Metrics and Human Judgments of Value for Code Generation Models
Victor C. Dibia, Adam Fourney, Gagan Bansal, Forough Poursabzi-Sangdeh, Han Liu, Saleema Amershi
ALM OffRL · 29 Oct 2022

Grounded Copilot: How Programmers Interact with Code-Generating Models
Shraddha Barke, M. James, Nadia Polikarpova
30 Jun 2022

Productivity Assessment of Neural Code Completion
Albert Ziegler, Eirini Kalliamvakou, Shawn Simister, Ganesh Sittampalam, Alice Li, Andrew Rice, Devon Rifkin, E. Aftandilian
13 May 2022

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, ..., Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu
ELM · 09 Feb 2021