2311.09835
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code
16 November 2023
Xiangru Tang
Yuliang Liu
Zefan Cai
Yan Shao
Junjie Lu
Yichi Zhang
Zexuan Deng
Helan Hu
Kaikai An
Ruijun Huang
Shuzheng Si
Sheng Chen
Haozhe Zhao
Liang Chen
Yan Wang
Tianyu Liu
Zhiwei Jiang
Baobao Chang
Yiming Zong
Yujia Qin
Wangchunshu Zhou
Yilun Zhao
Arman Cohan
Mark B. Gerstein
ELM
LLMAG
Papers citing "ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code"
4 / 4 papers shown
Title
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Minju Seo
Jinheon Baek
Seongyun Lee
S. Hwang
AI4CE
35
0
0
24 Apr 2025
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
John Yang
Carlos E. Jimenez
Alexander Wettig
K. Lieret
Shunyu Yao
Karthik Narasimhan
Ofir Press
LLMAG
99
188
0
06 May 2024
LLM vs. Lawyers: Identifying a Subset of Summary Judgments in a Large UK Case Law Dataset
Ahmed Izzidien
Holli Sargeant
Felix Steffek
AILaw
ELM
29
7
0
04 Mar 2024
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
194
614
0
20 May 2021