2407.13168
Cited By
SciCode: A Research Coding Benchmark Curated by Scientists
18 July 2024
Minyang Tian
Luyu Gao
Shizhuo Dylan Zhang
Xinan Chen
Cunwei Fan
Xuefei Guo
Roland Haas
Pan Ji
K. Krongchon
Yao Li
Shengyan Liu
Di Luo
Yutao Ma
Hao Tong
Kha Trinh
Chenyu Tian
Zihan Wang
Bohao Wu
Yanyu Xiong
Shengzhu Yin
Min Zhu
K. Lieret
Yanxin Lu
Genglin Liu
Yufeng Du
Tianhua Tao
Ofir Press
Jamie Callan
Eliu A. Huerta
Hao Peng
ELM
Papers citing "SciCode: A Research Coding Benchmark Curated by Scientists"
10 / 10 papers shown
Title
CodePDE: An Inference Framework for LLM-driven PDE Solver Generation
Shanda Li
Tanya Marwah
Junhong Shen
W. Sun
Andrej Risteski
Yiming Yang
Ameet Talwalkar
AI4CE
17
0
0
13 May 2025
ResearchCodeAgent: An LLM Multi-Agent System for Automated Codification of Research Methodologies
Shubham Gandhi
Dhruv Shah
Manasi S. Patwardhan
L. Vig
Gautam M. Shroff
LLMAG
AI4CE
62
0
0
28 Apr 2025
Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents
Shuo Ren
Pu Jian
Zhenjiang Ren
Chunlin Leng
Can Xie
Jiajun Zhang
LLMAG
AI4CE
57
0
0
31 Mar 2025
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
Samuel Miserendino
M. Wang
Tejal Patwardhan
Johannes Heidecke
41
17
0
17 Feb 2025
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
Zachary S. Siegel
Sayash Kapoor
Nitya Nadgir
Benedikt Stroebl
Arvind Narayanan
27
8
0
17 Sep 2024
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
125
137
0
19 Sep 2023
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Jiawei Liu
Chun Xia
Yuyao Wang
Lingming Zhang
ELM
ALM
178
780
0
02 May 2023
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
204
1,451
0
02 Sep 2021
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
194
614
0
20 May 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018