2507.04952
Cited By
ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation
7 July 2025
Chenchen Zhang
Yuhang Li
Can Xu
Jiaheng Liu
Ao Liu
Changzhi Zhou
K. Deng
Dengpeng Wu
Guanhua Huang
K. Li
Qi Yi
Ruibin Xiong
Shihui Hu
Yue Zhang
Yuhao Jiang
Zenan Xu
Yuanxing Zhang
Wiggin Zhou
Chayse Zhou
Fengzong Lian
Papers citing
"ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation"
6 / 6 papers shown
EWE: An Agentic Framework for Extreme Weather Analysis
Zhe Jiang
Jiong Wang
Xiaoyu Yue
Zijie Guo
Wenlong Zhang
Fenghua Ling
Wanli Ouyang
L. Bai
164
1
0
26 Nov 2025
SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models
Jingxuan Xu
K. Deng
W. Li
Songwei Yu
Huaixi Tang
...
Zhaoxiang Zhang
Yuqun Zhang
H. Zhang
Bin Chen
Jiaheng Liu
ELM
351
1
0
07 Nov 2025
VinciCoder: Unifying Multimodal Code Generation via Coarse-to-fine Visual Reinforcement Learning
Xuanle Zhao
Deyang Jiang
Zhixiong Zeng
Lei Chen
Haibo Qiu
Jing Huang
Yufeng Zhong
Liming Zheng
Yilin Cao
Lin Ma
140
2
0
01 Nov 2025
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding
Yuhang Li
Chenchen Zhang
Ruilin Lv
Ao Liu
K. Deng
Yuanxing Zhang
Jiaheng Liu
Wiggin Zhou
B. Zhou
LRM
103
3
0
13 Oct 2025
InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation
Qiaosheng Chen
Y. Liu
Lei Li
Kai Chen
Q. Guo
Gong Cheng
Fei Yuan
ELM
153
1
0
10 Oct 2025
AIM-Bench: Evaluating Decision-making Biases of Agentic LLM as Inventory Manager
Xuhua Zhao
Yuxuan Xie
Caihua Chen
Yuxiang Sun
65
0
0
15 Aug 2025