Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2405.20267
Cited By
Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions
30 May 2024
Ruochen Zhao
Wenxuan Zhang
Yew Ken Chia
Deli Zhao
Lidong Bing
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions"
5 / 5 papers shown
Title
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
0
0
26 Apr 2025
WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models
Huawen Feng
Pu Zhao
Qingfeng Sun
Can Xu
Fangkai Yang
...
Qianli Ma
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
AAML
ALM
62
0
0
23 Dec 2024
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
108
61
0
25 Nov 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
298
0
05 Jan 2024
Large Language Models are Diverse Role-Players for Summarization Evaluation
Ning Wu
Ming Gong
Linjun Shou
Shining Liang
Daxin Jiang
57
44
0
27 Mar 2023
1