PRE: A Peer Review Based Large Language Model Evaluator


28 January 2024
Zhumin Chu
Qingyao Ai
Yiteng Tu
Haitao Li
Yiqun Liu
    LRM
    ALM

Papers citing "PRE: A Peer Review Based Large Language Model Evaluator"

17 / 17 papers shown
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
0
0
26 Apr 2025
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang
Munan Ning
Zheyuan Liu
Yanbo Wang
Jiayi Ye
Yue Huang
Shuo Yang
Xiao Chen
Y. Song
Li Yuan
LRM
56
0
0
19 Mar 2025
DAFE: LLM-Based Evaluation Through Dynamic Arbitration for Free-Form Question-Answering
Sher Badshah
Hassan Sajjad
60
1
0
11 Mar 2025
LexRAG: Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation
Haitao Li
Y. Chen
Yiran Hu
Qingyao Ai
Junjie Chen
Xiaoyu Yang
J. Yang
Yueyue Wu
Zeyang Liu
Y. Liu
AILaw
RALM
ELM
59
0
0
28 Feb 2025
PiCO: Peer Review in LLMs based on the Consistency Optimization
Kun-Peng Ning
Shuo Yang
Yu-Yang Liu
Jia-Yu Yao
Zhen-Hui Liu
Yu Wang
Ming Pang
Li Yuan
ALM
63
8
0
24 Feb 2025
LegalAgentBench: Evaluating LLM Agents in Legal Domain
H. Li
Junjie Chen
Jingli Yang
Qingyao Ai
Wei Jia
...
Guozhi Yuan
Yiran Hu
Wuyue Wang
Y. Liu
Minlie Huang
LLMAG
AILaw
ELM
48
11
0
23 Dec 2024
AI Predicts AGI: Leveraging AGI Forecasting and Peer Review to Explore LLMs' Complex Reasoning Capabilities
Fabrizio Davide
Pietro Torre
Andrea Gaggioli
ELM
88
0
0
12 Dec 2024
CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges
Haitao Li
Junjie Chen
Qingyao Ai
Zhumin Chu
Yujia Zhou
Qian Dong
Yiqun Liu
32
8
0
20 Oct 2024
An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation
Junjie Chen
Weihang Su
Zhumin Chu
Haitao Li
Qingyao Ai
Yiqun Liu
Min Zhang
Shaoping Ma
19
3
0
16 Oct 2024
LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models
Haitao Li
You Chen
Qingyao Ai
Yueyue Wu
Ruizhe Zhang
Yiqun Liu
ALM
AILaw
ELM
41
8
0
30 Sep 2024
Language Model Council: Democratically Benchmarking Foundation Models on Highly Subjective Tasks
Justin Zhao
Flor Miriam Plaza del Arco
Amanda Cercas Curry
ELM
ALM
30
1
0
12 Jun 2024
Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions
Ruochen Zhao
Wenxuan Zhang
Yew Ken Chia
Deli Zhao
Lidong Bing
27
9
0
30 May 2024
BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models
Haitao Li
Qingyao Ai
Jia Chen
Qian Dong
Zhijing Wu
Yiqun Liu
Chong Chen
Qi Tian
AILaw
45
13
0
27 Mar 2024
Large Language Models for Data Annotation: A Survey
Zhen Tan
Dawei Li
Song Wang
Alimohammad Beigi
Bohan Jiang
Amrita Bhattacharjee
Mansooreh Karami
Jundong Li
Lu Cheng
Huan Liu
SyDa
42
44
0
21 Feb 2024
TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs
Shuyi Xie
Wenlin Yao
Yong Dai
Shaobo Wang
Donlin Zhou
...
Zhichao Hu
Dong Yu
Zhengyou Zhang
Jing Nie
Yuhong Liu
ELM
ALM
11
4
0
09 Nov 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
240
1,070
0
05 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022