ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2308.01862
  4. Cited By
Wider and Deeper LLM Networks are Fairer LLM Evaluators

Wider and Deeper LLM Networks are Fairer LLM Evaluators

3 August 2023
Xinghua Zhang
Yu Bowen
Haiyang Yu
Yangyu Lv
Tingwen Liu
Fei Huang
Hongbo Xu
Yongbin Li
    ALM
ArXivPDFHTML

Papers citing "Wider and Deeper LLM Networks are Fairer LLM Evaluators"

17 / 17 papers shown
Title
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
1
0
26 Apr 2025
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators
Yinhong Liu
Han Zhou
Zhijiang Guo
Ehsan Shareghi
Ivan Vulić
Anna Korhonen
Nigel Collier
ALM
128
68
0
20 Jan 2025
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
113
65
0
25 Nov 2024
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni
Jonathan Colaço-Carr
Yash More
Jackie CK Cheung
G. Farnadi
73
0
0
12 Nov 2024
Do LLMs estimate uncertainty well in instruction-following?
Do LLMs estimate uncertainty well in instruction-following?
Juyeon Heo
Miao Xiong
Christina Heinze-Deml
Jaya Narain
ELM
48
3
0
18 Oct 2024
JudgeBench: A Benchmark for Evaluating LLM-based Judges
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Sijun Tan
Siyuan Zhuang
Kyle Montgomery
William Y. Tang
Alejandro Cuadron
Chenguang Wang
Raluca A. Popa
Ion Stoica
ELM
ALM
51
37
0
16 Oct 2024
DHP Benchmark: Are LLMs Good NLG Evaluators?
DHP Benchmark: Are LLMs Good NLG Evaluators?
Yicheng Wang
Jiayi Yuan
Yu-Neng Chuang
Zhuoer Wang
Yingchi Liu
Mark Cusick
Param Kulkarni
Zhengping Ji
Yasser Ibrahim
Xia Hu
LM&MA
ELM
43
3
0
25 Aug 2024
OffsetBias: Leveraging Debiased Data for Tuning Evaluators
OffsetBias: Leveraging Debiased Data for Tuning Evaluators
Junsoo Park
Seungyeon Jwa
Meiying Ren
Daeyoung Kim
Sanghyuk Choi
ALM
34
30
0
09 Jul 2024
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended
  Multi-Doc QA
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
Minzheng Wang
Longze Chen
Cheng Fu
Shengyi Liao
Xinghua Zhang
...
Run Luo
Yunshui Li
Min Yang
Fei Huang
Yongbin Li
RALM
42
44
0
25 Jun 2024
Benchmark Data Contamination of Large Language Models: A Survey
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM
ALM
38
38
0
06 Jun 2024
Adaptive Layer Splitting for Wireless LLM Inference in Edge Computing: A
  Model-Based Reinforcement Learning Approach
Adaptive Layer Splitting for Wireless LLM Inference in Edge Computing: A Model-Based Reinforcement Learning Approach
Yuxuan Chen
Rongpeng Li
Xiaoxue Yu
Zhifeng Zhao
Honggang Zhang
34
9
0
03 Jun 2024
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
Jiawen Shi
Zenghui Yuan
Yinuo Liu
Yue Huang
Pan Zhou
Lichao Sun
Neil Zhenqiang Gong
AAML
43
39
0
26 Mar 2024
LLM-based NLG Evaluation: Current Status and Challenges
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
53
29
0
02 Feb 2024
Universal Self-Consistency for Large Language Model Generation
Universal Self-Consistency for Large Language Model Generation
Xinyun Chen
Renat Aksitov
Uri Alon
Jie Jessie Ren
Kefan Xiao
Pengcheng Yin
Sushant Prakash
Charles Sutton
Xuezhi Wang
Denny Zhou
LRM
24
65
0
29 Nov 2023
Generative Judge for Evaluating Alignment
Generative Judge for Evaluating Alignment
Junlong Li
Shichao Sun
Weizhe Yuan
Run-Ze Fan
Hai Zhao
Pengfei Liu
ELM
ALM
28
76
0
09 Oct 2023
Multi-modal Transformer for Video Retrieval
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
410
595
0
21 Jul 2020
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
268
10,214
0
16 Nov 2016
1