2402.14016
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment
21 February 2024
Vyas Raina
Adian Liusie
Mark J. F. Gales
AAML
ELM
Papers citing
"Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment"
13 / 13 papers shown
Title
LLM-as-a-Judge: Reassessing the Performance of LLMs in Extractive QA
Xanh Ho
Jiahao Huang
Florian Boudin
Akiko Aizawa
ELM
29
0
0
16 Apr 2025
Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework
Kaishuai Xu
Tiezheng YU
Wenjun Hou
Yi Cheng
Liangyou Li
Xin Jiang
Lifeng Shang
Q. Liu
Wenjie Li
ELM
66
0
0
26 Feb 2025
Investigating Non-Transitivity in LLM-as-a-Judge
Yi Xu
Laura Ruis
Tim Rocktäschel
Robert Kirk
38
0
0
19 Feb 2025
PRISMe: A Novel LLM-Powered Tool for Interactive Privacy Policy Assessment
Vincent Freiberger
Arthur Fleig
Erik Buchmann
40
0
0
28 Jan 2025
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
108
61
0
25 Nov 2024
Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the effect of Epistemic Markers on LLM-based Evaluation
Dongryeol Lee
Yerin Hwang
Yongil Kim
Joonsuk Park
Kyomin Jung
ELM
68
4
0
28 Oct 2024
From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks
Andreas Stephan
D. Zhu
Matthias Aßenmacher
Xiaoyu Shen
Benjamin Roth
ELM
45
4
0
06 Sep 2024
CERT-ED: Certifiably Robust Text Classification for Edit Distance
Zhuoqun Huang
Yipeng Wang
Seunghee Shin
Benjamin I. P. Rubinstein
AAML
20
1
0
01 Aug 2024
MCRanker: Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers
Fang Guo
Wenyu Li
Honglei Zhuang
Yun Luo
Yafu Li
Qi Zhu
Le Yan
Yue Zhang
ALM
63
6
0
18 Apr 2024
Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng
Tianhao Hu
Han Xu
Zhisong Zhang
Yong Dai
Lei Han
Nan Du
Xiaolong Li
SyDa
LRM
ReLM
87
28
0
16 Apr 2024
LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery
Samuel R. Bowman
Shi Feng
36
152
0
15 Apr 2024
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Lianghui Zhu
Xinggang Wang
Xinlong Wang
ELM
ALM
54
103
0
26 Oct 2023
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Úlfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,798
0
14 Dec 2020