2506.09443
Cited By
v1
v2 (latest)
LLMs Cannot Reliably Judge (Yet?): A Comprehensive Assessment on the Robustness of LLM-as-a-Judge
11 June 2025
Songze Li
Chuokun Xu
Jiaying Wang
Xueluan Gong
Chen Chen
J. Zhang
Jun Wang
K. Lam
R. Beyah
AAML
ELM
Papers citing
"LLMs Cannot Reliably Judge (Yet?): A Comprehensive Assessment on the Robustness of LLM-as-a-Judge"
4 / 4 papers shown
KEO: Knowledge Extraction on OMIn via Knowledge Graphs and RAG for Safety-Critical Aviation Maintenance
Kuangshi Ai
Jonathan A. Karr Jr.
Meng Jiang
Nitesh Chawla
Chaoli Wang
214
1
0
10 Apr 2026
Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions
Emi Soroka
Tanmay Chopra
Krish Desai
Sanjay Lall
ALM
392
0
0
04 Nov 2025
Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense
Zhehao Zhang
Weijie Xu
Shixian Cui
Chandan K. Reddy
AAML
LRM
169
0
0
17 Oct 2025
Knowledge-Graph Based RAG System Evaluation Framework
Sicheng Dong
Vahid Zolfaghari
Nenad Petrovic
Alois C. Knoll
182
0
0
02 Oct 2025