arXiv: 2503.09347
Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts
Versions: v1, v2, v3 (latest)


Annual Meeting of the Association for Computational Linguistics (ACL), 2025
12 March 2025
Hongyu Chen
Seraphina Goldfarb-Tarrant

Papers citing "Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts"

6 / 6 papers shown
Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler
Zixuan Hu
Li Shen
Zhenyi Wang
Yongxian Wei
Dacheng Tao
AAML
107
0
0
31 Oct 2025
A Good Plan is Hard to Find: Aligning Models with Preferences is Misaligned with What Helps Users
Nishant Balepur
Matthew Shu
Yoo Yeon Sung
Seraphina Goldfarb-Tarrant
Shi Feng
Fumeng Yang
Rachel Rudinger
Jordan L. Boyd-Graber
150
0
0
23 Sep 2025
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
Khaoula Chehbouni
Mohammed Haddou
Jackie CK Cheung
G. Farnadi
LLMAG
273
5
0
25 Aug 2025
PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations
Ruosen Li
Teerth Patel
Xinya Du
LLMAG, ALM
467
124
0
03 Jan 2025
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM, AILaw
974
240
0
25 Nov 2024
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
Aman Singh Thakur
Kartik Choudhary
Venkat Srinik Ramayapally
Sankaran Vaidyanathan
Dieuwke Hupkes
ELM, ALM
592
127
0
18 Jun 2024