One Token to Fool LLM-as-a-Judge

11 July 2025
Yulai Zhao, Haolin Liu, Dian Yu, Sunyuan Kung, Meijia Chen, Haitao Mi, Dong Yu
OffRL, LRM
ArXiv (abs) · PDF · HTML · HuggingFace (29 upvotes)

Papers citing "One Token to Fool LLM-as-a-Judge"

10 / 10 papers shown
Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
Qiyuan Liu, Hao Xu, Xuhong Chen, Wei Chen, Yee Whye Teh, Ning Miao
ReLM, LRM, AI4CE
02 Oct 2025

Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers
Xin-Qiang Cai, Wei Wang, Feng Liu, Tongliang Liu, Gang Niu, Masashi Sugiyama
OffRL, AAML
01 Oct 2025

Who's Your Judge? On the Detectability of LLM-Generated Judgments
Dawei Li, Zhen Tan, Chengshuai Zhao, Bohan Jiang, Baixiang Huang, Pingchuan Ma, Abdullah Alnaibari, Kai Shu, Huan Liu
29 Sep 2025

SCI-Verifier: Scientific Verifier with Thinking
Shenghe Zheng, Chenyu Huang, F. Yu, Junchi Yao, Jingqi Ye, ..., Yun Luo, Ning Ding, Lei Bai, Ganqu Cui, Peng Ye
LRM
29 Sep 2025

Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards
Aaron Tu, Weihao Xuan, Heli Qi, X. Y. Huang, Qingcheng Zeng, ..., Amin Saberi, Naoto Yokoya, Jure Leskovec, Yejin Choi, Fang Wu
OffRL
26 Sep 2025

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
Yujun Zhou, Zhenwen Liang, Haolin Liu, Wenhao Yu, Kishan Panaganti, Linfeng Song, Dian Yu, Xiangliang Zhang, Haitao Mi, Dong Yu
18 Sep 2025

CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models
Runpeng Dai, Linfeng Song, Haolin Liu, Zhenwen Liang, Dian Yu, ..., Zhaopeng Tu, R. Liu, Tong Zheng, Hongtu Zhu, Dong Yu
LRM
11 Sep 2025

Better Language Model-Based Judging Reward Modeling through Scaling Comprehension Boundaries
Meiling Ning, Zhongbao Zhang, Junda Ye, Jiabao Guo, Qingyuan Guan
LRM
25 Aug 2025

R-Zero: Self-Evolving Reasoning LLM from Zero Data
Chengsong Huang, Wenhao Yu, Xiaoyang Wang, H. Zhang, Zongxia Li, Ruosen Li, J. Huang, Haitao Mi, Dong Yu
ReLM, SyDa, LRM
07 Aug 2025

PersonaEval: Are LLM Evaluators Human Enough to Judge Role-Play?
Lingfeng Zhou, Jialing Zhang, Jin Gao, Mohan Jiang, Dequan Wang
ELM
06 Aug 2025