arXiv:2311.15930
WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models
27 November 2023
Youssef Benchekroun, Megi Dervishi, Mark Ibrahim, Jean-Baptiste Gaya, Xavier Martinet, Grégoire Mialon, Thomas Scialom, Emmanuel Dupoux, Dieuwke Hupkes, Pascal Vincent
LRM
Papers citing "WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models"
3 / 3 papers shown
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
Aman Singh Thakur, Kartik Choudhary, Venkat Srinik Ramayapally, Sankaran Vaidyanathan, Dieuwke Hupkes
ELM
ALM
45
55
0
18 Jun 2024
Don't Make Your LLM an Evaluation Benchmark Cheater
Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, Jiawei Han
ELM
105
136
0
03 Nov 2023
Unbiased Math Word Problems Benchmark for Mitigating Solving Bias
Zhicheng YANG, Jinghui Qin, Jiaqi Chen, Xiaodan Liang
59
12
0
17 May 2022