2311.08147
RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
14 November 2023
Yi Liu
Lianzhe Huang
Shicheng Li
Sishuo Chen
Hao Zhou
Fandong Meng
Jie Zhou
Xu Sun
RALM
Papers citing "RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge"
6 / 6 papers shown
The Viability of Crowdsourcing for RAG Evaluation
Lukas Gienapp
Tim Hagen
Maik Fröbe
Matthias Hagen
Benno Stein
Martin Potthast
Harrisen Scells
14
0
0
22 Apr 2025
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Mingyue Cheng
Yucong Luo
Jie Ouyang
Q. Liu
Huijie Liu
...
Bohou Zhang
Jiawei Cao
Jie Ma
Daoyu Wang
Enhong Chen
3DV
48
3
0
11 Mar 2025
ASTRID -- An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
Mohita Chowdhury
Yajie Vera He
Aisling Higham
Ernest Lim
53
1
0
14 Jan 2025
The Internal State of an LLM Knows When It's Lying
A. Azaria
Tom Michael Mitchell
HILM
208
297
0
26 Apr 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
209
140
0
18 Apr 2021