Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best?

27 March 2025

Papers citing "Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best?"

HEDGE: Hallucination Estimation via Dense Geometric Entropy for VQA with Vision-Language Models

201

16 Nov 2025

WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection

109

21 Oct 2025

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Graham Neubig

375

321

02 May 2024

Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Jiuhai Chen

Jonas W. Mueller

360

113

30 Aug 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

Neural Information Processing Systems (NeurIPS), 2023

...

3.2K

6,617

09 Jun 2023

Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study

International Joint Conference on Natural Language Processing (IJCNLP), 2023

Ruifeng Xu

417

115

03 Apr 2023

ELI5: Long Form Question Answering

Annual Meeting of the Association for Computational Linguistics (ACL), 2019

Angela Fan

Jason Weston

436

735

22 Jul 2019