Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation

27 August 2024

N. E. Kriman



Main: 10 pages, Bibliography: 2 pages, 4 figures, 3 tables

Abstract

The use of large language models (LLMs) has grown significantly since the introduction of ChatGPT in 2022, and they have proved valuable across a range of applications. However, a major obstacle to enterprise and commercial adoption of LLMs is their tendency to generate inaccurate information, a phenomenon known as "hallucination." This project proposes a method for estimating the factuality of an LLM-generated summary relative to its source text. The approach uses Naive Bayes classification to assess the accuracy of the generated content.
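As a rough illustration of how Naive Bayes classification could be applied to summary factuality, the sketch below fits a small Gaussian Naive Bayes model in plain Python. It is not the paper's implementation. The abstract does not specify the features, so the two used here (the mean and the minimum entailment score over a summary's atomic facts) are assumptions suggested only by the title, and the training values and the names `fit_gaussian_nb` and `predict` are hypothetical.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Estimate a log prior and per-feature (mean, variance) for each class."""
    model = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        n = len(members)
        stats = []
        for column in zip(*members):
            mean = sum(column) / n
            # Small constant added to the variance to avoid division by zero.
            var = sum((v - mean) ** 2 for v in column) / n + 1e-3
            stats.append((mean, var))
        model[label] = (math.log(n / len(rows)), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature row."""
    def log_posterior(label):
        log_prior, stats = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            for x, (mean, var) in zip(row, stats)
        )
    return max(model, key=log_posterior)


# Hypothetical features per summary (assumed, not from the paper):
# (mean entailment score, minimum entailment score) over its atomic facts.
train_rows = [
    (0.95, 0.90), (0.90, 0.80), (0.85, 0.75),   # labeled factual
    (0.50, 0.10), (0.40, 0.05), (0.60, 0.20),   # labeled hallucinated
]
train_labels = ["factual"] * 3 + ["hallucinated"] * 3

model = fit_gaussian_nb(train_rows, train_labels)
prediction_high = predict(model, (0.92, 0.85))
prediction_low = predict(model, (0.45, 0.10))
print(prediction_high, prediction_low)
```

In this toy setup, a summary whose atomic facts are all strongly entailed by the source is classified as factual, while one containing a poorly supported fact is classified as hallucinated. A real system would need an entailment model to produce the scores and labeled data to fit the classifier.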
