Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation
N. E. Kriman
- HILM
Main: 10 Pages
4 Figures
Bibliography: 2 Pages
3 Tables
Abstract
The use of large language models (LLMs) has increased significantly since the introduction of ChatGPT in 2022, demonstrating their value across a range of applications. However, a major obstacle to enterprise and commercial adoption of LLMs is their tendency to generate inaccurate information, a phenomenon known as "hallucination." This project proposes a method for estimating the factuality of an LLM-generated summary relative to its source text. Our approach uses Naive Bayes classification to assess the accuracy of the generated content.
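The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of the general idea suggested by the title: break a summary into atomic facts, check each one against the source, and report the supported fraction. Everything in it is an assumption made for illustration: the function names are hypothetical, sentence splitting stands in for atomic fact extraction, and a token-overlap threshold stands in for a real entailment model. The Naive Bayes classifier mentioned in the abstract is not reproduced here.

```python
import re


def split_atomic_facts(summary: str) -> list[str]:
    """Hypothetical stand-in: treat each sentence as one atomic fact."""
    parts = re.split(r"(?<=[.!?])\s+", summary.strip())
    return [p for p in parts if p]


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_supported(fact: str, source: str, threshold: float = 0.8) -> bool:
    """Toy entailment proxy (an assumption, not the paper's metric):
    the share of the fact's tokens that also occur in the source."""
    fact_tokens = tokens(fact)
    if not fact_tokens:
        return False
    overlap = len(fact_tokens & tokens(source)) / len(fact_tokens)
    return overlap >= threshold


def factuality_score(summary: str, source: str) -> float:
    """Fraction of atomic facts in the summary judged supported by the source."""
    facts = split_atomic_facts(summary)
    if not facts:
        return 0.0
    supported = sum(is_supported(f, source) for f in facts)
    return supported / len(facts)


source = "The cat sat on the mat. It was raining outside."
summary = "The cat sat on the mat. The dog barked loudly."
print(factuality_score(summary, source))  # one of two facts is supported: 0.5
```

In a realistic setting the overlap check would be replaced by a natural language inference model, and the per-fact judgments could serve as inputs to a downstream classifier.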
