Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG
Chenhao Fang
Derek Larson
Shitong Zhu
Sophie Zeng
Wendy Summer
Yanqing Peng
Yuriy Hulovatyy
Rajeev Rao
Gabriel Forgues
Arya Pudota
Alex Goncalves
Hervé Robert

Abstract
This paper presents new methods that have the potential to improve the efficiency of privacy processes with LLMs and RAG. To reduce hallucination, we continually pre-train the base LLM with a privacy-specific knowledge base and then augment it with a semantic RAG layer. Our evaluations demonstrate that this approach improves model performance on privacy-related queries (in some cases doubling metrics relative to an out-of-the-box LLM) by grounding responses in factual information, which reduces inaccuracies.
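To illustrate the grounding step the abstract describes, below is a minimal sketch of a semantic RAG layer over a privacy knowledge base. The paper does not provide implementation details, so the `embed` and `generate` functions, the example knowledge-base snippets, and the prompt format are all hypothetical stand-ins for the continually pre-trained LLM and its retrieval pipeline.

```python
import numpy as np

# Hypothetical stand-ins for the paper's components: a domain-adapted
# (continually pre-trained) LLM and an embedding model for the semantic layer.
def embed(text: str) -> np.ndarray:
    """Placeholder embedding: hashed bag-of-words vector (illustrative only)."""
    vec = np.zeros(256)
    for token in text.lower().split():
        vec[hash(token) % 256] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def generate(prompt: str) -> str:
    """Placeholder for the continually pre-trained LLM's generation call."""
    return f"[model answer grounded in provided context]\n{prompt[:120]}..."

# Privacy-specific knowledge base (illustrative snippets, not from the paper).
knowledge_base = [
    "Data subject access requests must be answered within 30 days.",
    "Personal data should be retained only as long as necessary for its purpose.",
    "Data minimization requires collecting only the data needed for processing.",
]
kb_vectors = np.stack([embed(doc) for doc in knowledge_base])

def answer_with_rag(query: str, top_k: int = 2) -> str:
    """Retrieve the most semantically similar passages and ground the prompt on them."""
    scores = kb_vectors @ embed(query)          # cosine similarity (vectors are normalized)
    top_idx = np.argsort(scores)[::-1][:top_k]  # indices of the best-matching passages
    context = "\n".join(knowledge_base[i] for i in top_idx)
    prompt = (
        "Answer the privacy question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)

print(answer_with_rag("How long can we keep user data?"))
```

The key design point mirrored here is that the retrieved passages are injected into the prompt so the model answers from supplied facts rather than parametric memory, which is the mechanism the abstract credits for reducing hallucination.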