Despite advances in grounded content generation, production applications built on Large Language Models (LLMs) still suffer from hallucinated answers. We present "Grounded in Context", Deepchecks' hallucination detection framework, designed for production-scale long-context data and tailored to diverse use cases, including summarization, data extraction, and RAG. Inspired by RAG architecture, our method integrates retrieval and Natural Language Inference (NLI) models to predict factual consistency between premises and hypotheses using an encoder-based model with only a 512-token context window. Our framework identifies unsupported claims with an F1 score of 0.83 on RAGTruth's response-level classification task, matching methods trained on the dataset and outperforming all comparable frameworks that use similar-sized models.
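The abstract describes a retrieval-plus-NLI pipeline: the long source context is split into chunks, the chunks most relevant to a generated claim are retrieved, and a 512-token encoder NLI model scores whether any retrieved chunk entails the claim. The sketch below illustrates that general pattern only; the model names (all-MiniLM-L6-v2, roberta-large-mnli), chunk size, top-k, and threshold are illustrative assumptions and not the authors' implementation.

```python
# Minimal sketch of a retrieval + NLI consistency check under assumed
# off-the-shelf models; NOT the Grounded in Context implementation.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

retriever = SentenceTransformer("all-MiniLM-L6-v2")                    # assumed embedding model
nli = pipeline("text-classification", model="roberta-large-mnli")      # assumed 512-token NLI model


def chunk(text, size=200):
    """Split the long source context into word-based chunks small enough
    that premise + hypothesis fit the encoder's 512-token window."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def is_supported(claim, context, top_k=3, threshold=0.5):
    """Return True if at least one retrieved context chunk entails the claim."""
    chunks = chunk(context)
    chunk_emb = retriever.encode(chunks, convert_to_tensor=True)
    claim_emb = retriever.encode(claim, convert_to_tensor=True)
    # Retrieve the chunks most similar to the claim (the RAG-style step).
    hits = util.semantic_search(claim_emb, chunk_emb, top_k=min(top_k, len(chunks)))[0]
    for hit in hits:
        premise = chunks[hit["corpus_id"]]
        # NLI step: premise = retrieved chunk, hypothesis = generated claim.
        result = nli({"text": premise, "text_pair": claim})[0]
        if result["label"] == "ENTAILMENT" and result["score"] >= threshold:
            return True
    return False


# Usage: flag a generated sentence as unsupported if no chunk entails it.
context = "The Eiffel Tower was completed in 1889 and stands about 330 metres tall."
print(is_supported("The Eiffel Tower was finished in 1889.", context))  # expected: True
print(is_supported("The Eiffel Tower is located in Berlin.", context))  # expected: False
```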
@article{gerner2025_2504.15771,
  title   = {Grounded in Context: Retrieval-Based Method for Hallucination Detection},
  author  = {Assaf Gerner and Netta Madvil and Nadav Barak and Alex Zaikman and Jonatan Liberman and Liron Hamra and Rotem Brazilay and Shay Tsadok and Yaron Friedman and Neal Harow and Noam Bresler and Shir Chorev and Philip Tannor},
  journal = {arXiv preprint arXiv:2504.15771},
  year    = {2025}
}