318
v1v2 (latest)

Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Main:1 Pages
5 Figures
12 Tables
Appendix:16 Pages
Abstract

Accurately attributing answer text to its source document is crucial for developing a reliable question-answering system. However, attribution for long documents remains largely unexplored. Post-hoc attribution systems are designed to map answer text back to the source document, yet the granularity of this mapping has not been addressed. Furthermore, a critical question arises: What exactly should be attributed? This involves identifying the specific information units within an answer that require grounding. In this paper, we propose and investigate a novel approach to the factual decomposition of generated answers for attribution, employing template-based in-context learning. To accomplish this, we utilize the question and integrate negative sampling during few-shot in-context learning for decomposition. This approach enhances the semantic understanding of both abstractive and extractive answers. We examine the impact of answer decomposition by providing a thorough examination of various attribution approaches, ranging from retrieval-based techniques to LLM-based attributors.

View on arXiv
Comments on this paper