Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation
Kaikai An
Fangkai Yang
Liqun Li
Junting Lu
Sitao Cheng
Lu Wang
Pu Zhao
Lele Cao
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang

Abstract
Current question answering (QA) systems leveraging retrieval-augmented generation (RAG) perform well on factoid questions but struggle with non-factoid questions, particularly how-to queries that require detailed step-by-step instructions and explanations. In this paper, we introduce Thread, a novel data organization paradigm that transforms documents into logic units based on their inter-connectivity. Extensive experiments across open-domain and industrial scenarios demonstrate that Thread outperforms existing data organization paradigms in RAG-based QA systems, significantly improving the handling of how-to questions.