24
0

MHTS: Multi-Hop Tree Structure Framework for Generating Difficulty-Controllable QA Datasets for RAG Evaluation

Abstract

Existing RAG benchmarks often overlook query difficulty, leading to inflated performance on simpler questions and unreliable evaluations. A robust benchmark dataset must satisfy three key criteria: quality, diversity, and difficulty, which capturing the complexity of reasoning based on hops and the distribution of supporting evidence. In this paper, we propose MHTS (Multi-Hop Tree Structure), a novel dataset synthesis framework that systematically controls multi-hop reasoning complexity by leveraging a multi-hop tree structure to generate logically connected, multi-chunk queries. Our fine-grained difficulty estimation formula exhibits a strong correlation with the overall performance metrics of a RAG system, validating its effectiveness in assessing both retrieval and answer generation capabilities. By ensuring high-quality, diverse, and difficulty-controlled queries, our approach enhances RAG evaluation and benchmarking capabilities.

View on arXiv
@article{lee2025_2504.08756,
  title={ MHTS: Multi-Hop Tree Structure Framework for Generating Difficulty-Controllable QA Datasets for RAG Evaluation },
  author={ Jeongsoo Lee and Daeyong Kwon and Kyohoon Jin and Junnyeong Jeong and Minwoo Sim and Minwoo Kim },
  journal={arXiv preprint arXiv:2504.08756},
  year={ 2025 }
}
Comments on this paper