
Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation

Jiajun Shen
Tong Zhou
Yubo Chen
Delai Qiu
Shengping Liu
Kang Liu
Jun Zhao
Abstract

While hallucinations of large language models can be alleviated through retrieval-augmented generation and citation generation, how the model utilizes its internal knowledge remains opaque, and the trustworthiness of its generated answers remains questionable. In this work, we introduce the Context-Prior Augmented Citation Generation task, which requires models to generate citations that account for both external and internal knowledge while providing trustworthy references, along with five evaluation metrics covering three aspects: answer helpfulness, citation faithfulness, and trustworthiness. We introduce RAEL, a paradigm for this task, and design INTRALIGN, an integrated method comprising customized data generation and an alignment algorithm. Our experimental results show that our method achieves better cross-scenario performance than other baselines. Our extended experiments further reveal that retrieval quality, question types, and model knowledge have a considerable influence on the trustworthiness of citation generation.

@article{shen2025_2504.14856,
  title={Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation},
  author={Jiajun Shen and Tong Zhou and Yubo Chen and Delai Qiu and Shengping Liu and Kang Liu and Jun Zhao},
  journal={arXiv preprint arXiv:2504.14856},
  year={2025}
}