Coreference Resolution for Vietnamese Narrative Texts

Abstract

Coreference resolution is a vital task in natural language processing (NLP) that involves identifying and linking different expressions in a text that refer to the same entity. This task is particularly challenging for Vietnamese, a low-resource language with limited annotated datasets. To address these challenges, we developed a comprehensive annotated dataset using narrative texts from VnExpress, a widely read Vietnamese online news platform. We established detailed guidelines for annotating entities, with a focus on consistency and accuracy. Additionally, we evaluated the performance of large language models (LLMs), specifically GPT-3.5-Turbo and GPT-4, on this dataset. Our results demonstrate that GPT-4 significantly outperforms GPT-3.5-Turbo in terms of both accuracy and response consistency, making it a more reliable tool for coreference resolution in Vietnamese.

@article{tran2025_2504.19606,
  title={Coreference Resolution for Vietnamese Narrative Texts},
  author={Hieu-Dai Tran and Duc-Vu Nguyen and Ngan Luu-Thuy Nguyen},
  journal={arXiv preprint arXiv:2504.19606},
  year={2025}
}