Logits-Constrained Framework with RoBERTa for Ancient Chinese NER

Abstract
This paper presents a Logits-Constrained (LC) framework for Ancient Chinese Named Entity Recognition (NER), evaluated on the EvaHan 2025 benchmark. Our two-stage model combines GujiRoBERTa for contextual encoding with a differentiable decoding mechanism that enforces valid BMES label transitions. Experiments show that LC outperforms traditional CRF and BiLSTM-based approaches, especially when the label set is large or the training data is abundant. We also propose a model selection criterion that balances label complexity against dataset size, providing practical guidance for real-world Ancient Chinese NLP tasks.
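To make the idea of enforcing valid BMES transitions concrete, the following is a minimal sketch and not the authors' method. It swaps in plain hard-masked Viterbi decoding over per-character logits, whereas the abstract describes a differentiable mechanism whose details are not given here. The single-type tag set (`B`, `M`, `E`, `S`, `O`), the transition table, and the function name `constrained_decode` are all assumptions made for illustration; the actual benchmark uses typed entity labels.

```python
NEG_INF = float("-inf")

# Assumed single-type tag set: Begin, Middle, End, Single, Outside.
TAGS = ["B", "M", "E", "S", "O"]

# Assumed transition rules: an entity opened by B must pass through
# M (zero or more times) and close with E before anything else starts.
ALLOWED = {
    "B": {"M", "E"},
    "M": {"M", "E"},
    "E": {"B", "S", "O"},
    "S": {"B", "S", "O"},
    "O": {"B", "S", "O"},
}
START_OK = {"B", "S", "O"}  # a sequence cannot open inside an entity
END_OK = {"E", "S", "O"}    # a sequence cannot end with an open entity


def constrained_decode(logits):
    """Return the highest-scoring tag path that obeys ALLOWED.

    `logits` is a list of rows, one per character, each row holding one
    score per tag in TAGS order. Invalid paths score negative infinity.
    """
    if not logits:
        return []
    k = len(TAGS)
    score = [logits[0][j] if TAGS[j] in START_OK else NEG_INF for j in range(k)]
    back = []  # back[t-1][j] = best previous tag index for tag j at step t
    for t in range(1, len(logits)):
        new_score, ptr = [], []
        for j in range(k):
            best_i, best = 0, NEG_INF
            for i in range(k):
                if TAGS[j] in ALLOWED[TAGS[i]] and score[i] > best:
                    best_i, best = i, score[i]
            new_score.append(best + logits[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    final = [score[j] if TAGS[j] in END_OK else NEG_INF for j in range(k)]
    j = max(range(k), key=lambda idx: final[idx])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [TAGS[j] for j in path]


# Greedy argmax on these scores would give B, O, E, which is invalid.
example = [
    [2.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0, 1.0],
    [0.0, 0.0, 2.0, 0.0, 0.0],
]
print(constrained_decode(example))
```

In the example, the middle character slightly prefers `O`, but the constraint forces the decoder to pick `M` so that the `B ... E` span stays well-formed. A hard mask like this only acts at inference time; a differentiable variant, as the abstract describes, would let the constraint influence training as well.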
@article{hua2025_2505.02983,
  title={Logits-Constrained Framework with RoBERTa for Ancient Chinese NER},
  author={Wenjie Hua and Shenghan Xu},
  journal={arXiv preprint arXiv:2505.02983},
  year={2025}
}