Mining Word Boundaries from Speech-Text Parallel Data for Cross-domain Chinese Word Segmentation. International Conference on Computational Linguistics (COLING), 2024.
Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval. Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022.
PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation. International Conference on Computational Linguistics (COLING), 2022.