Investigating the effect of sub-word segmentation on the performance of transformer language models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Abstract

We explore how morpheme-aware sub-word segmentation affects the performance of transformer language models. We trained GPT-2 and BERT models for Finnish and Russian using StateMorph, a morpheme segmentation algorithm, and, for comparison, models using BPE and Morfessor segmentation. Our preliminary results show that StateMorph helps the models converge more efficiently and achieve better validation scores.
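To illustrate the kind of difference the abstract is comparing, the toy sketch below contrasts statistical BPE segments with linguistically motivated morpheme segments. The mini BPE trainer, the toy corpus, and the gold morpheme analysis of the Finnish word "taloissa" (talo "house" + i plural + ssa inessive) are illustrative assumptions, not the paper's actual pipeline or the StateMorph implementation.

```python
from collections import Counter

def train_bpe(words, num_merges):
    """Tiny BPE trainer: repeatedly merge the most frequent adjacent
    symbol pair in the corpus (toy version, for illustration only)."""
    vocab = Counter(tuple(w) for w in words)  # each word as a tuple of chars
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

# Hypothetical toy corpus of inflected Finnish forms of talo "house".
corpus = ["talo", "talon", "talossa", "taloissa", "taloista"]
merges, vocab = train_bpe(corpus, num_merges=5)

# BPE segmentation of "taloissa" is whatever the merges produced ...
bpe_segments = next(list(w) for w in vocab if "".join(w) == "taloissa")
# ... while a morpheme segmenter aims for the linguistic analysis:
morpheme_segments = ["talo", "i", "ssa"]  # stem + plural + inessive case
```

BPE's segments reflect corpus pair frequencies, so they need not align with morpheme boundaries; the paper's hypothesis is that morphologically aligned segments (as produced by StateMorph) give the language model a more useful sub-word inventory for morphologically rich languages like Finnish and Russian.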
