SlovakBERT: Slovak Masked Language Model
Matúš Pikuliak
Štefan Grivalský
Martin Konopka
Miroslav Blšták
Martin Tamajka
Viktor Bachratý
Marián Šimko
Pavol Balázik
Michal Trnka
Filip Uhlárik

Abstract
We introduce a new Slovak masked language model called SlovakBERT. To the best of our knowledge, this is the first paper to discuss Slovak transformer-based language models. We evaluate our model on several NLP tasks and achieve state-of-the-art results. This evaluation is likewise the first attempt to establish a benchmark for Slovak language models. We publish the masked language model, as well as fine-tuned models for part-of-speech tagging, sentiment analysis, and semantic textual similarity.