Punctuation Prediction for Polish Texts using Transformers
Conference on Computer Science and Information Systems (FedCSIS), 2023
Jakub Pokrywka
Abstract
Speech recognition systems typically output text without punctuation, yet punctuation is crucial for comprehending written text. Punctuation prediction models are developed to address this problem. This paper describes a solution for PolEval 2022 Task 1: Punctuation Prediction for Polish Texts, which achieves a Weighted F1 score of 71.44. The method uses a single HerBERT model fine-tuned on the competition data and an external dataset.
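Punctuation prediction is commonly framed as per-word labeling: each word receives the punctuation mark that follows it, or an empty label. The abstract does not spell out the framing used in this paper, so the following is only a minimal sketch under that assumption. The label set `PUNCT` and the helper names `to_labels` and `restore` are hypothetical and are not taken from the paper or the competition.

```python
# Hypothetical label inventory; the actual PolEval label set may differ.
PUNCT = {",", ".", "?", "!", ":", ";", "-"}


def to_labels(text):
    """Split punctuated text into words and the mark that follows each word.

    A word with no trailing mark gets the empty label "".
    """
    words, labels = [], []
    for token in text.split():
        mark = ""
        # Strip a single trailing punctuation mark, if present.
        if len(token) > 1 and token[-1] in PUNCT:
            token, mark = token[:-1], token[-1]
        words.append(token)
        labels.append(mark)
    return words, labels


def restore(words, labels):
    """Re-attach predicted marks to unpunctuated words."""
    return " ".join(word + mark for word, mark in zip(words, labels))


words, labels = to_labels("Tak, to prawda.")
restored = restore(words, labels)
```

Under this framing, a model such as HerBERT would be trained to predict `labels` from `words`, and `restore` would turn its predictions back into readable text.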
