Turkish Native Language Identification

27 July 2023

Ahmet Uluslu

Gerold Schneider


Main: 2 pages, 4 figures, 3 tables

Appendix: 5 pages

Abstract

In this paper, we present the first application of Native Language Identification (NLI) to the Turkish language. NLI is the task of predicting a writer's first language (L1) by analysing their writing in another language (L2). While most NLI research has focused on English, our study extends its scope to Turkish. Using the recently constructed Turkish Learner Corpus, we combine three syntactic features (CFG production rules, part-of-speech n-grams, and function words) extracted from L2 texts and demonstrate their effectiveness in this task.
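
To make the three feature types concrete, here is a minimal Python sketch of how they could be counted for one sentence and merged into a single feature bag. It is an illustration under stated assumptions, not the authors' pipeline: the function-word list, the tag set, the hand-written parse tree, and all function names (`pos_ngrams`, `function_word_counts`, `cfg_productions`, `extract_features`) are hypothetical, and a real system would obtain tags and trees from a tagger and parser.

```python
from collections import Counter

# Hypothetical, illustrative function-word list; the paper's actual list is not given here.
FUNCTION_WORDS = {"ben", "bir", "bu", "ve"}


def pos_ngrams(tags, n=2):
    """Count contiguous n-grams over a sequence of part-of-speech tags."""
    return Counter(
        "POS:" + "_".join(tags[i:i + n]) for i in range(len(tags) - n + 1)
    )


def function_word_counts(tokens):
    """Count tokens that appear in the function-word list (case-insensitive)."""
    return Counter(
        "FW:" + tok.lower() for tok in tokens if tok.lower() in FUNCTION_WORDS
    )


def cfg_productions(tree):
    """Count non-lexical CFG productions in a tree.

    A tree is a tuple (label, child, ...); a preterminal is (tag, word).
    Lexical productions (tag -> word) are skipped so that only syntactic
    structure is counted.
    """
    counts = Counter()
    label, *children = tree
    if all(isinstance(child, str) for child in children):
        return counts  # preterminal: skip the lexical production
    rhs = " ".join(c if isinstance(c, str) else c[0] for c in children)
    counts["CFG:" + label + " -> " + rhs] += 1
    for child in children:
        if not isinstance(child, str):
            counts.update(cfg_productions(child))
    return counts


def extract_features(tokens, tags, tree, n=2):
    """Merge the three feature types into one bag of counts."""
    features = Counter()
    features.update(pos_ngrams(tags, n))
    features.update(function_word_counts(tokens))
    features.update(cfg_productions(tree))
    return features


# Hand-annotated toy sentence: "Ben bir kitap okudum" ("I read a book").
tokens = ["Ben", "bir", "kitap", "okudum"]
tags = ["PRON", "DET", "NOUN", "VERB"]
tree = (
    "S",
    ("NP", ("PRON", "Ben")),
    ("VP", ("NP", ("DET", "bir"), ("NOUN", "kitap")), ("VERB", "okudum")),
)

features = extract_features(tokens, tags, tree)
for name, count in sorted(features.items()):
    print(name, count)
```

A bag of counts like this could then be vectorised and passed to any standard classifier to predict the writer's L1; which classifier the paper uses is not stated in this abstract.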
