Multilingual is not enough: BERT for Finnish

15 December 2019

Papers citing "Multilingual is not enough: BERT for Finnish"

Finnish SQuAD: A Simple Approach to Machine Translation of Span Annotations Emil Nuutinen Iiro Rastas Filip Ginter 40 1 0 10 Jan 2025
Transformer-based Entity Legal Form Classification Alexander Arimond Mauro Molteni Dominik Jany Zornitsa Manolova Damian Borth Andreas G. F. Hoepner MedIm AILaw 17 1 0 19 Oct 2023
Testing the Predictions of Surprisal Theory in 11 Languages Ethan Gotlieb Wilcox Tiago Pimentel Clara Meister Ryan Cotterell R. Levy LRM 44 63 0 07 Jul 2023
Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation Andrei-Marius Avram V. Mititelu V. Pais Dumitru-Clementin Cercel Stefan Trausan-Matu 38 3 0 17 Jun 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models Orevaoghene Ahia Sachin Kumar Hila Gonen Jungo Kasai David R. Mortensen Noah A. Smith Yulia Tsvetkov 40 80 0 23 May 2023
Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov Emanuele La Malfa Philip H. S. Torr Adel Bibi 16 96 0 17 May 2023
Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages Verena Blaschke Hinrich Schütze Barbara Plank 34 14 0 20 Apr 2023
BERTino: an Italian DistilBERT model Matteo Muffo E. Bertino VLM 14 14 0 31 Mar 2023
Cross-lingual German Biomedical Information Extraction: from Zero-shot to Human-in-the-Loop Siting Liang Mareike Hartmann Daniel Sonntag 13 3 0 24 Jan 2023
MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code Completion Zi Gong Yinpeng Guo Pingyi Zhou Cuiyun Gao Yasheng Wang Zenglin Xu 12 8 0 19 Dec 2022
Applying Multilingual Models to Question Answering (QA) Ayrton San Joaquin Filip Skubacz 18 1 0 04 Dec 2022
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge Kosuke Nishida Naoki Yoshinaga Kyosuke Nishida 22 2 0 14 Oct 2022
Sort by Structure: Language Model Ranking as Dependency Probing Max Müller-Eberstein Rob van der Goot Barbara Plank 33 3 0 10 Jun 2022
State-of-the-art in Open-domain Conversational AI: A Survey Tosin P. Adewumi F. Liwicki Marcus Liwicki 24 15 0 02 May 2022
You Are What You Write: Preserving Privacy in the Era of Large Language Models Richard Plant V. Giuffrida Dimitra Gkatzia PILM 20 19 0 20 Apr 2022
BERTuit: Understanding Spanish language in Twitter through a native transformer Javier Huertas-Tato Alejandro Martín David Camacho 18 9 0 07 Apr 2022
Punctuation restoration in Swedish through fine-tuned KB-BERT J. Nilsson 11 0 0 14 Feb 2022
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey Bonan Min Hayley L Ross Elior Sulem Amir Pouran Ben Veyseh Thien Huu Nguyen Oscar Sainz Eneko Agirre Ilana Heinz Dan Roth LM&MA VLM AI4CE 71 1,029 0 01 Nov 2021
Cross-lingual Transfer of Monolingual Models Evangelia Gogoulou Ariel Ekgren T. Isbister Magnus Sahlgren 27 16 0 15 Sep 2021
Evaluating Transferability of BERT Models on Uralic Languages Judit Ács Dániel Lévai András Kornai 19 6 0 13 Sep 2021
MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer Ilias Chalkidis Manos Fergadiotis Ion Androutsopoulos AILaw 16 106 0 02 Sep 2021
Are the Multilingual Models Better? Improving Czech Sentiment with Transformers Pavel Přibáň J. Steinberger 30 11 0 24 Aug 2021
PyEuroVoc: A Tool for Multilingual Legal Document Classification with EuroVoc Descriptors Andrei-Marius Avram V. Pais D. Tufis AILaw VLM 19 17 0 02 Aug 2021
Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition Abbas Ghaddar Philippe Langlais Ahmad Rashid Mehdi Rezagholizadeh 34 42 0 24 Jul 2021
Evaluation of contextual embeddings on less-resourced languages Matej Ulčar Aleš Žagar C. S. Armendariz Andraž Repar Senja Pollak Matthew Purver Marko Robnik-Šikonja 22 11 0 22 Jul 2021
Are Multilingual Models the Best Choice for Moderately Under-resourced Languages? A Comprehensive Assessment for Catalan Jordi Armengol-Estapé C. Carrino Carlos Rodríguez-Penagos Ona de Gibert Bonet Carme Armentano-Oller Aitor Gonzalez-Agirre Maite Melero Marta Villegas 60 42 0 16 Jul 2021
A Primer on Pretrained Multilingual Language Models Sumanth Doddapaneni Gowtham Ramesh Mitesh M. Khapra Anoop Kunchukuttan Pratyush Kumar LRM 43 73 0 01 Jul 2021
RobeCzech: Czech RoBERTa, a monolingual contextualized language representation model Milan Straka Jakub Náplava Jana Straková David Samuel 23 47 0 24 May 2021
Quantitative Evaluation of Alternative Translations in a Corpus of Highly Dissimilar Finnish Paraphrases Li-Hsin Chang S. Pyysalo Jenna Kanerva Filip Ginter 18 2 0 06 May 2021
HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish Robert Mroczkowski Piotr Rybak Alina Wróblewska Ireneusz Gawlik 28 81 0 04 May 2021
Deep learning for sentence clustering in essay grading support Li-Hsin Chang Iiro Rastas S. Pyysalo Filip Ginter 24 8 0 23 Apr 2021
Bertinho: Galician BERT Representations David Vilares Marcos Garcia Carlos Gómez-Rodríguez 57 22 0 25 Mar 2021
Czert -- Czech BERT-like Model for Language Representation Jakub Sido O. Pražák P. Přibáň Jan Pasek Michal Seják Miloslav Konopík 16 43 0 24 Mar 2021
Pre-Training BERT on Arabic Tweets: Practical Considerations Ahmed Abdelali Sabit Hassan Hamdy Mubarak Kareem Darwish Younes Samih 20 96 0 21 Feb 2021
EstBERT: A Pretrained Language-Specific BERT for Estonian Hasan Tanvir Claudia Kittask Sandra Eiche Kairit Sirts 12 36 0 09 Nov 2020
German's Next Language Model Branden Chan Stefan Schweter Timo Möller 22 263 0 21 Oct 2020
The birth of Romanian BERT Stefan Daniel Dumitrescu Andrei-Marius Avram S. Pyysalo VLM 8 76 0 18 Sep 2020
KR-BERT: A Small-Scale Korean-Specific Language Model Sangah Lee Hansol Jang Yunmee Baik Suzi Park Hyopil Shin 22 51 0 10 Aug 2020
FinEst BERT and CroSloEngual BERT: less is more in multilingual models Matej Ulčar Marko Robnik-Šikonja 11 48 0 14 Jun 2020
Pre-trained Models for Natural Language Processing: A Survey Xipeng Qiu Tianxiang Sun Yige Xu Yunfan Shao Ning Dai Xuanjing Huang LM&MA VLM 243 1,450 0 18 Mar 2020
BERTje: A Dutch BERT Model Wietse de Vries Andreas van Cranenburgh Arianna Bisazza Tommaso Caselli Gertjan van Noord Malvina Nissim VLM SSeg 11 291 0 19 Dec 2019
FlauBERT: Unsupervised Language Model Pre-training for French Hang Le Loïc Vial Jibril Frej Vincent Segonne Maximin Coavoux Benjamin Lecouteux A. Allauzen Benoît Crabbé Laurent Besacier D. Schwab AI4CE 35 395 0 11 Dec 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties Alexis Conneau Germán Kruszewski Guillaume Lample Loïc Barrault Marco Baroni 199 882 0 03 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 297 6,956 0 20 Apr 2018