FlauBERT: Unsupervised Language Model Pre-training for French

11 December 2019

Papers citing "FlauBERT: Unsupervised Language Model Pre-training for French"

50 / 159 papers shown

Fillers in Spoken Language Understanding: Computational and Psycholinguistic Perspectives Tanvi Dinkar Chloé Clavel I. Vasilescu 16 10 0 25 Jan 2023
Adversarial Adaptation for French Named Entity Recognition Arjun Choudhry Inder Khatri Pankaj Gupta Aaryan Gupta Maxime Nicol Marie-Jean Meurs Dinesh Kumar Vishwakarma 11 0 0 12 Jan 2023
ORCA: A Challenging Benchmark for Arabic Language Understanding AbdelRahim Elmadany El Moatez Billah Nagoudi Muhammad Abdul-Mageed ELM 17 40 0 21 Dec 2022
Ensembling Transformers for Cross-domain Automatic Term Extraction T. Hanh Matej Martinc Andraz Pelicon Antoine Doucet Senja Pollak 16 5 0 12 Dec 2022
Transformer-Based Named Entity Recognition for French Using Adversarial Adaptation to Similar Domain Corpora Arjun Choudhry Pankaj Gupta Inder Khatri Aaryan Gupta Maxime Nicol Marie-Jean Meurs Dinesh Kumar Vishwakarma 16 5 0 05 Dec 2022
Human-in-the-Loop Hate Speech Classification in a Multilingual Context Ana Kotarcic Dominik Hangartner Fabrizio Gilardi Selina Kurer K. Donnay 24 2 0 05 Dec 2022
This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish Lukasz Augustyniak Kamil Tagowski Albert Sawczyn Denis Janiak Roman Bartusiak ... Arkadiusz Janz Piotr Szymański M. Morzy Tomasz Kajdanowicz Maciej Piasecki 18 10 0 23 Nov 2022
Local Structure Matters Most in Most Languages Louis Clouâtre Prasanna Parthasarathi Amal Zouaq Sarath Chandar 26 1 0 09 Nov 2022
Detecting Languages Unintelligible to Multilingual Models through Local Structure Probes Louis Clouâtre Prasanna Parthasarathi Amal Zouaq Sarath Chandar 33 3 0 09 Nov 2022
An Easy-to-use and Robust Approach for the Differentially Private De-Identification of Clinical Textual Documents Yakini Tchouka Jean-François Couchot David Laiymani OOD 26 1 0 02 Nov 2022
De-Identification of French Unstructured Clinical Notes for Machine Learning Tasks Yakini Tchouka Jean-François Couchot Maxime Coulmeau David Laiymani Philippe Selles Azzedine Rahmani OOD 6 5 0 16 Sep 2022
BERTifying Sinhala -- A Comprehensive Analysis of Pre-trained Language Models for Sinhala Text Classification Vinura Dhananjaya Piyumal Demotte Surangika Ranathunga Sanath Jayasena 24 13 0 16 Aug 2022
Compositional Evaluation on Japanese Textual Entailment and Similarity Hitomi Yanaka K. Mineshima 14 24 0 09 Aug 2022
Learning structures of the French clinical language: development and validation of word embedding models using 21 million clinical reports from electronic health records Basile Dura Charline Jean X. Tannier Alice Calliger R. Bey A. Neuraz R. Flicoteaux 11 11 0 26 Jul 2022
Benchmarking Transformers-based models on French Spoken Language Understanding tasks Oralie Cattan Sahar Ghannay Christophe Servan Sophie Rosset 28 4 0 19 Jul 2022
On the Usability of Transformers-based models for a French Question-Answering task Oralie Cattan Christophe Servan Sophie Rosset 12 14 0 19 Jul 2022
Multilingual Transformer Encoders: a Word-Level Task-Agnostic Evaluation Félix Gaschi François Plesse Parisa Rastin Y. Toussaint 20 8 0 19 Jul 2022
Effectiveness of French Language Models on Abstractive Dialogue Summarization Task Yongxin Zhou François Portet F. Ringeval 19 8 0 17 Jul 2022
A Spoken Drug Prescription Dataset in French for Spoken Language Understanding A. Kocabiyikoglu François Portet Prudence Gibert H. Blanchon Jean-Marc Babouchkine Gaetan Gavazzi 11 2 0 17 Jul 2022
Multimodal E-Commerce Product Classification Using Hierarchical Fusion Tsegaye Misikir Tashu Sara Fattouh Peter Kiss Tomáš Horváth 22 1 0 07 Jul 2022
ASR-Generated Text for Language Model Pre-training Applied to Speech Tasks Valentin Pelloin Franck Dary Nicolas Hervé Benoit Favre Nathalie Camelin Antoine Laurent Laurent Besacier 30 6 0 05 Jul 2022
Discovering Salient Neurons in Deep NLP Models Nadir Durrani Fahim Dalvi Hassan Sajjad KELM MILM 14 15 0 27 Jun 2022
DistilCamemBERT: a distillation of the French model CamemBERT Cyrile Delestre Abibatou Amar 24 5 0 23 May 2022
Blackbird's language matrices (BLMs): a new benchmark to investigate disentangled generalisation in neural networks Paola Merlo A. An M. A. Rodriguez 15 9 0 22 May 2022
Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Understanding Abbas Ghaddar Yimeng Wu Sunyam Bagga Ahmad Rashid Khalil Bibi ... Zhefeng Wang Baoxing Huai Xin Jiang Qun Liu Philippe Langlais 22 6 0 21 May 2022
Evaluation of Transfer Learning for Polish with a Text-to-Text Model Aleksandra Chrabrowa Lukasz Dragan Karol Grzegorczyk D. Kajtoch Mikołaj Koszowski Robert Mroczkowski Piotr Rybak 29 18 0 18 May 2022
TiBERT: Tibetan Pre-trained Language Model Yuan Sun Sisi Liu Junjie Deng Xiaobing Zhao 59 9 0 15 May 2022
Named Entity Recognition for Audio De-Identification Guillaume Baril P. Cardinal Alessandro Lameiras Koerich 14 3 0 26 Apr 2022
You Are What You Write: Preserving Privacy in the Era of Large Language Models Richard Plant V. Giuffrida Dimitra Gkatzia PILM 15 19 0 20 Apr 2022
ALBETO and DistilBETO: Lightweight Spanish Language Models J. Canete S. Donoso Felipe Bravo-Marquez Andrés Carvallo Vladimir Araujo 35 20 0 19 Apr 2022
Mono vs Multilingual BERT for Hate Speech Detection and Text Classification: A Case Study in Marathi Abhishek Velankar H. Patil Raviraj Joshi 28 31 0 19 Apr 2022
mGPT: Few-Shot Learners Go Multilingual Oleh Shliazhko Alena Fenogenova Maria Tikhonova Vladislav Mikhailov Anastasia Kozlova Tatiana Shavrina 38 148 0 15 Apr 2022
KOBEST: Korean Balanced Evaluation of Significant Tasks Dohyeong Kim Myeongjun Jang D. Kwon Eric Davis ALM 6 23 0 09 Apr 2022
CINO: A Chinese Minority Pre-trained Language Model Ziqing Yang Zihang Xu Yiming Cui Baoxin Wang Min-Bin Lin Dayong Wu Zhigang Chen 21 25 0 28 Feb 2022
Oolong: Investigating What Makes Transfer Learning Hard with Controlled Studies Zhengxuan Wu Alex Tamkin Isabel Papadimitriou 21 9 0 24 Feb 2022
Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models Alena Fenogenova Maria Tikhonova Vladislav Mikhailov Tatiana Shavrina Anton A. Emelyanov Denis Shevelev Alexander Kukushkin Valentin Malykh Ekaterina Artemova AAML VLM ELM 14 2 0 15 Feb 2022
Cedille: A large autoregressive French language model Martin Müller Florian Laurent 34 19 0 07 Feb 2022
Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data Yaoqing Yang Ryan Theisen Liam Hodgkinson Joseph E. Gonzalez Kannan Ramchandran Charles H. Martin Michael W. Mahoney 82 17 0 06 Feb 2022
L3Cube-MahaCorpus and MahaBERT: Marathi Monolingual Corpus, Marathi BERT Language Models, and Resources Raviraj Joshi 41 52 0 02 Feb 2022
Speech Resources in the Tamasheq Language Marcely Zanon Boito Fethi Bougares Florentin Barbier Souhir Gahbiche Loïc Barrault Mickael Rouvier Yannick Esteve 26 14 0 13 Jan 2022
Automatic Pharma News Categorization S. Adaszewski P. Kuner Ralf J. Jaeger OOD 16 3 0 28 Dec 2021
JABER and SABER: Junior and Senior Arabic BERt Abbas Ghaddar Yimeng Wu Ahmad Rashid Khalil Bibi Mehdi Rezagholizadeh ... Zhefeng Wang Baoxing Huai Xin Jiang Qun Liu Philippe Langlais 16 5 0 08 Dec 2021
ADBCMM: Acronym Disambiguation by Building Counterfactuals and Multilingual Mixing Yixuan Weng Fei Xia Bin Li Xiusheng Huang Shizhu He 4 4 0 08 Dec 2021
TunBERT: Pretrained Contextualized Text Representation for Tunisian Dialect Abir Messaoudi Ahmed Cheikhrouhou Hatem Haddad Nourchene Ferchichi Moez BenHajhmida Abir Korched Malek Naski Faten Ghriss Amine Kerkeni 11 8 0 25 Nov 2021
Personalized Benchmarking with the Ludwig Benchmarking Toolkit A. Narayan Piero Molino Karan Goel W. Neiswanger Christopher Ré 6 11 0 08 Nov 2021
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey Bonan Min Hayley L Ross Elior Sulem Amir Pouran Ben Veyseh Thien Huu Nguyen Oscar Sainz Eneko Agirre Ilana Heinz Dan Roth LM&MA VLM AI4CE 69 1,029 0 01 Nov 2021
Findings from Experiments of On-line Joint Reinforcement Learning of Semantic Parser and Dialogue Manager with real Users Matthieu Riou Bassam Jabaian Stéphane Huet F. Lefèvre OffRL 14 0 0 25 Oct 2021
Generating artificial texts as substitution or complement of training data Vincent Claveau Antoine Chaffin Ewa Kijak 23 9 0 25 Oct 2021
PAGnol: An Extra-Large French Generative Model Julien Launay E. L. Tommasone B. Pannier François Boniface A. Chatelain Alessandro Cappelli Iacopo Poli Djamé Seddah AILaw MoE AI4CE 38 8 0 16 Oct 2021
PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding Antoine Chaffin Vincent Claveau Ewa Kijak 23 36 0 28 Sep 2021