1912.05372
FlauBERT: Unsupervised Language Model Pre-training for French
11 December 2019
Hang Le
Loïc Vial
Jibril Frej
Vincent Segonne
Maximin Coavoux
Benjamin Lecouteux
A. Allauzen
Benoît Crabbé
Laurent Besacier
D. Schwab
AI4CE
Papers citing
"FlauBERT: Unsupervised Language Model Pre-training for French"
50 / 159 papers shown
Title
ModernBERT or DeBERTaV3? Examining Architecture and Data Influence on Transformer Encoder Models Performance
Wissam Antoun
B. Sagot
Djamé Seddah
MQ
35
0
0
11 Apr 2025
Where Are We? Evaluating LLM Performance on African Languages
Ife Adebara
Hawau Olamide Toyin
Nahom Tesfu Ghebremichael
AbdelRahim Elmadany
Muhammad Abdul-Mageed
52
0
0
26 Feb 2025
Extraction multi-étiquettes de relations en utilisant des couches de Transformer
Ngoc Luyen Le
Gildas Tagny Ngompé
60
0
0
24 Feb 2025
Text Generation Models for Luxembourgish with Limited Data: A Balanced Multilingual Strategy
Alistair Plum
Tharindu Ranasinghe
Christoph Purschke
64
2
0
12 Dec 2024
Bilingual BSARD: Extending Statutory Article Retrieval to Dutch
Ehsan Lotfi
Nikolay Banar
Nerses Yuzbashyan
Walter Daelemans
AILaw
69
0
0
10 Dec 2024
Can bidirectional encoder become the ultimate winner for downstream applications of foundation models?
Lewen Yang
Xuanyu Zhou
Juao Fan
Xinyi Xie
Shengxin Zhu
AI4CE
64
0
0
27 Nov 2024
Training Bilingual LMs with Data Constraints in the Targeted Language
Skyler Seto
Maartje ter Hoeve
He Bai
Natalie Schluter
David Grangier
77
0
0
20 Nov 2024
VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks
Shailaja Keyur Sampat
Mutsumi Nakamura
Shankar Kailas
Kartik Aggarwal
Mandy Zhou
Yezhou Yang
Chitta Baral
MLLM
CoGe
ReLM
VLM
LRM
32
0
0
17 Oct 2024
Explanation sensitivity to the randomness of large language models: the case of journalistic text classification
Jérémie Bogaert
Marie-Catherine de Marneffe
Antonin Descampe
Louis Escouflaire
Cedrick Fairon
François-Xavier Standaert
19
1
0
07 Oct 2024
Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain
Antoine Louis
Gijs van Dijck
Gerasimos Spanakis
21
0
0
02 Sep 2024
Detecting the terminality of speech-turn boundary for spoken interactions in French TV and Radio content
Rémi Uro
Marie Tahon
D. Doukhan
Antoine Laurent
Albert Rilliard
25
0
0
14 Jun 2024
MTEB-French: Resources for French Sentence Embedding Evaluation and Analysis
Mathieu Ciancone
Imene Kerboua
Marion Schaeffer
W. Siblini
35
2
0
30 May 2024
Quantifying the Gain in Weak-to-Strong Generalization
Moses Charikar
Chirag Pabbaraju
Kirankumar Shiragur
ELM
24
16
0
24 May 2024
Language Models on a Diet: Cost-Efficient Development of Encoders for Closely-Related Languages via Additional Pretraining
Nikola Ljubesic
Vít Suchomel
Peter Rupnik
Taja Kuzman
Rik van Noord
CLL
19
5
0
08 Apr 2024
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
Yuemei Xu
Ling Hu
Jiayi Zhao
Zihan Qiu
Yuqi Ye
Hanwen Gu
LRM
19
36
0
01 Apr 2024
A Benchmark Evaluation of Clinical Named Entity Recognition in French
N. Bannour
Christophe Servan
Aurélie Névéol
Xavier Tannier
16
0
0
28 Mar 2024
VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding
Phong Nguyen-Thuan Do
Son Quoc Tran
Phu Gia Hoang
Kiet Van Nguyen
N. Nguyen
ELM
42
3
0
23 Mar 2024
Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models
Zhixue Zhao
Nikolaos Aletras
24
3
0
19 Mar 2024
Ignore Me But Don't Replace Me: Utilizing Non-Linguistic Elements for Pretraining on the Cybersecurity Domain
Eugene Jang
Jian Cui
Dayeon Yim
Youngjin Jin
Jin-Woo Chung
Seung-Eui Shin
Yongjae Lee
49
2
0
15 Mar 2024
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages
Rik van Noord
Taja Kuzman
Peter Rupnik
Nikola Ljubesic
Miquel Espla-Gomis
Gema Ramírez-Sánchez
Antonio Toral
ALM
25
1
0
13 Mar 2024
DrBenchmark: A Large Language Understanding Evaluation Benchmark for French Biomedical Domain
Yanis Labrak
Adrien Bazoge
Oumaima El Khettari
Mickael Rouvier
Pacome Constant dit Beaufils
...
B. Daille
Solen Quiniou
Emmanuel Morin
P. Gourraud
Richard Dufour
LM&MA
19
6
0
20 Feb 2024
Enhancing ESG Impact Type Identification through Early Fusion and Multilingual Models
Hariram Veeramani
Surendrabikram Thapa
Usman Naseem
6
5
0
16 Feb 2024
Traditional Machine Learning Models and Bidirectional Encoder Representations From Transformer (BERT)-Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study
J. Benítez-Andrades
José-Manuel Alija-Pérez
Maria-Esther Vidal
R. Pastor-Vargas
María Teresa García-Ordás
11
36
0
08 Feb 2024
Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts
J. Benítez-Andrades
María Teresa García-Ordás
Mayra Russo
Ahmad Sakor
Luis Daniel Fernandes Rotger
Maria-Esther Vidal
AI4MH
82
3
0
08 Feb 2024
LLaMandement: Large Language Models for Summarization of French Legislative Proposals
Joseph Gesnouin
Yannis Tannier
Christophe Gomes Da Silva
Hatim Tapory
Camille Brier
...
Emmanuel Cortes
Pierre-Etienne Devineau
Ulrich Tan
Esther Mac Namara
Su Yang
AILaw
31
8
0
29 Jan 2024
Cascaded Cross-Modal Transformer for Audio-Textual Classification
Nicolae-Cătălin Ristea
Andrei Anghel
Radu Tudor Ionescu
28
2
0
15 Jan 2024
Building Efficient and Effective OpenQA Systems for Low-Resource Languages
Emrah Budur
Rıza Özçelik
Dilara Soylu
Omar Khattab
Tunga Güngör
Christopher Potts
30
1
0
07 Jan 2024
FREDSum: A Dialogue Summarization Corpus for French Political Debates
Virgile Rennard
Guokan Shang
Damien Grari
Julie Hunter
Michalis Vazirgiannis
23
3
0
08 Dec 2023
Spoken Dialogue System for Medical Prescription Acquisition on Smartphone: Development, Corpus and Evaluation
A. Kocabiyikoglu
François Portet
Jean-Marc Babouchkine
Prudence Gibert
H. Blanchon
Gaetan Gavazzi
11
1
0
06 Nov 2023
H2O Open Ecosystem for State-of-the-art Large Language Models
Arno Candel
Jon McKinney
Philipp Singer
Pascal Pfeiffer
Maximilian Jeblick
Chun Ming Lee
Marcos V. Conde
VLM
17
4
0
17 Oct 2023
Acoustic and linguistic representations for speech continuous emotion recognition in call center conversations
Manon Macary
Marie Tahon
Yannick Esteve
Daniel Luzzati
21
3
0
06 Oct 2023
A Family of Pretrained Transformer Language Models for Russian
Dmitry Zmitrovich
Alexander Abramov
Andrey Kalmykov
Maria Tikhonova
Ekaterina Taktasheva
...
Vitalii Kadulin
Sergey Markov
Tatiana Shavrina
Vladislav Mikhailov
Alena Fenogenova
28
26
0
19 Sep 2023
Balanced and Explainable Social Media Analysis for Public Health with Large Language Models
Yan Jiang
Ruihong Qiu
Yi Zhang
Peng Zhang
8
7
0
12 Sep 2023
The DeepZen Speech Synthesis System for Blizzard Challenge 2023
C. Veaux
R. Maia
Spyridoula Papendreou
18
1
0
30 Aug 2023
Multiscale Contextual Learning for Speech Emotion Recognition in Emergency Call Center Conversations
Théo Deschamps-Berger
L. Lamel
Laurence Devillers
19
2
0
28 Aug 2023
Spanish Pre-trained BERT Model and Evaluation Data
J. Cañete
Gabriel Chaperon
Rodrigo Fuentes
Jou-Hui Ho
Hojin Kang
Jorge Pérez
25
654
0
06 Aug 2023
Cascaded Cross-Modal Transformer for Request and Complaint Detection
Nicolae-Cătălin Ristea
Radu Tudor Ionescu
18
3
0
27 Jul 2023
CamemBERT-bio: Leveraging Continual Pre-training for Cost-Effective Models on French Biomedical Data
Rian Touchent
Laurent Romary
Eric Villemonte de la Clergerie
MedIm
13
4
0
27 Jun 2023
Exploring Attention Mechanisms for Multimodal Emotion Recognition in an Emergency Call Center Corpus
Théo Deschamps-Berger
L. Lamel
Laurence Devillers
26
8
0
12 Jun 2023
A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks
Saidul Islam
Hanae Elmekki
Ahmed Elsebai
Jamal Bentahar
Najat Drawel
Gaith Rjoub
Witold Pedrycz
ViT
MedIm
19
170
0
11 Jun 2023
bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark
Momchil Hardalov
Pepa Atanasova
Todor Mihaylov
G. Angelova
K. Simov
P. Osenova
Ves Stoyanov
Ivan Koychev
Preslav Nakov
Dragomir R. Radev
ELM
FedML
21
4
0
04 Jun 2023
Impact of translation on biomedical information extraction from real-life clinical notes
C. Gérardin
Yu Xiong
Perceval Wajsburt
F. Carrat
X. Tannier
11
1
0
03 Jun 2023
Data-Efficient French Language Modeling with CamemBERTa
Wissam Antoun
Benoît Sagot
Djamé Seddah
15
7
0
02 Jun 2023
DUMB: A Benchmark for Smart Evaluation of Dutch Models
Wietse de Vries
Martijn B. Wieling
Malvina Nissim
ELM
ALM
MoE
26
6
0
22 May 2023
Systematic Review on Reinforcement Learning in the Field of Fintech
Nadeem Malibari
Iyad A. Katib
Rashid Mehmood
OffRL
13
3
0
29 Apr 2023
Automatic ICD-10 Code Association: A Challenging Task on French Clinical Texts
Yakini Tchouka
Jean-François Couchot
David Laiymani
Philippe Selles
Azzedine Rahmani
25
3
0
06 Apr 2023
A Bibliometric Review of Large Language Models Research from 2017 to 2023
Lizhou Fan
Lingyao Li
Zihui Ma
Sanggyu Lee
Huizi Yu
Libby Hemphill
34
147
0
03 Apr 2023
SwissBERT: The Multilingual Language Model for Switzerland
Jannis Vamvas
Johannes Graen
Rico Sennrich
25
6
0
23 Mar 2023
Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset
Thanh-Dung Le
P. Jouvet
R. Noumeir
MoE
MedIm
67
5
0
22 Mar 2023
Alloprof: a new French question-answer education dataset and its use in an information retrieval case study
Antoine Lefebvre-Brossard
Stephane Gazaille
Michel C. Desmarais
AI4Ed
19
1
0
10 Feb 2023