arXiv: 2010.03813 (v2, latest)
On the importance of pre-training data volume for compact language models
8 October 2020
Vincent Micheli, Martin d'Hoffschmidt, François Fleuret
Papers citing "On the importance of pre-training data volume for compact language models" (20 papers)

Spontaneous Speech Variables for Evaluating LLMs Cognitive Plausibility
Sheng-Fu Wang, Laurent Prevot, Jou-an Chi, Ri-Sheng Huang, Shu-Kai Hsieh (22 May 2025) [LRM]

TAROT: A Hierarchical Framework with Multitask Co-Pretraining on Semi-Structured Data towards Effective Person-Job Fit
Yihan Cao, Xu Chen, Lun Du, Hao Chen, Qiang Fu, Shi Han, Yushu Du, Yanbin Kang, Guangming Lu, Zi Li (15 Jan 2024)

Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages?
Luke Gessler, Nathan Schneider (01 Nov 2023)

Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings
David Samuel (30 Oct 2023)

Data-Efficient French Language Modeling with CamemBERTa
Wissam Antoun, Benoît Sagot, Djamé Seddah (02 Jun 2023)

Trained on 100 million words and still in shape: BERT meets British National Corpus
David Samuel, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal (17 Mar 2023)

MicroBERT: Effective Training of Low-resource Monolingual BERTs through Parameter Reduction and Multitask Learning
Luke Gessler, Amir Zeldes (23 Dec 2022)

Feature-Level Debiased Natural Language Understanding
Yougang Lyu, Piji Li, Yechang Yang, Maarten de Rijke, Pengjie Ren, Yukun Zhao, D. Yin, Zhaochun Ren (11 Dec 2022)

Benchmarking Transformers-based models on French Spoken Language Understanding tasks
Oralie Cattan, Sahar Ghannay, Christophe Servan, Sophie Rosset (19 Jul 2022)

On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model
Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, Hyoungseok Kim, ..., Kyunghyun Cho, Gichang Lee, W. Park, Jung-Woo Ha, Nako Sung (28 Apr 2022) [LRM]

Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability
Yoshinari Fujinuma, Jordan L. Boyd-Graber, Katharina Kann (21 Mar 2022) [AAML]

Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
Arij Riabi, Benoît Sagot, Djamé Seddah (26 Oct 2021)

FQuAD2.0: French Question Answering and knowing that you know nothing
Quentin Heinrich, Gautier Viaud, Wacim Belblidia (27 Sep 2021)

Legal Transformer Models May Not Always Help
Sakbo Geng, R. Lebret, Karl Aberer (14 Sep 2021) [VLM, AILaw, ELM]

On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets
Cheng-Han Chiang, Hung-yi Lee (08 Sep 2021) [SyDa]

How much pretraining data do language models need to learn syntax?
Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner (07 Sep 2021)

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models
Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor, Nizar Habash (11 Mar 2021)

Pre-Training a Language Model Without Human Language
Cheng-Han Chiang, Hung-yi Lee (22 Dec 2020)

When Do You Need Billions of Words of Pretraining Data?
Yian Zhang, Alex Warstadt, Haau-Sing Li, Samuel R. Bowman (10 Nov 2020)

When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah (24 Oct 2020) [LRM]