
On the importance of pre-training data volume for compact language models

8 October 2020
Vincent Micheli, Martin d'Hoffschmidt, François Fleuret
arXiv: 2010.03813

Papers citing "On the importance of pre-training data volume for compact language models"

20 citing papers shown.
Spontaneous Speech Variables for Evaluating LLMs Cognitive Plausibility
Sheng-Fu Wang, Laurent Prevot, Jou-an Chi, Ri-Sheng Huang, Shu-Kai Hsieh
22 May 2025 · LRM

TAROT: A Hierarchical Framework with Multitask Co-Pretraining on Semi-Structured Data towards Effective Person-Job Fit
Yihan Cao, Xu Chen, Lun Du, Hao Chen, Qiang Fu, Shi Han, Yushu Du, Yanbin Kang, Guangming Lu, Zi Li
15 Jan 2024

Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages?
Luke Gessler, Nathan Schneider
01 Nov 2023

Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings
David Samuel
30 Oct 2023

Data-Efficient French Language Modeling with CamemBERTa
Wissam Antoun, Benoît Sagot, Djamé Seddah
02 Jun 2023

Trained on 100 million words and still in shape: BERT meets British National Corpus
David Samuel, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal
17 Mar 2023

MicroBERT: Effective Training of Low-resource Monolingual BERTs through Parameter Reduction and Multitask Learning
Luke Gessler, Amir Zeldes
23 Dec 2022

Feature-Level Debiased Natural Language Understanding
Yougang Lyu, Piji Li, Yechang Yang, Maarten de Rijke, Pengjie Ren, Yukun Zhao, D. Yin, Zhaochun Ren
11 Dec 2022

Benchmarking Transformers-based models on French Spoken Language Understanding tasks
Oralie Cattan, Sahar Ghannay, Christophe Servan, Sophie Rosset
19 Jul 2022

On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model
Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, Hyoungseok Kim, ..., Kyunghyun Cho, Gichang Lee, W. Park, Jung-Woo Ha, Nako Sung
28 Apr 2022 · LRM

Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability
Yoshinari Fujinuma, Jordan L. Boyd-Graber, Katharina Kann
21 Mar 2022 · AAML

Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
Arij Riabi, Benoît Sagot, Djamé Seddah
26 Oct 2021

FQuAD2.0: French Question Answering and knowing that you know nothing
Quentin Heinrich, Gautier Viaud, Wacim Belblidia
27 Sep 2021

Legal Transformer Models May Not Always Help
Saibo Geng, R. Lebret, Karl Aberer
14 Sep 2021 · VLM · AILaw · ELM

On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets
Cheng-Han Chiang, Hung-yi Lee
08 Sep 2021 · SyDa

How much pretraining data do language models need to learn syntax?
Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner
07 Sep 2021

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models
Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor, Nizar Habash
11 Mar 2021

Pre-Training a Language Model Without Human Language
Cheng-Han Chiang, Hung-yi Lee
22 Dec 2020

When Do You Need Billions of Words of Pretraining Data?
Yian Zhang, Alex Warstadt, Haau-Sing Li, Samuel R. Bowman
10 Nov 2020

When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
24 Oct 2020 · LRM