Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2202.08005
Cited By
Should You Mask 15% in Masked Language Modeling?
16 February 2022
Alexander Wettig
Tianyu Gao
Zexuan Zhong
Danqi Chen
CVBM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Should You Mask 15% in Masked Language Modeling?"
30 / 30 papers shown
Title
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation
Hannes Waldetoft
Jakob Torgander
Måns Magnusson
22
0
0
05 May 2025
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt
Aaron Mueller
Leshem Choshen
E. Wilcox
Chengxu Zhuang
...
Rafael Mosquera
Bhargavi Paranjape
Adina Williams
Tal Linzen
Ryan Cotterell
38
106
0
10 Apr 2025
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
79
1
0
07 Mar 2025
Sequence-level Large Language Model Training with Contrastive Preference Optimization
Zhili Feng
Dhananjay Ram
Cole Hawkins
Aditya Rawal
Jinman Zhao
Sheng Zha
57
0
0
23 Feb 2025
Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text
Andrei Jarca
Florinel-Alin Croitoru
Radu Tudor Ionescu
44
0
0
18 Feb 2025
An Annotated Dataset of Errors in Premodern Greek and Baselines for Detecting Them
Creston Brooks
J. Haubold
Charlie Cowen-Breen
Jay White
Desmond DeVaul
Frederick Riemenschneider
Karthik Narasimhan
B. Graziosi
18
0
0
14 Oct 2024
How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
Rheeya Uppaal
Yixuan Li
Junjie Hu
32
4
0
31 Jan 2024
Leveraging Domain Adaptation and Data Augmentation to Improve Quránic IR in English and Arabic
Vera Pavlova
16
2
0
05 Dec 2023
netFound: Foundation Model for Network Security
Satyandra Guthula
Navya Battula
Roman Beltiukov
Wenbo Guo
Arpit Gupta
Inder Monga
21
13
0
25 Oct 2023
AdapterEM: Pre-trained Language Model Adaptation for Generalized Entity Matching using Adapter-tuning
John Bosco Mugeni
S. Lynden
Toshiyuki Amagasa
Akiyoshi Matono
10
2
0
30 May 2023
Privacy-Preserving Prompt Tuning for Large Language Model Services
Yansong Li
Zhixing Tan
Yang Liu
SILM
VLM
43
63
0
10 May 2023
UnICLAM:Contrastive Representation Learning with Adversarial Masking for Unified and Interpretable Medical Vision Question Answering
Chenlu Zhan
Peng Peng
Hongsen Wang
Tao Chen
Hongwei Wang
MedIm
8
3
0
21 Dec 2022
Query-as-context Pre-training for Dense Passage Retrieval
Xing Wu
Guangyuan Ma
Wanhui Qian
Zijia Lin
Songlin Hu
15
9
0
19 Dec 2022
Uniform Masking Prevails in Vision-Language Pretraining
Siddharth Verma
Yuchen Lu
Rui Hou
Hanchao Yu
Nicolas Ballas
Madian Khabsa
Amjad Almahairi
VLM
21
0
0
10 Dec 2022
Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
Kai Shen
Yichong Leng
Xuejiao Tan
Si-Qi Tang
Yuan Zhang
Wenjie Liu
Ed Lin
22
13
0
23 Nov 2022
Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token
Baohao Liao
David Thulke
Sanjika Hewavitharana
Hermann Ney
Christof Monz
22
9
0
09 Nov 2022
Word Order Matters when you Increase Masking
Karim Lasri
Alessandro Lenci
Thierry Poibeau
28
7
0
08 Nov 2022
InforMask: Unsupervised Informative Masking for Language Model Pretraining
Nafis Sadeq
Canwen Xu
Julian McAuley
19
13
0
21 Oct 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022
Learning Better Masking for Better Language Model Pre-training
Dongjie Yang
Zhuosheng Zhang
Hai Zhao
16
15
0
23 Aug 2022
ConTextual Masked Auto-Encoder for Dense Passage Retrieval
Xing Wu
Guangyuan Ma
Meng Lin
Zijia Lin
Zhongyuan Wang
Songlin Hu
RALM
19
24
0
16 Aug 2022
Masked Vision and Language Modeling for Multi-modal Representation Learning
Gukyeong Kwon
Zhaowei Cai
Avinash Ravichandran
Erhan Bas
Rahul Bhotika
Stefano Soatto
22
67
0
03 Aug 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
42
70
0
30 Jul 2022
UL2: Unifying Language Learning Paradigms
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
Xavier Garcia
Jason W. Wei
...
Tal Schuster
H. Zheng
Denny Zhou
N. Houlsby
Donald Metzler
AI4CE
33
293
0
10 May 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
6
55
0
06 Apr 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,815
0
17 Sep 2019
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita
Rico Sennrich
Ivan Titov
188
181
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,150
0
16 Jan 2013
1