Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

arXiv:2109.01819 · 4 September 2021

Atsuki Yamaguchi, G. Chrysostomou, Katerina Margatina, Nikolaos Aletras

Papers citing "Frustratingly Simple Pretraining Alternatives to Masked Language Modeling"

16 / 16 papers shown

Linguistic Blind Spots of Large Language Models
Jiali Cheng, Hadi Amiri
25 Mar 2025

Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos, Rick Battle, Danda B. Rawat
LM&MA, OffRL
20 Jul 2024

Ignore Me But Don't Replace Me: Utilizing Non-Linguistic Elements for Pretraining on the Cybersecurity Domain
Eugene Jang, Jian Cui, Dayeon Yim, Youngjin Jin, Jin-Woo Chung, Seung-Eui Shin, Yongjae Lee
15 Mar 2024

Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras
AAML
26 Oct 2023

BIOptimus: Pre-training an Optimal Biomedical Language Model with Curriculum Learning for Named Entity Recognition
Vera Pavlova, M. Makhlouf
16 Aug 2023

GeneMask: Fast Pretraining of Gene Sequences to Enable Few-Shot Learning
Soumyadeep Roy, Jonas Wallat, Sowmya S. Sundaram, Wolfgang Nejdl, Niloy Ganguly
29 Jul 2023

How does the task complexity of masked pretraining objectives affect downstream performance?
Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa
18 May 2023

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE
Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, Dacheng Tao
VLM
18 Feb 2023

ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for E-Commerce Product Search
Xuange Cui, Wei Xiong, Songlin Wang
31 Jan 2023

Language Model Pre-training on True Negatives
Zhuosheng Zhang, Hai Zhao, Masao Utiyama, Eiichiro Sumita
01 Dec 2022

HashFormers: Towards Vocabulary-independent Pre-trained Transformers
Huiyin Xue, Nikolaos Aletras
14 Oct 2022

Instance Regularization for Discriminative Language Model Pre-training
Zhuosheng Zhang, Hai Zhao, M. Zhou
11 Oct 2022

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao
30 May 2022

How does the pre-training objective affect what large language models learn about linguistic properties?
Ahmed Alajrami, Nikolaos Aletras
20 Mar 2022

Should You Mask 15% in Masked Language Modeling?
Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
CVBM
16 Feb 2022

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
20 Apr 2018