Train No Evil: Selective Masking for Task-Guided Pre-Training

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
21 April 2020
Yuxian Gu
Zhengyan Zhang
Xiaozhi Wang
Zhiyuan Liu
Maosong Sun
arXiv:2004.09733 (abs) · PDF · HTML · GitHub (71★)

Papers citing "Train No Evil: Selective Masking for Task-Guided Pre-Training"

36 papers
ChemFixer: Correcting Invalid Molecules to Unlock Previously Unseen Chemical Space
IEEE Journal of Biomedical and Health Informatics (JBHI), 2025
Jun-Hyoung Park
Ho-Jun Song
Seong-Whan Lee
14 Nov 2025
MOSAIC: Masked Objective with Selective Adaptation for In-domain Contrastive Learning
Vera Pavlova
Mohammed Makhlouf
19 Oct 2025
TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA
Chanjoo Jung
Jaehyung Kim
06 Oct 2025
The Efficiency of Pre-training with Objective Masking in Pseudo Labeling for Semi-Supervised Text Classification
Arezoo Hatefi
Xuan-Son Vu
Monowar Bhuyan
Frank Drewes
10 May 2025
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Hai Ye
Yixin Ji
Ziyang Chen
Qiang Wang
Cunxiang Wang
...
Jia Xu
Zhongyi Liu
Jinjie Gu
Yuan Zhou
Linjian Mo
02 Dec 2024
KidLM: Advancing Language Models for Children -- Early Insights and Future Directions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Mir Tafseer Nayeem
Davood Rafiei
04 Oct 2024
Rho-1: Not All Tokens Are What You Need
Zheng-Wen Lin
Zhibin Gou
Yeyun Gong
Xiao Liu
Haoran Pan
...
Chen Lin
Yujiu Yang
Jian Jiao
Nan Duan
Weizhu Chen
11 Apr 2024
Ignore Me But Don't Replace Me: Utilizing Non-Linguistic Elements for Pretraining on the Cybersecurity Domain
Eugene Jang
Jian Cui
Dayeon Yim
Youngjin Jin
Jin-Woo Chung
Seung-Eui Shin
Yongjae Lee
15 Mar 2024
How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
Rheeya Uppaal
Yixuan Li
Junjie Hu
31 Jan 2024
EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data
Shirong Ma
Shen Huang
Shulin Huang
Xiaobin Wang
Yangning Li
Hai-Tao Zheng
Pengjun Xie
Fei Huang
Yong Jiang
25 Dec 2023
Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways
Venkata S Govindarajan
Juan Diego Rodriguez
Kaj Bostrom
Kyle Mahowald
26 Oct 2023
An Anchor Learning Approach for Citation Field Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zilin Yuan
Borun Chen
Yimeng Dai
Hai-Tao Zheng
Rui Zhang
07 Sep 2023
BIOptimus: Pre-training an Optimal Biomedical Language Model with Curriculum Learning for Named Entity Recognition
Workshop on Biomedical Natural Language Processing (BioNLP), 2023
Vera Pavlova
M. Makhlouf
16 Aug 2023
Do not Mask Randomly: Effective Domain-adaptive Pre-training by Masking In-domain Keywords
Workshop on Representation Learning for NLP (RepL4NLP), 2023
Shahriar Golchin
Mihai Surdeanu
N. Tavabi
A. Kiapour
14 Jul 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
23 May 2023
Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Renliang Sun
Wei Xu
Xiaojun Wan
21 May 2023
APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Soumya Sanyal
Yichong Xu
Shuohang Wang
Ziyi Yang
Reid Pryzant
Wenhao Yu
Chenguang Zhu
Xiang Ren
19 Dec 2022
Continual Knowledge Distillation for Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yuan Zhang
Peng Li
Maosong Sun
Yang Liu
18 Dec 2022
Using Selective Masking as a Bridge between Pre-training and Fine-tuning
Tanish Lad
Himanshu Maheshwari
Shreyas Kottukkal
R. Mamidi
24 Nov 2022
Scheduled Multi-task Learning for Neural Chat Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yunlong Liang
Fandong Meng
Jinan Xu
Jie Zhou
08 May 2022
Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base
Natural Language Processing and Chinese Computing (NLPCC), 2022
Cunxiang Wang
Fuli Luo
Yanyang Li
Runxin Xu
Fei Huang
Yue Zhang
17 Apr 2022
SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words
Renliang Sun
Xiaojun Wan
16 Apr 2022
A Survey on Dropout Methods and Experimental Verification in Recommendation
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022
Yongqian Li
Weizhi Ma
C. L. Philip Chen
Hao Fei
Yiqun Liu
Shaoping Ma
Yue Yang
05 Apr 2022
Task-guided Disentangled Tuning for Pretrained Language Models
Findings of the Association for Computational Linguistics, 2022
Jiali Zeng
Yu Jiang
Shuangzhi Wu
Yongjing Yin
Mu Li
22 Mar 2022
KESA: A Knowledge Enhanced Approach For Sentiment Analysis
Qinghua Zhao
Shuai Ma
Shuo Ren
24 Feb 2022
Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis
AAAI Conference on Artificial Intelligence (AAAI), 2022
L. Ein-Dor
Ilya Shnayderman
Artem Spector
Lena Dankin
R. Aharonov
Noam Slonim
06 Jan 2022
MoCA: Incorporating Multi-stage Domain Pretraining and Cross-guided Multimodal Attention for Textbook Question Answering
Fangzhi Xu
Qika Lin
Jing Liu
Lingling Zhang
Tianzhe Zhao
Qianyi Chai
Yudai Pan
06 Dec 2021
NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework
Xingcheng Yao
Yanan Zheng
Xiaocong Yang
Zhilin Yang
07 Nov 2021
Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning
Chiyu Zhang
Muhammad Abdul-Mageed
01 Aug 2021
Pre-Trained Models: Past, Present and Future
AI Open (AO), 2021
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
14 Jun 2021
CLEVE: Contrastive Pre-training for Event Extraction
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Ziqi Wang
Xiaozhi Wang
Xu Han
Yankai Lin
Lei Hou
Zhiyuan Liu
Peng Li
Juan-Zi Li
Jie Zhou
30 May 2021
On the Influence of Masking Policies in Intermediate Pre-training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Qinyuan Ye
Belinda Z. Li
Sinong Wang
Benjamin Bolte
Hao Ma
Anuj Kumar
Xiang Ren
Madian Khabsa
18 Apr 2021
CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Yusheng Su
Xu Han
Yankai Lin
Zhengyan Zhang
Zhiyuan Liu
Peng Li
Jie Zhou
Maosong Sun
07 Feb 2021
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye
Belinda Z. Li
Sinong Wang
Benjamin Bolte
Hao Ma
Anuj Kumar
Xiang Ren
Madian Khabsa
31 Dec 2020
Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model
Ming Zheng
Dinghan Shen
Yelong Shen
Weizhu Chen
Lin Xiao
12 Oct 2020
Improving Low Compute Language Modeling with In-Domain Embedding Initialisation
Charles F Welch
Amélie Reymond
Jonathan K. Kummerfeld
29 Sep 2020