Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM, AI4CE, CLL

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 1,369 papers shown
ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language Model for Reading Comprehension of Abstract Meaning
International Workshop on Semantic Evaluation (SemEval), 2021
Xin Xie
Xiangnan Chen
Xiang Chen
Yong Wang
Ningyu Zhang
Shumin Deng
Huajun Chen
190
2
0
25 Feb 2021
BERT-based Acronym Disambiguation with Multiple Training Strategies
Chunguang Pan
Bingyan Song
Shengguang Wang
Zhipeng Luo
165
19
0
25 Feb 2021
LogME: Practical Assessment of Pre-trained Models for Transfer Learning
International Conference on Machine Learning (ICML), 2021
Kaichao You
Yong Liu
Jianmin Wang
Mingsheng Long
304
231
0
22 Feb 2021
Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
The Web Conference (WWW), 2021
Tingyu Xia
Yue Wang
Yuan Tian
Yi-Ju Chang
136
55
0
22 Feb 2021
An Empirical Study on Measuring the Similarity of Sentential Arguments with Language Model Domain Adaptation
Yujin Baek
Sang-gyu Seo
94
0
0
19 Feb 2021
Boosting Low-Resource Biomedical QA via Entity-Aware Masking Strategies
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Gabriele Pergola
E. Kochkina
Lin Gui
Maria Liakata
Yulan He
261
33
0
16 Feb 2021
Domain Adaptation for Time Series Forecasting via Attention Sharing
International Conference on Machine Learning (ICML), 2021
Xiaoyong Jin
Youngsuk Park
Danielle C. Maddix
Bernie Wang
Xifeng Yan
TTA, OOD, AI4TS
667
105
0
13 Feb 2021
Characterizing English Variation across Social Media Communities with BERT
Transactions of the Association for Computational Linguistics (TACL), 2021
L. Lucy
David Bamman
194
41
0
12 Feb 2021
Text Compression-aided Transformer Encoding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Z. Li
Zhuosheng Zhang
Hai Zhao
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
AI4CE
126
47
0
11 Feb 2021
A Hybrid Task-Oriented Dialog System with Domain and Task Adaptive Pretraining
Boliang Zhang
Ying Lyu
Ning Ding
Shangda Wu
Zhaoyang Jia
Kun Han
Kevin Knight
VLM
117
5
0
08 Feb 2021
Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Betty van Aken
Jens-Michalis Papaioannou
M. Mayrdorfer
Klemens Budde
Felix Alexander Gers
Alexander Löser
126
84
0
08 Feb 2021
CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Yusheng Su
Xu Han
Yankai Lin
Zhengyan Zhang
Zhiyuan Liu
Peng Li
Jie Zhou
Maosong Sun
169
12
0
07 Feb 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Neural Information Processing Systems (NeurIPS), 2021
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
...
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
VLM
409
250
0
03 Feb 2021
AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
Yuhan Liu
Saurabh Agarwal
Shivaram Venkataraman
OffRL
236
71
0
02 Feb 2021
SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification
Sai Muralidhar Jayanthi
Akshat Gupta
VLM
129
35
0
01 Feb 2021
"Laughing at you or with you": The Role of Sarcasm in Shaping the
  Disagreement Space
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Debanjan Ghosh
Ritvik Shrivastava
Smaranda Muresan
78
15
0
26 Jan 2021
Word Alignment by Fine-tuning Embeddings on Parallel Corpora
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Zi-Yi Dou
Graham Neubig
494
295
0
20 Jan 2021
Task Adaptive Pretraining of Transformers for Hostility Detection
Tathagata Raha
Sayar Ghosh Roy
Ujwal Narayan
Zubair Abid
Vasudeva Varma
192
9
0
09 Jan 2021
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye
Belinda Z. Li
Sinong Wang
Benjamin Bolte
Hao Ma
Anuj Kumar
Xiang Ren
Madian Khabsa
OffRL
256
12
0
31 Dec 2020
Promoting Graph Awareness in Linearized Graph-to-Text Generation
Findings (Findings), 2020
Alexander Miserlis Hoyle
Ana Marasović
Noah A. Smith
AI4CE
169
32
0
31 Dec 2020
CoCoLM: COmplex COmmonsense Enhanced Language Model with Discourse Relations
Findings (Findings), 2020
Changlong Yu
Hongming Zhang
Yangqiu Song
Wilfred Ng
243
23
0
31 Dec 2020
Automated Lay Language Summarization of Biomedical Scientific Reviews
AAAI Conference on Artificial Intelligence (AAAI), 2020
Yue Guo
Weijian Qiu
Yizhong Wang
T. Cohen
375
96
0
23 Dec 2020
Pre-Training a Language Model Without Human Language
Cheng-Han Chiang
Hung-yi Lee
156
13
0
22 Dec 2020
A Graph Reasoning Network for Multi-turn Response Selection via Customized Pre-training
AAAI Conference on Artificial Intelligence (AAAI), 2020
Yongkang Liu
Shi Feng
Daling Wang
Kaisong Song
Feiliang Ren
Yifei Zhang
LRM
121
22
0
21 Dec 2020
Continual Lifelong Learning in Natural Language Processing: A Survey
International Conference on Computational Linguistics (COLING), 2020
Magdalena Biesialska
Katarzyna Biesialska
Marta R. Costa-jussá
KELM, CLL
269
247
0
17 Dec 2020
MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification
AAAI Conference on Artificial Intelligence (AAAI), 2020
Te-Lin Wu
Shikhar Singh
S. Paul
Gully A. Burns
Nanyun Peng
95
21
0
16 Dec 2020
*-CFQ: Analyzing the Scalability of Machine Learning on a Compositional Task
AAAI Conference on Artificial Intelligence (AAAI), 2020
Dmitry Tsarkov
Tibor Tihon
Nathan Scales
Nikola Momchev
Danila Sinopalnikov
Nathanael Schärli
153
17
0
15 Dec 2020
User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis
Oliver Adams
Benjamin Galliot
Guillaume Wisniewski
Nicholas Lambourne
Ben Foley
...
Laurent Besacier
Christopher Cox
Katya Aplonova
Guillaume Jacques
Nathan W. Hill
231
13
0
15 Dec 2020
Causal BERT : Language models for causality detection between events expressed in text
Vivek Khetan
Roshni Ramnani
M. Anand
Shubhashis Sengupta
Andrew E. Fano
207
53
0
10 Dec 2020
CrossNER: Evaluating Cross-Domain Named Entity Recognition
Zihan Liu
Yan Xu
Tiezheng Yu
Wenliang Dai
Ziwei Ji
Samuel Cahyawijaya
Andrea Madotto
Pascale Fung
318
183
0
08 Dec 2020
Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks
Modestas Filipavicius
Matteo Manica
Joris Cadow
María Rodríguez Martínez
267
15
0
05 Dec 2020
Context in Informational Bias Detection
International Conference on Computational Linguistics (COLING), 2020
Esther van den Berg
K. Markert
112
22
0
03 Dec 2020
End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training
R. Reddy
Bhavani Iyer
Md Arafat Sultan
Rong Zhang
Avirup Sil
Vittorio Castelli
Radu Florian
Salim Roukos
OOD
151
19
0
02 Dec 2020
Cross-Domain Generalization Through Memorization: A Study of Nearest Neighbors in Neural Duplicate Question Detection
Yadollah Yaghoobzadeh
Alexandre Rochette
Timothy J. Hazen
OOD
112
1
0
22 Nov 2020
Out-of-Task Training for Dialog State Tracking Models
International Conference on Computational Linguistics (COLING), 2020
Michael Heck
Carel van Niekerk
Nurul Lubis
Christian Geishauser
Hsien-Chin Lin
Marco Moresi
Milica Gašić
133
3
0
18 Nov 2020
Predictions For Pre-training Language Models
Tonglei Guo
142
0
0
18 Nov 2020
Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining
Zijun Sun
Chun Fan
Xiaofei Sun
Yuxian Meng
Leilei Gan
Jiwei Li
167
9
0
17 Nov 2020
IIT_kgp at FinCausal 2020, Shared Task 1: Causality Detection using Sentence Embeddings in Financial Reports
Arka Mitra
Harshvardhan Srivastava
Yugam Tiwari
74
0
0
16 Nov 2020
NegatER: Unsupervised Discovery of Negatives in Commonsense Knowledge Bases
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Tara Safavi
Jing Zhu
Danai Koutra
219
13
0
15 Nov 2020
Arabic Dialect Identification Using BERT-Based Domain Adaptation
Workshop on Arabic Natural Language Processing (WANLP), 2020
A. Beltagy
W. Abdelrahman
Omar ElSherief
125
8
0
13 Nov 2020
Overview of the Ninth Dialog System Technology Challenge: DSTC9
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Chulaka Gunasekara
Seokhwan Kim
L. F. D’Haro
Abhinav Rastogi
Yun-Nung Chen
...
A. Geramifard
Satwik Kottur
Seungwhan Moon
Shivani Poddar
R. Subba
271
72
0
12 Nov 2020
EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering
Momchil Hardalov
Todor Mihaylov
Dimitrina Zlatkova
Yoan Dinkov
Ivan Koychev
Preslav Nakov
AI4Ed, ELM
570
70
0
05 Nov 2020
CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search
Chenyan Xiong
Zhenghao Liu
Si Sun
Zhuyun Dai
Kaitao Zhang
S. Yu
Zhiyuan Liu
Hoifung Poon
Jianfeng Gao
Paul N. Bennett
115
13
0
03 Nov 2020
Improving Dialogue Breakdown Detection with Semi-Supervised Learning
Nathan Ng
Marzyeh Ghassemi
Narendran Thangarajan
Jiacheng Pan
Qi Guo
156
9
0
30 Oct 2020
Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports
A. Edwards
David Rogers
Jose Camacho-Collados
Hélène de Ribaupierre
Alun D. Preece
222
1
0
27 Oct 2020
WNUT-2020 Task 1 Overview: Extracting Entities and Relations from Wet Lab Protocols
Jeniya Tabassum
Sydney Lee
Wei Xu
Alan Ritter
228
18
0
27 Oct 2020
Unmasking Contextual Stereotypes: Measuring and Mitigating BERT's Gender Bias
Marion Bartl
Malvina Nissim
Albert Gatt
239
148
0
27 Oct 2020
To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Kasturi Bhattacharjee
Miguel Ballesteros
Rishita Anubhai
Smaranda Muresan
Jie Ma
Faisal Ladhak
Yaser Al-Onaizan
SSL
165
20
0
27 Oct 2020
Unsupervised Paraphrasing with Pretrained Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Tong Niu
Semih Yavuz
Yingbo Zhou
N. Keskar
Huan Wang
Caiming Xiong
LRM, SyDa
125
33
0
24 Oct 2020
Rethinking embedding coupling in pre-trained language models
International Conference on Learning Representations (ICLR), 2020
Hyung Won Chung
Thibault Févry
Henry Tsai
Melvin Johnson
Sebastian Ruder
303
167
0
24 Oct 2020