ResearchTrend.AI
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
    VLM
    AI4CE
    CLL

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 352 papers shown
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
90
34
0
01 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
18
7
0
31 Dec 2022
Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents
Sayar Ghosh Roy
Anshul Padhi
Risubh Jain
Manish Gupta
Vasudeva Varma
AI4TS
20
2
0
31 Dec 2022
Continual Contrastive Finetuning Improves Low-Resource Relation Extraction
Wenxuan Zhou
Sheng Zhang
Tristan Naumann
Muhao Chen
Hoifung Poon
43
6
0
21 Dec 2022
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su
Weijia Shi
Jungo Kasai
Yizhong Wang
Yushi Hu
Mari Ostendorf
Wen-tau Yih
Noah A. Smith
Luke Zettlemoyer
Tao Yu
25
278
0
19 Dec 2022
DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
Yuxi Feng
Xiaoyuan Yi
Xiting Wang
L. Lakshmanan
Xing Xie
DiffM
27
5
0
16 Dec 2022
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
Zhongwei Wan
Yichun Yin
Wei Zhang
Jiaxin Shi
Lifeng Shang
Guangyong Chen
Xin Jiang
Qun Liu
VLM
CLL
26
16
0
07 Dec 2022
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Hamish Ivison
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
31
19
0
01 Dec 2022
ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format
Qi Zhu
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Baolin Peng
...
Dazhen Wan
Xiaochen Zhu
Jianfeng Gao
Milica Gašić
Minlie Huang
43
23
0
30 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
27
2
0
27 Nov 2022
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai
Sarvnaz Karimi
8
3
0
24 Nov 2022
Using Selective Masking as a Bridge between Pre-training and Fine-tuning
Tanish Lad
Himanshu Maheshwari
Shreyas Kottukkal
R. Mamidi
19
3
0
24 Nov 2022
Continual Learning of Natural Language Processing Tasks: A Survey
Zixuan Ke
Bin Liu
KELM
CLL
VLM
19
68
0
23 Nov 2022
Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps
Hiroki Iida
Naoaki Okazaki
34
4
0
08 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
Chan-Jan Hsu
Ho-Lam Chung
Hung-yi Lee
Yu Tsao
19
6
0
01 Nov 2022
WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain
Raj Sanjay Shah
Kunal Chawla
Dheeraj Eidnani
Agam Shah
Wendi Du
S. Chava
Natraj Raman
Charese Smiley
Jiaao Chen
Diyi Yang
AIFin
24
103
0
31 Oct 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
30
79
0
31 Oct 2022
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
Yue Yu
Chenyan Xiong
Si Sun
Chao Zhang
Arnold Overwijk
VLM
OOD
35
22
0
27 Oct 2022
Learning on Large-scale Text-attributed Graphs via Variational Inference
Jianan Zhao
Meng Qu
Chaozhuo Li
Hao Yan
Qian Liu
Rui Li
Xing Xie
Jian Tang
VLM
17
131
0
26 Oct 2022
Predicting Long-Term Citations from Short-Term Linguistic Influence
Sandeep Soni
David Bamman
Jacob Eisenstein
15
2
0
24 Oct 2022
NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation
Phillip Howard
Gadi Singer
Vasudev Lal
Yejin Choi
Swabha Swayamdipta
CML
50
25
0
22 Oct 2022
Table-To-Text generation and pre-training with TabT5
Ewa Andrejczuk
Julian Martin Eisenschlos
Francesco Piccinno
Syrine Krichene
Yasemin Altun
LMTD
15
30
0
17 Oct 2022
Improving generalizability of distilled self-supervised speech processing models under distorted settings
Kuan-Po Huang
Yu-Kuan Fu
Tsung-Yuan Hsu
Fabian Ritter Gutierrez
Fan Wang
Liang-Hsuan Tseng
Yu Zhang
Hung-yi Lee
24
13
0
14 Oct 2022
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge
Kosuke Nishida
Naoki Yoshinaga
Kyosuke Nishida
22
2
0
14 Oct 2022
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain
Amir Hadifar
Semere Kiros Bitew
Johannes Deleu
Chris Develder
Thomas Demeester
AI4Ed
21
18
0
12 Oct 2022
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks
Charith Peris
Lizhen Tan
Thomas Gueudré
Turan Gojayev
Vivi Wei
Gokmen Oz
22
4
0
10 Oct 2022
Leveraging Key Information Modeling to Improve Less-Data Constrained News Headline Generation via Duality Fine-Tuning
Zhuoxuan Jiang
Lingfeng Qiao
Di Yin
Shanshan Feng
Bo Ren
SyDa
28
2
0
10 Oct 2022
On Task-Adaptive Pretraining for Dialogue Response Selection
Tzu-Hsiang Lin
Ta-Chung Chi
Anna Rumshisky
11
1
0
08 Oct 2022
Short Text Pre-training with Extended Token Classification for E-commerce Query Understanding
Haoming Jiang
Tianyu Cao
Zheng Li
Cheng-hsin Luo
Xianfeng Tang
Qingyu Yin
Danqing Zhang
R. Goutam
Bing Yin
RALM
16
11
0
08 Oct 2022
Calibrating Factual Knowledge in Pretrained Language Models
Qingxiu Dong
Damai Dai
Yifan Song
Jingjing Xu
Zhifang Sui
Lei Li
KELM
228
82
0
07 Oct 2022
SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis
Jiaxin Pei
Vítor Silva
Maarten W. Bos
Yozon Liu
Leonardo Neves
David Jurgens
Francesco Barbieri
53
28
0
03 Oct 2022
DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing
Yanjun Gao
Dmitriy Dligach
Timothy A. Miller
John R. Caskey
Brihat Sharma
M. Churpek
Majid Afshar
ELM
LRM
32
16
0
29 Sep 2022
Generalizing through Forgetting -- Domain Generalization for Symptom Event Extraction in Clinical Notes
Sitong Zhou
K. Lybarger
Meliha Yetisgen-Yildiz
Mari Ostendorf
32
2
0
20 Sep 2022
External Knowledge Selection with Weighted Negative Sampling in Knowledge-grounded Task-oriented Dialogue Systems
Janghoon Han
Joongbo Shin
Hosung Song
Hyunjik Jo
Gyeonghun Kim
Yireun Kim
Stanley Jungkyu Choi
10
4
0
06 Sep 2022
Review of Natural Language Processing in Pharmacology
D. Trajanov
Vangel Trajkovski
Makedonka Dimitrieva
Jovana Dobreva
Milos Jovanovik
Matej Klemen
Aleš Žagar
Marko Robnik-Šikonja
LM&MA
21
7
0
22 Aug 2022
Summarizing Patients Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models
Yanjun Gao
Dmitriy Dligach
T. Miller
Dongfang Xu
M. Churpek
Majid Afshar
AI4MH
22
36
0
17 Aug 2022
Visual Comparison of Language Model Adaptation
R. Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
33
16
0
17 Aug 2022
Abstractive Meeting Summarization: A Survey
Virgile Rennard
Guokan Shang
Julie Hunter
Michalis Vazirgiannis
32
15
0
08 Aug 2022
On the Limitations of Sociodemographic Adaptation with Transformers
Chia-Chien Hung
Anne Lauscher
Dirk Hovy
Simone Paolo Ponzetto
Goran Glavaš
19
0
0
01 Aug 2022
Few-shot Adaptation Works with UnpredicTable Data
Jun Shern Chan
Michael Pieler
Jonathan Jao
Jérémy Scheurer
Ethan Perez
19
5
0
01 Aug 2022
Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation
Alexander H. Liu
Samuel J. Yang
24
5
0
30 Jul 2022
ELF22: A Context-based Counter Trolling Dataset to Combat Internet Trolls
Huije Lee
Young Ju Na
Hoyun Song
Jisu Shin
Jong C. Park
18
7
0
30 Jul 2022
"Do you follow me?": A Survey of Recent Approaches in Dialogue State Tracking
Léo Jacqmin
L. Rojas-Barahona
Benoit Favre
34
27
0
29 Jul 2022
Innovations in Neural Data-to-text Generation: A Survey
Mandar Sharma
Ajay K. Gogineni
Naren Ramakrishnan
24
10
0
25 Jul 2022
PLM-ICD: Automatic ICD Coding with Pretrained Language Models
Chao-Wei Huang
Shang-Chi Tsai
Yun-Nung Chen
26
49
0
12 Jul 2022
Domain Confused Contrastive Learning for Unsupervised Domain Adaptation
Quanyu Long
Tianze Luo
Wenya Wang
Sinno Jialin Pan
49
8
0
10 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training
Mitchell DeHaven
J. Billa
VLM
AI4TS
15
8
0
01 Jul 2022
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding
Wayne Xin Zhao
Kun Zhou
Zheng Gong
Beichen Zhang
Yuanhang Zhou
Jing Sha
Zhigang Chen
Shijin Wang
Cong Liu
Ji-Rong Wen
34
18
0
13 Jun 2022
Sort by Structure: Language Model Ranking as Dependency Probing
Max Müller-Eberstein
Rob van der Goot
Barbara Plank
30
3
0
10 Jun 2022
Task-Adaptive Pre-Training for Boosting Learning With Noisy Labels: A Study on Text Classification for African Languages
D. Zhu
Michael A. Hedderich
Fangzhou Zhai
David Ifeoluwa Adelani
Dietrich Klakow
NoLa
32
0
0
03 Jun 2022