Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2004.10964
Cited By
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 352 papers shown
Title
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
90
34
0
01 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
18
7
0
31 Dec 2022
Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents
Sayar Ghosh Roy
Anshul Padhi
Risubh Jain
Manish Gupta
Vasudeva Varma
AI4TS
20
2
0
31 Dec 2022
Continual Contrastive Finetuning Improves Low-Resource Relation Extraction
Wenxuan Zhou
Sheng Zhang
Tristan Naumann
Muhao Chen
Hoifung Poon
43
6
0
21 Dec 2022
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su
Weijia Shi
Jungo Kasai
Yizhong Wang
Yushi Hu
Mari Ostendorf
Wen-tau Yih
Noah A. Smith
Luke Zettlemoyer
Tao Yu
25
278
0
19 Dec 2022
DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
Yuxi Feng
Xiaoyuan Yi
Xiting Wang
L. Lakshmanan
Xing Xie
DiffM
27
5
0
16 Dec 2022
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
Zhongwei Wan
Yichun Yin
Wei Zhang
Jiaxin Shi
Lifeng Shang
Guangyong Chen
Xin Jiang
Qun Liu
VLM
CLL
26
16
0
07 Dec 2022
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Hamish Ivison
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
31
19
0
01 Dec 2022
ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format
Qi Zhu
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Baolin Peng
...
Dazhen Wan
Xiaochen Zhu
Jianfeng Gao
Milica Gavsić
Minlie Huang
43
23
0
30 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
27
2
0
27 Nov 2022
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai
Sarvnaz Karimi
8
3
0
24 Nov 2022
Using Selective Masking as a Bridge between Pre-training and Fine-tuning
Tanish Lad
Himanshu Maheshwari
Shreyas Kottukkal
R. Mamidi
19
3
0
24 Nov 2022
Continual Learning of Natural Language Processing Tasks: A Survey
Zixuan Ke
Bin Liu
KELM
CLL
VLM
19
68
0
23 Nov 2022
Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps
Hiroki Iida
Naoaki Okazaki
34
4
0
08 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
Chan-Jan Hsu
Ho-Lam Chung
Hung-yi Lee
Yu Tsao
19
6
0
01 Nov 2022
WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain
Raj Sanjay Shah
Kunal Chawla
Dheeraj Eidnani
Agam Shah
Wendi Du
S. Chava
Natraj Raman
Charese Smiley
Jiaao Chen
Diyi Yang
AIFin
24
103
0
31 Oct 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
30
79
0
31 Oct 2022
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
Yue Yu
Chenyan Xiong
Si Sun
Chao Zhang
Arnold Overwijk
VLM
OOD
35
22
0
27 Oct 2022
Learning on Large-scale Text-attributed Graphs via Variational Inference
Jianan Zhao
Meng Qu
Chaozhuo Li
Hao Yan
Qian Liu
Rui Li
Xing Xie
Jian Tang
VLM
17
131
0
26 Oct 2022
Predicting Long-Term Citations from Short-Term Linguistic Influence
Sandeep Soni
David Bamman
Jacob Eisenstein
15
2
0
24 Oct 2022
NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation
Phillip Howard
Gadi Singer
Vasudev Lal
Yejin Choi
Swabha Swayamdipta
CML
50
25
0
22 Oct 2022
Table-To-Text generation and pre-training with TabT5
Ewa Andrejczuk
Julian Martin Eisenschlos
Francesco Piccinno
Syrine Krichene
Yasemin Altun
LMTD
15
30
0
17 Oct 2022
Improving generalizability of distilled self-supervised speech processing models under distorted settings
Kuan-Po Huang
Yu-Kuan Fu
Tsung-Yuan Hsu
Fabian Ritter Gutierrez
Fan Wang
Liang-Hsuan Tseng
Yu Zhang
Hung-yi Lee
24
13
0
14 Oct 2022
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge
Kosuke Nishida
Naoki Yoshinaga
Kyosuke Nishida
22
2
0
14 Oct 2022
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain
Amir Hadifar
Semere Kiros Bitew
Johannes Deleu
Chris Develder
Thomas Demeester
AI4Ed
21
18
0
12 Oct 2022
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks
Charith Peris
Lizhen Tan
Thomas Gueudré
Turan Gojayev
Vivi Wei
Gokmen Oz
22
4
0
10 Oct 2022
Leveraging Key Information Modeling to Improve Less-Data Constrained News Headline Generation via Duality Fine-Tuning
Zhuoxuan Jiang
Lingfeng Qiao
Di Yin
Shanshan Feng
Bo Ren
SyDa
28
2
0
10 Oct 2022
On Task-Adaptive Pretraining for Dialogue Response Selection
Tzu-Hsiang Lin
Ta-Chung Chi
Anna Rumshisky
11
1
0
08 Oct 2022
Short Text Pre-training with Extended Token Classification for E-commerce Query Understanding
Haoming Jiang
Tianyu Cao
Zheng Li
Cheng-hsin Luo
Xianfeng Tang
Qingyu Yin
Danqing Zhang
R. Goutam
Bing Yin
RALM
16
11
0
08 Oct 2022
Calibrating Factual Knowledge in Pretrained Language Models
Qingxiu Dong
Damai Dai
Yifan Song
Jingjing Xu
Zhifang Sui
Lei Li
KELM
228
82
0
07 Oct 2022
SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis
Jiaxin Pei
Vítor Silva
Maarten W. Bos
Yozon Liu
Leonardo Neves
David Jurgens
Francesco Barbieri
53
28
0
03 Oct 2022
DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing
Yanjun Gao
Dmitriy Dligach
Timothy A. Miller
John R. Caskey
Brihat Sharma
M. Churpek
Majid Afshar
ELM
LRM
32
16
0
29 Sep 2022
Generalizing through Forgetting -- Domain Generalization for Symptom Event Extraction in Clinical Notes
Sitong Zhou
K. Lybarger
Meliha Yetisgen-Yildiz
Mari Ostendorf
32
2
0
20 Sep 2022
External Knowledge Selection with Weighted Negative Sampling in Knowledge-grounded Task-oriented Dialogue Systems
Janghoon Han
Joongbo Shin
Hosung Song
Hyunjik Jo
Gyeonghun Kim
Yireun Kim
Stanley Jungkyu Choi
10
4
0
06 Sep 2022
Review of Natural Language Processing in Pharmacology
D. Trajanov
Vangel Trajkovski
Makedonka Dimitrieva
Jovana Dobreva
Milos Jovanovik
Matej Klemen
Alevs vZagar
Marko Robnik-vSikonja
LM&MA
21
7
0
22 Aug 2022
Summarizing Patients Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models
Yanjun Gao
Dmitriy Dligach
T. Miller
Dongfang Xu
M. Churpek
Majid Afshar
AI4MH
22
36
0
17 Aug 2022
Visual Comparison of Language Model Adaptation
R. Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
33
16
0
17 Aug 2022
Abstractive Meeting Summarization: A Survey
Virgile Rennard
Guokan Shang
Julie Hunter
Michalis Vazirgiannis
32
15
0
08 Aug 2022
On the Limitations of Sociodemographic Adaptation with Transformers
Chia-Chien Hung
Anne Lauscher
Dirk Hovy
Simone Paolo Ponzetto
Goran Glavavs
19
0
0
01 Aug 2022
Few-shot Adaptation Works with UnpredicTable Data
Jun Shern Chan
Michael Pieler
Jonathan Jao
Jérémy Scheurer
Ethan Perez
19
5
0
01 Aug 2022
Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation
Alexander H. Liu
Samuel J. Yang
24
5
0
30 Jul 2022
ELF22: A Context-based Counter Trolling Dataset to Combat Internet Trolls
Huije Lee
Young Ju Na
Hoyun Song
Jisu Shin
Jong C. Park
18
7
0
30 Jul 2022
"Do you follow me?": A Survey of Recent Approaches in Dialogue State Tracking
Léo Jacqmin
L. Rojas-Barahona
Benoit Favre
34
27
0
29 Jul 2022
Innovations in Neural Data-to-text Generation: A Survey
Mandar Sharma
Ajay K. Gogineni
Naren Ramakrishnan
24
10
0
25 Jul 2022
PLM-ICD: Automatic ICD Coding with Pretrained Language Models
Chao-Wei Huang
Shang-Chi Tsai
Yun-Nung Chen
26
49
0
12 Jul 2022
Domain Confused Contrastive Learning for Unsupervised Domain Adaptation
Quanyu Long
Tianze Luo
Wenya Wang
Sinno Jialin Pan
49
8
0
10 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training
Mitchell DeHaven
J. Billa
VLM
AI4TS
15
8
0
01 Jul 2022
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding
Wayne Xin Zhao
Kun Zhou
Zheng Gong
Beichen Zhang
Yuanhang Zhou
Jing Sha
Zhigang Chen
Shijin Wang
Cong Liu
Ji-Rong Wen
34
18
0
13 Jun 2022
Sort by Structure: Language Model Ranking as Dependency Probing
Max Müller-Eberstein
Rob van der Goot
Barbara Plank
30
3
0
10 Jun 2022
Task-Adaptive Pre-Training for Boosting Learning With Noisy Labels: A Study on Text Classification for African Languages
D. Zhu
Michael A. Hedderich
Fangzhou Zhai
David Ifeoluwa Adelani
Dietrich Klakow
NoLa
32
0
0
03 Jun 2022
Previous
1
2
3
4
5
6
7
8
Next