ResearchTrend.AI
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
    VLM
    AI4CE
    CLL

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 383 papers shown
BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records
Weimin Lyu
Zexin Bi
Fusheng Wang
Chao Chen
42
5
0
06 Jul 2024
Using LLMs to label medical papers according to the CIViC evidence model
Markus Hisch
Xing David Wang
31
0
0
05 Jul 2024
CHEW: A Dataset of CHanging Events in Wikipedia
Hsuvas Borkakoty
Luis Espinosa-Anke
35
1
0
27 Jun 2024
MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning
Zhenlong Dai
Chang Yao
WenKang Han
Ying Yuan
Zhipeng Gao
Jingyuan Chen
24
10
0
25 Jun 2024
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA
AI4MH
ELM
42
4
0
14 Jun 2024
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
Yuchen Zhuang
Haotian Sun
Yue Yu
Rushi Qiang
Qifan Wang
Chao Zhang
Bo Dai
AAML
43
14
0
05 Jun 2024
Entangled Relations: Leveraging NLI and Meta-analysis to Enhance Biomedical Relation Extraction
William Hogan
Jingbo Shang
13
0
0
31 May 2024
ESG-FTSE: A corpus of news articles with ESG relevance labels and use cases
Mariya Pavlova
Bernard Casey
Miaosen Wang
17
0
0
30 May 2024
Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee
Sue Hyun Park
Seungone Kim
Minjoon Seo
ALM
34
36
0
28 May 2024
Scaling Laws for Discriminative Classification in Large Language Models
Dean Wyatte
Fatemeh Tahmasbi
Ming Li
Thomas Markovich
35
2
0
24 May 2024
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
Ran Xu
Wenqi Shi
Yue Yu
Yuchen Zhuang
Yanqiao Zhu
M. D. Wang
Joyce C. Ho
Chao Zhang
Carl Yang
LM&MA
40
19
0
29 Apr 2024
Effective Unsupervised Constrained Text Generation based on Perturbed Masking
Yingwen Fu
Wenjie Ou
Zhou Yu
Yue Lin
23
1
0
24 Apr 2024
No Train but Gain: Language Arithmetic for training-free Language Adapters enhancement
Mateusz Klimaszewski
Piotr Andruszkiewicz
Alexandra Birch
MoMe
42
4
0
24 Apr 2024
Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions
Taojun Hu
Xiao-Hua Zhou
ELM
33
12
0
14 Apr 2024
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding
Ahmad Idrissi-Yaghir
Amin Dada
Henning Schafer
Kamyar Arzideh
Giulia Baldini
...
Peter A. Horn
Christin Seifert
F. Nensa
Jens Kleesiek
Christoph M. Friedrich
AI4MH
29
2
0
08 Apr 2024
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
Teo Susnjak
Peter Hwang
N. Reyes
A. Barczak
Timothy R. McIntosh
Surangika Ranathunga
68
22
0
08 Apr 2024
Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector
Andi Zhang
Tim Z. Xiao
Weiyang Liu
Robert Bamler
Damon J. Wischik
OODD
44
4
0
07 Apr 2024
From Robustness to Improved Generalization and Calibration in Pre-trained Language Models
Josip Jukić
Jan Snajder
29
0
0
31 Mar 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
37
62
0
25 Mar 2024
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models
Luiza Amador Pozzobon
Patrick Lewis
Sara Hooker
B. Ermiş
36
7
0
06 Mar 2024
SaulLM-7B: A pioneering Large Language Model for Law
Pierre Colombo
T. Pires
Malik Boudiaf
Dominic Culver
Rui Melo
...
Andre F. T. Martins
Fabrizio Esposito
Vera Lúcia Raposo
Sofia Morgado
Michael Desa
ELM
AILaw
39
63
0
06 Mar 2024
Investigating Continual Pretraining in Large Language Models: Insights and Implications
Çağatay Yıldız
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
B. Ermiş
CLL
KELM
LRM
48
25
0
27 Feb 2024
How Important is Domain Specificity in Language Models and Instruction Finetuning for Biomedical Relation Extraction?
Aviv Brokman
Ramakanth Kavuluru
LM&MA
ALM
34
3
0
21 Feb 2024
MORE-3S:Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces
Tianyu Zheng
Ge Zhang
Xingwei Qu
Ming Kuang
Stephen W. Huang
Zhaofeng He
OffRL
45
1
0
20 Feb 2024
Deep Learning-based Computational Job Market Analysis: A Survey on Skill Extraction and Classification from Job Postings
Elena Senger
Mike Zhang
Rob van der Goot
Barbara Plank
26
7
0
08 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
21
155
0
06 Feb 2024
How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
Rheeya Uppaal
Yixuan Li
Junjie Hu
34
4
0
31 Jan 2024
Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences
Hongyi Liu
Qingyun Wang
Payam Karisani
Heng Ji
16
1
0
19 Jan 2024
Some things are more CRINGE than others: Iterative Preference Optimization with the Pairwise Cringe Loss
Jing Xu
Andrew Lee
Sainbayar Sukhbaatar
Jason Weston
15
86
0
27 Dec 2023
Balancing the Style-Content Trade-Off in Sentiment Transfer Using Polarity-Aware Denoising
Sourabrata Mukherjee
Zdeněk Kasner
Ondrej Dusek
DiffM
11
11
0
22 Dec 2023
Time is Encoded in the Weights of Finetuned Language Models
Kai Nylund
Suchin Gururangan
Noah A. Smith
AI4TS
23
17
0
20 Dec 2023
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
Yanlong Li
Chamara Madarasingha
Kanchana Thilakarathna
16
1
0
06 Dec 2023
Leveraging Domain Adaptation and Data Augmentation to Improve Qur'anic IR in English and Arabic
Vera Pavlova
21
2
0
05 Dec 2023
LowResource at BLP-2023 Task 2: Leveraging BanglaBert for Low Resource Sentiment Analysis of Bangla Language
Aunabil Chakma
Masum Hasan
37
3
0
21 Nov 2023
On the Potential and Limitations of Few-Shot In-Context Learning to Generate Metamorphic Specifications for Tax Preparation Software
Dananjay Srinivas
Rohan Das
Saeid Tizpaz-Niari
Ashutosh Trivedi
Maria Leonor Pacheco
18
4
0
20 Nov 2023
Generative AI for Hate Speech Detection: Evaluation and Findings
Sagi Pendzel
Tomer Wullach
Amir Adler
Einat Minkov
25
11
0
16 Nov 2023
Online Continual Knowledge Learning for Language Models
Yuhao Wu
Tongjun Shi
Karthick Sharma
Chun Seah
Shuhao Zhang
CLL
KELM
23
4
0
16 Nov 2023
Controlled Text Generation for Black-box Language Models via Score-based Progressive Editor
Sangwon Yu
Changmin Lee
Hojin Lee
Sungroh Yoon
22
0
0
13 Nov 2023
AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification
Yongxin Huang
Kexin Wang
Sourav Dutta
Raj Nath Patel
Goran Glavas
Iryna Gurevych
VLM
13
4
0
01 Nov 2023
Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Laura Cabello
Emanuele Bugliarello
Stephanie Brandl
Desmond Elliott
23
7
0
26 Oct 2023
CLIFT: Analysing Natural Distribution Shift on Question Answering Models in Clinical Domain
Ankit Pal
19
2
0
19 Oct 2023
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models
Luiza Amador Pozzobon
B. Ermiş
Patrick Lewis
Sara Hooker
28
20
0
11 Oct 2023
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Anastasios Kyrillidis
Robert Sim
74
6
0
04 Oct 2023
Controllable Text Generation with Residual Memory Transformer
Hanqing Zhang
Sun Si
Haiming Wu
Dawei Song
29
1
0
28 Sep 2023
TouchUp-G: Improving Feature Representation through Graph-Centric Finetuning
Jing Zhu
Xiang Song
V. Ioannidis
Danai Koutra
Christos Faloutsos
54
13
0
25 Sep 2023
Refashioning Emotion Recognition Modelling: The Advent of Generalised Large Models
Zixing Zhang
Liyizhe Peng
Tao Pang
Jing Han
Huan Zhao
Bjorn W. Schuller
32
13
0
21 Aug 2023
SPM: Structured Pretraining and Matching Architectures for Relevance Modeling in Meituan Search
Wen-xin Zan
Yaopeng Han
Xiaotian Jiang
Yao Xiao
Yang Yang
Dayao Chen
Sheng Chen
19
3
0
15 Aug 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta
Benjamin Thérien
Adam Ibrahim
Mats L. Richter
Quentin G. Anthony
Eugene Belilovsky
Irina Rish
Timothée Lesort
KELM
24
99
0
08 Aug 2023
DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation
Menglong Lu
Zhen Huang
Yunxiang Zhao
Zhiliang Tian
Yang Liu
Dongsheng Li
20
6
0
05 Aug 2023
Do not Mask Randomly: Effective Domain-adaptive Pre-training by Masking In-domain Keywords
Shahriar Golchin
Mihai Surdeanu
N. Tavabi
A. Kiapour
13
4
0
14 Jul 2023