Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
arXiv:2002.06305
15 February 2020
Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah A. Smith

Papers citing "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping"

50 of 94 citing papers shown.

  1. GRASP: Municipal Budget AI Chatbots for Enhancing Civic Engagement. Jerry Xu, Justin Wang, Joley Leung, Jasmine Gu. 30 Mar 2025.
  2. Distributional Scaling Laws for Emergent Capabilities. Rosie Zhao, Tian Qin, David Alvarez-Melis, Sham Kakade, Naomi Saphra. 24 Feb 2025. [LRM]
  3. Fine-Tuning Games: Bargaining and Adaptation for General-Purpose Models. Benjamin Laufer, Jon M. Kleinberg, Hoda Heidari. 03 Jan 2025.
  4. Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models. Yulei Qin, Yuncheng Yang, Pengcheng Guo, Gang Li, Hang Shao, Yuchen Shi, Zihan Xu, Yun Gu, Ke Li, Xing Sun. 31 Dec 2024. [ALM]
  5. IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks. Yaming Zhang, Chenqiang Gao, Fangcen Liu, Junjie Guo, Lan Wang, Xinggan Peng, Deyu Meng. 21 Dec 2024.
  6. Unified Parameter-Efficient Unlearning for LLMs. Chenlu Ding, Jiancan Wu, Yancheng Yuan, Jinda Lu, Kai Zhang, Alex Su, Xiang Wang, Xiangnan He. 30 Nov 2024. [MU, KELM]
  7. A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models. Qiaoyu Tang, Le Yu, Bowen Yu, Hongyu Lin, K. Lu, Y. Lu, Xianpei Han, Le Sun. 17 Oct 2024. [MoMe]
  8. DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models. Wenlong Deng, Yize Zhao, V. Vakilian, Minghui Chen, Xiaoxiao Li, Christos Thrampoulidis. 12 Oct 2024.
  9. An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers. Ashim Gupta, Sina Mahdipour Saravani, P. Sadayappan, Vivek Srikumar. 17 Jun 2024.
  10. Uncertainty modeling for fine-tuned implicit functions. A. Susmelj, Mael Macuglia, Nataša Tagasovska, Reto Sutter, Sebastiano Caprara, Jean-Philippe Thiran, E. Konukoglu. 17 Jun 2024.
  11. LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models. Dasun Athukoralage, Thushari Atapattu, M. Thilakaratne, Katrina Falkner. 11 Jun 2024. [LM&MA]
  12. Large Language Models for Cyber Security: A Systematic Literature Review. HanXiang Xu, Shenao Wang, Ningke Li, K. Wang, Yanjie Zhao, Kai Chen, Ting Yu, Yang Janet Liu, H. Wang. 08 May 2024.
  13. From Large Language Models and Optimization to Decision Optimization CoPilot: A Research Manifesto. S. Wasserkrug, Léonard Boussioux, D. Hertog, F. Mirzazadeh, Ilker Birbil, Jannis Kurtz, Donato Maragno. 26 Feb 2024. [LLMAG]
  14. PL-FSCIL: Harnessing the Power of Prompts for Few-Shot Class-Incremental Learning. Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Li Li, X. Ning. 26 Jan 2024. [CLL, VLM]
  15. Canvil: Designerly Adaptation for LLM-Powered User Experiences. K. J. Kevin Feng, Q. V. Liao, Ziang Xiao, Jennifer Wortman Vaughan, Amy X. Zhang, David W. McDonald. 17 Jan 2024.
  16. Making LLMs Worth Every Penny: Resource-Limited Text Classification in Banking. Lefteris Loukas, Ilias Stogiannidis, Odysseas Diamantopoulos, Prodromos Malakasiotis, Stavros Vassos. 10 Nov 2023.
  17. Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models. Laura Cabello, Emanuele Bugliarello, Stephanie Brandl, Desmond Elliott. 26 Oct 2023.
  18. Holy Grail 2.0: From Natural Language to Constraint Models. Dimosthenis C. Tsouros, Hélène Verhaeghe, Serdar Kadiouglu, Tias Guns. 03 Aug 2023.
  19. Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning. Xindi Wang, Yufei Wang, Can Xu, Xiubo Geng, Bowen Zhang, Chongyang Tao, Frank Rudzicz, Robert E. Mercer, Daxin Jiang. 28 Jul 2023.
  20. Parameter-efficient is not sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions. Dongshuo Yin, Xueting Han, Bin Li, Hao Feng, Jinghua Bai. 16 Jun 2023. [VPVLM]
  21. Prompt to be Consistent is Better than Self-Consistent? Few-Shot and Zero-Shot Fact Verification with Pre-trained Language Models. Fengzhu Zeng, Wei Gao. 05 Jun 2023.
  22. TaskWeb: Selecting Better Source Tasks for Multi-task NLP. Joongwon Kim, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi. 22 May 2023.
  23. Measuring and Mitigating Local Instability in Deep Neural Networks. Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie, Anoop Kumar, Aram Galstyan. 18 May 2023.
  24. Exploring Data Augmentation Methods on Social Media Corpora. Isabel Garcia Pietri, Kineret Stanley. 03 Mar 2023.
  25. Measuring the Instability of Fine-Tuning. Yupei Du, D. Nguyen. 15 Feb 2023.
  26. A Stability Analysis of Fine-Tuning a Pre-Trained Model. Z. Fu, Anthony Man-Cho So, Nigel Collier. 24 Jan 2023.
  27. Text classification in shipping industry using unsupervised models and Transformer based supervised models. Yingyi Xie, Dongping Song. 21 Dec 2022.
  28. A Natural Bias for Language Generation Models. Clara Meister, Wojciech Stokowiec, Tiago Pimentel, Lei Yu, Laura Rimell, A. Kuncoro. 19 Dec 2022. [MILM]
  29. Editing Models with Task Arithmetic. Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi. 08 Dec 2022. [KELM, MoMe, MU]
  30. Task-Specific Embeddings for Ante-Hoc Explainable Text Classification. Kishaloy Halder, Josip Krapac, A. Akbik, Anthony Brew, Matti Lyra. 30 Nov 2022.
  31. We need to talk about random seeds. Steven Bethard. 24 Oct 2022.
  32. Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU. Fenia Christopoulou, Gerasimos Lampouras, Ignacio Iacobacci. 22 Oct 2022.
  33. lo-fi: distributed fine-tuning without communication. Mitchell Wortsman, Suchin Gururangan, Shen Li, Ali Farhadi, Ludwig Schmidt, Michael G. Rabbat, Ari S. Morcos. 19 Oct 2022.
  34. Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping. Chenghao Yang, Xuezhe Ma. 19 Oct 2022.
  35. Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling. Haw-Shiuan Chang, Ruei-Yao Sun, Kathryn Ricci, Andrew McCallum. 10 Oct 2022.
  36. Efficient Few-Shot Learning Without Prompts. Lewis Tunstall, Nils Reimers, Unso Eun Seo Jo, Luke Bates, Daniel Korat, Moshe Wasserblat, Oren Pereg. 22 Sep 2022. [VLM]
  37. Deep Reinforcement Learning for Cryptocurrency Trading: Practical Approach to Address Backtest Overfitting. Berend Gort, Xiao-Yang Liu, Xinghang Sun, Jiechao Gao, Shuai Chen, Chris Wang. 12 Sep 2022.
  38. Efficient Methods for Natural Language Processing: A Survey. Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz. 31 Aug 2022.
  39. Combating high variance in Data-Scarce Implicit Hate Speech Classification. Debaditya Pal, Kaustubh Chaudhari, Harsh Sharma. 29 Aug 2022.
  40. Mere Contrastive Learning for Cross-Domain Sentiment Analysis. Yun Luo, Fang Guo, Zihan Liu, Yue Zhang. 18 Aug 2022.
  41. Eco2AI: carbon emissions tracking of machine learning models as the first step towards sustainable AI. S. Budennyy, V. Lazarev, N. Zakharenko, A. Korovin, Olga Plosskaya, ..., Ivan V. Oseledets, I. Barsola, Ilya M. Egorov, A. Kosterina, L. Zhukov. 31 Jul 2022.
  42. Zero-shot Cross-lingual Transfer is Under-specified Optimization. Shijie Wu, Benjamin Van Durme, Mark Dredze. 12 Jul 2022.
  43. Explanation-based Counterfactual Retraining (XCR): A Calibration Method for Black-box Models. Liu Zhendong, Wenyu Jiang, Yan Zhang, Chongjun Wang. 22 Jun 2022. [CML]
  44. ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers. Z. Yao, Reza Yazdani Aminabadi, Minjia Zhang, Xiaoxia Wu, Conglong Li, Yuxiong He. 04 Jun 2022. [VLM, MQ]
  45. Can Foundation Models Help Us Achieve Perfect Secrecy? Simran Arora, Christopher Ré. 27 May 2022. [FedML]
  46. ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts. Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi. 24 May 2022.
  47. Few-Shot Natural Language Inference Generation with PDD: Prompt and Dynamic Demonstration. Kaijian Li, Shansan Gong, Kenny Q. Zhu. 21 May 2022.
  48. When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning. Orion Weller, Kevin Seppi, Matt Gardner. 17 May 2022.
  49. How to Fine-tune Models with Few Samples: Update, Data Augmentation, and Test-time Augmentation. Yujin Kim, Jaehoon Oh, Sungnyun Kim, Se-Young Yun. 13 May 2022.
  50. A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis. Sandra Wankmüller. 03 May 2022.