ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

8 June 2020 · arXiv:2006.04884
Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow

Papers citing "On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines"

37 / 87 papers shown

Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining?
Subhabrata Dutta, Jeevesh Juneja, Dipankar Das, Tanmoy Chakraborty
24 Mar 2022

Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Guanzheng Chen, Fangyu Liu, Zaiqiao Meng, Shangsong Liang
16 Feb 2022

Transformer-based Models of Text Normalization for Speech Applications
Jae Hun Ro, Felix Stahlberg, Ke Wu, Shankar Kumar
01 Feb 2022

SciBERTSUM: Extractive Summarization for Scientific Documents
Athar Sefid, C. Lee Giles
21 Jan 2022

Interpretable Low-Resource Legal Decision Making
R. Bhambhoria, Hui Liu, Samuel Dahan, Xiao-Dan Zhu
ELM, AILaw · 01 Jan 2022

QuALITY: Question Answering with Long Input Texts, Yes!
Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, ..., Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Sam Bowman
RALM · 16 Dec 2021

Pretrained Transformers for Offensive Language Identification in Tanglish
Sean Benhur, Kanchana Sivanraju
VLM · 06 Oct 2021

Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training
Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein
13 Sep 2021

Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
Prasetya Ajie Utama, N. Moosavi, Victor Sanh, Iryna Gurevych
AAML · 09 Sep 2021

Open Aspect Target Sentiment Classification with Natural Language Prompts
Ronald Seoh, Ian Birle, Mrinal Tak, Haw-Shiuan Chang, B. Pinette, Alfred Hough
08 Sep 2021

Discrete and Soft Prompting for Multilingual Models
Mengjie Zhao, Hinrich Schütze
LRM · 08 Sep 2021

How Does Adversarial Fine-Tuning Benefit BERT?
J. Ebrahimi, Hao Yang, Wei Zhang
AAML · 31 Aug 2021

Rethinking Why Intermediate-Task Fine-Tuning Works
Ting-Yun Chang, Chi-Jen Lu
LRM · 26 Aug 2021

Robust Transfer Learning with Pretrained Language Models through Adapters
Wenjuan Han, Bo Pang, Ying Nian Wu
05 Aug 2021

Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach
Benjamin Ampel, Sagar Samtani, Steven Ullman, Hsinchun Chen
03 Aug 2021

Small-Text: Active Learning for Text Classification in Python
Christopher Schröder, Lydia Müller, A. Niekler, Martin Potthast
CLIP, VLM, AI4CE · 21 Jul 2021

Noise Stability Regularization for Improving BERT Fine-tuning
Hang Hua, Xingjian Li, Dejing Dou, Chengzhong Xu, Jiebo Luo
10 Jul 2021

The MultiBERTs: BERT Reproductions for Robustness Analysis
Thibault Sellam, Steve Yadlowsky, Jason W. Wei, Naomi Saphra, Alexander D'Amour, ..., Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick
30 Jun 2021

A Closer Look at How Fine-tuning Changes BERT
Yichu Zhou, Vivek Srikumar
27 Jun 2021

Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Robert L Logan IV, Ivana Balažević, Eric Wallace, Fabio Petroni, Sameer Singh, Sebastian Riedel
VPVLM · 24 Jun 2021

BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken, Shauli Ravfogel, Yoav Goldberg
18 Jun 2021

Reordering Examples Helps during Priming-based Few-Shot Learning
Sawan Kumar, Partha P. Talukdar
03 Jun 2021

Self-Guided Contrastive Learning for BERT Sentence Representations
Taeuk Kim, Kang Min Yoo, Sang-goo Lee
SSL · 03 Jun 2021

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya V Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy-Hien Vu, H. A. Schwartz
07 May 2021

ImageNet-21K Pretraining for the Masses
T. Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelnik-Manor
SSeg, VLM, CLIP · 22 Apr 2021

On the Importance of Effectively Adapting Pretrained Language Models for Active Learning
Katerina Margatina, Loïc Barrault, Nikolaos Aletras
16 Apr 2021

How Many Data Points is a Prompt Worth?
Teven Le Scao, Alexander M. Rush
VLM · 15 Mar 2021

A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives
Nils Rethmeier, Isabelle Augenstein
SSL, VLM · 25 Feb 2021

Muppet: Massive Multi-task Representations with Pre-Finetuning
Armen Aghajanyan, Anchit Gupta, Akshat Shrivastava, Xilun Chen, Luke Zettlemoyer, Sonal Gupta
26 Jan 2021

A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze
31 Dec 2020

Efficient Estimation of Influence of a Training Instance
Sosuke Kobayashi, Sho Yokoi, Jun Suzuki, Kentaro Inui
TDI · 08 Dec 2020

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning
Beliz Gunel, Jingfei Du, Alexis Conneau, Ves Stoyanov
03 Nov 2020

Counterfactually-Augmented SNLI Training Data Does Not Yield Better Generalization Than Unaugmented Data
William Huang, Haokun Liu, Samuel R. Bowman
09 Oct 2020

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, Arnold Overwijk
01 Jul 2020

FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu, Yu Cheng, Zhe Gan, S. Sun, Tom Goldstein, Jingjing Liu
AAML · 25 Sep 2019

Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang
MoE · 25 Sep 2019

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 20 Apr 2018