2006.04884
Cited By
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
8 June 2020
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
Papers citing "On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines"
50 / 82 papers shown
Title
Decoding Reading Goals from Eye Movements
Omer Shubi
Cfir Avraham Hadar
Yevgeni Berzak
AIMat
44
1
0
28 Oct 2024
HATFormer: Historic Handwritten Arabic Text Recognition with Transformers
Adrian Chan
Anupam Mijar
Mehreen Saeed
Chau-Wai Wong
Akram Khater
36
0
0
03 Oct 2024
Efficient LLM Context Distillation
Rajesh Upadhayayaya
Zachary Smith
Christopher Kottmyer
Manish Raj Osti
39
1
0
03 Sep 2024
Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models
Christopher Schröder
Gerhard Heyer
VLM
44
0
0
13 Jun 2024
Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study
Myrthe Reuver
Suzan Verberne
Antske Fokkens
34
1
0
05 Apr 2024
CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion
Zijun Long
George Killick
Lipeng Zhuang
Gerardo Aragon Camarasa
Zaiqiao Meng
R. McCreadie
VLM
39
2
0
22 Feb 2024
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
Yupei Du
Albert Gatt
Dong Nguyen
24
1
0
10 Oct 2023
Prompt to be Consistent is Better than Self-Consistent? Few-Shot and Zero-Shot Fact Verification with Pre-trained Language Models
Fengzhu Zeng
Wei Gao
17
5
0
05 Jun 2023
Understanding Emotion Valence is a Joint Deep Learning Task
Gabriel Roccabruna
Seyed Mahed Mousavi
Giuseppe Riccardi
21
0
0
27 May 2023
Toward Connecting Speech Acts and Search Actions in Conversational Search Tasks
Souvick Ghosh
Satanu Ghosh
C. Shah
25
2
0
08 May 2023
KINLP at SemEval-2023 Task 12: Kinyarwanda Tweet Sentiment Analysis
Antoine Nzeyimana
9
3
0
25 Apr 2023
On the Variance of Neural Network Training with respect to Test Sets and Distributions
Keller Jordan
OOD
16
10
0
04 Apr 2023
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks
Antonis Maronikolakis
Abdullatif Köksal
Hinrich Schütze
40
0
0
04 Apr 2023
Measuring the Instability of Fine-Tuning
Yupei Du
D. Nguyen
18
4
0
15 Feb 2023
Evaluating the Robustness of Discrete Prompts
Yoichi Ishibashi
Danushka Bollegala
Katsuhito Sudoh
Satoshi Nakamura
21
18
0
11 Feb 2023
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
Asha Vishwanathan
R. Warrier
G. V. Suresh
Chandrashekhar Kandpal
11
2
0
25 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
23
3
0
24 Jan 2023
InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers
Leonid Boytsov
Preksha Patel
Vivek Sourabh
Riddhi Nisar
Sayan Kundu
R. Ramanathan
Eric Nyberg
21
19
0
08 Jan 2023
Examining Political Rhetoric with Epistemic Stance Detection
Ankita Gupta
Su Lin Blodgett
Justin H. Gross
Brendan T. O'Connor
20
0
0
29 Dec 2022
KL Regularized Normalization Framework for Low Resource Tasks
Neeraj Kumar
Ankur Narang
Brejesh Lall
21
1
0
21 Dec 2022
Task-Specific Embeddings for Ante-Hoc Explainable Text Classification
Kishaloy Halder
Josip Krapac
A. Akbik
Anthony Brew
Matti Lyra
30
0
0
30 Nov 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus
Daniele Giofré
27
11
0
30 Nov 2022
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai
Sarvnaz Karimi
13
3
0
24 Nov 2022
Probing neural language models for understanding of words of estimative probability
Damien Sileo
Marie-Francine Moens
19
10
0
07 Nov 2022
Gradient Knowledge Distillation for Pre-trained Language Models
Lean Wang
Lei Li
Xu Sun
VLM
23
5
0
02 Nov 2022
We need to talk about random seeds
Steven Bethard
31
8
0
24 Oct 2022
Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping
Chenghao Yang
Xuezhe Ma
32
6
0
19 Oct 2022
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
Shuo Xie
Jiahao Qiu
Ankita Pasad
Li Du
Qing Qu
Hongyuan Mei
32
16
0
18 Oct 2022
AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning
Tao Yang
Jinghao Deng
Xiaojun Quan
Qifan Wang
Shaoliang Nie
28
3
0
12 Oct 2022
Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
Haw-Shiuan Chang
Ruei-Yao Sun
Kathryn Ricci
Andrew McCallum
43
14
0
10 Oct 2022
UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation
I. Sarhan
P. Mosteiro
Marco Spruit
29
2
0
07 Oct 2022
An Empirical Study on Cross-X Transfer for Legal Judgment Prediction
Joel Niklaus
Matthias Sturmer
Ilias Chalkidis
ELM
AILaw
32
18
0
25 Sep 2022
Drawing Causal Inferences About Performance Effects in NLP
Sandra Wankmüller
CML
16
1
0
14 Sep 2022
Heuristic-free Optimization of Force-Controlled Robot Search Strategies in Stochastic Environments
Bastian Alt
Darko Katic
Rainer Jäkel
Michael Beetz
18
6
0
15 Jul 2022
Zero-shot Cross-lingual Transfer is Under-specified Optimization
Shijie Wu
Benjamin Van Durme
Mark Dredze
25
6
0
12 Jul 2022
Pretrained Models for Multilingual Federated Learning
Orion Weller
Marc Marone
Vladimir Braverman
Dawn J Lawrie
Benjamin Van Durme
VLM
FedML
AI4CE
33
42
0
06 Jun 2022
Can Foundation Models Help Us Achieve Perfect Secrecy?
Simran Arora
Christopher Ré
FedML
16
6
0
27 May 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
234
45
0
24 May 2022
ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai
Mohammadreza Salehi
Matthew E. Peters
Hannaneh Hajishirzi
124
100
0
24 May 2022
Calibration of Natural Language Understanding Models with Venn-ABERS Predictors
Patrizio Giovannotti
38
6
0
21 May 2022
Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction
Manikandan Ravikiran
Bharathi Raja Chakravarthi
20
3
0
12 May 2022
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi
Terra Blevins
M. Lewis
Daniel S. Weld
Luke Zettlemoyer
25
1
0
09 May 2022
A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis
Sandra Wankmüller
23
2
0
03 May 2022
UMass PCL at SemEval-2022 Task 4: Pre-trained Language Model Ensembles for Detecting Patronizing and Condescending Language
David Koleczek
Alexander Scarlatos
Siddha Makarand Karkare
Preshma Linet Pereira
19
0
0
18 Apr 2022
Reducing Model Jitter: Stable Re-training of Semantic Parsers in Production Environments
Christopher Hidey
Fei Liu
Rahul Goel
21
4
0
10 Apr 2022
CoCoSoDa: Effective Contrastive Learning for Code Search
Ensheng Shi
Yanlin Wang
Wenchao Gu
Lun Du
Hongyu Zhang
Shi Han
Dongmei Zhang
Hongbin Sun
28
33
0
07 Apr 2022
PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
Rabeeh Karimi Mahabadi
Luke Zettlemoyer
James Henderson
Marzieh Saeidi
Lambert Mathias
Ves Stoyanov
Majid Yazdani
VLM
31
69
0
03 Apr 2022
Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining?
Subhabrata Dutta
Jeevesh Juneja
Dipankar Das
Tanmoy Chakraborty
9
15
0
24 Mar 2022
Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Guanzheng Chen
Fangyu Liu
Zaiqiao Meng
Shangsong Liang
26
88
0
16 Feb 2022
Transformer-based Models of Text Normalization for Speech Applications
Jae Hun Ro
Felix Stahlberg
Ke Wu
Shankar Kumar
14
7
0
01 Feb 2022