ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

8 June 2020 · arXiv:2006.04884
Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow

Papers citing "On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines"

37 / 87 papers shown

Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining?
Subhabrata Dutta, Jeevesh Juneja, Dipankar Das, Tanmoy Chakraborty
24 Mar 2022

Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Guanzheng Chen, Fangyu Liu, Zaiqiao Meng, Shangsong Liang
16 Feb 2022

Transformer-based Models of Text Normalization for Speech Applications
Jae Hun Ro, Felix Stahlberg, Ke Wu, Shankar Kumar
01 Feb 2022

SciBERTSUM: Extractive Summarization for Scientific Documents
Athar Sefid, C. Lee Giles
21 Jan 2022

Interpretable Low-Resource Legal Decision Making
R. Bhambhoria, Hui Liu, Samuel Dahan, Xiao-Dan Zhu
ELM, AILaw · 01 Jan 2022

QuALITY: Question Answering with Long Input Texts, Yes!
Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, ..., Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Sam Bowman
RALM · 16 Dec 2021

Pretrained Transformers for Offensive Language Identification in Tanglish
Sean Benhur, Kanchana Sivanraju
VLM · 06 Oct 2021

Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training
Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein
13 Sep 2021

Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
Prasetya Ajie Utama, N. Moosavi, Victor Sanh, Iryna Gurevych
AAML · 09 Sep 2021

Open Aspect Target Sentiment Classification with Natural Language Prompts
Ronald Seoh, Ian Birle, Mrinal Tak, Haw-Shiuan Chang, B. Pinette, Alfred Hough
08 Sep 2021

Discrete and Soft Prompting for Multilingual Models
Mengjie Zhao, Hinrich Schütze
LRM · 08 Sep 2021

How Does Adversarial Fine-Tuning Benefit BERT?
J. Ebrahimi, Hao Yang, Wei Zhang
AAML · 31 Aug 2021

Rethinking Why Intermediate-Task Fine-Tuning Works
Ting-Yun Chang, Chi-Jen Lu
LRM · 26 Aug 2021

Robust Transfer Learning with Pretrained Language Models through Adapters
Wenjuan Han, Bo Pang, Ying Nian Wu
05 Aug 2021

Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach
Benjamin Ampel, Sagar Samtani, Steven Ullman, Hsinchun Chen
03 Aug 2021

Small-Text: Active Learning for Text Classification in Python
Christopher Schröder, Lydia Müller, A. Niekler, Martin Potthast
CLIP, VLM, AI4CE · 21 Jul 2021

Noise Stability Regularization for Improving BERT Fine-tuning
Hang Hua, Xingjian Li, Dejing Dou, Chengzhong Xu, Jiebo Luo
10 Jul 2021

The MultiBERTs: BERT Reproductions for Robustness Analysis
Thibault Sellam, Steve Yadlowsky, Jason W. Wei, Naomi Saphra, Alexander D'Amour, ..., Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick
30 Jun 2021

A Closer Look at How Fine-tuning Changes BERT
Yichu Zhou, Vivek Srikumar
27 Jun 2021

Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Robert L Logan IV, Ivana Balažević, Eric Wallace, Fabio Petroni, Sameer Singh, Sebastian Riedel
VPVLM · 24 Jun 2021

BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken, Shauli Ravfogel, Yoav Goldberg
18 Jun 2021

Reordering Examples Helps during Priming-based Few-Shot Learning
Sawan Kumar, Partha P. Talukdar
03 Jun 2021

Self-Guided Contrastive Learning for BERT Sentence Representations
Taeuk Kim, Kang Min Yoo, Sang-goo Lee
SSL · 03 Jun 2021

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya V Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy-Hien Vu, H. A. Schwartz
07 May 2021

ImageNet-21K Pretraining for the Masses
T. Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelnik-Manor
SSeg, VLM, CLIP · 22 Apr 2021

On the Importance of Effectively Adapting Pretrained Language Models for Active Learning
Katerina Margatina, Loïc Barrault, Nikolaos Aletras
16 Apr 2021

How Many Data Points is a Prompt Worth?
Teven Le Scao, Alexander M. Rush
VLM · 15 Mar 2021

A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives
Nils Rethmeier, Isabelle Augenstein
SSL, VLM · 25 Feb 2021

Muppet: Massive Multi-task Representations with Pre-Finetuning
Armen Aghajanyan, Anchit Gupta, Akshat Shrivastava, Xilun Chen, Luke Zettlemoyer, Sonal Gupta
26 Jan 2021

A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze
31 Dec 2020

Efficient Estimation of Influence of a Training Instance
Sosuke Kobayashi, Sho Yokoi, Jun Suzuki, Kentaro Inui
TDI · 08 Dec 2020

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning
Beliz Gunel, Jingfei Du, Alexis Conneau, Ves Stoyanov
03 Nov 2020

Counterfactually-Augmented SNLI Training Data Does Not Yield Better Generalization Than Unaugmented Data
William Huang, Haokun Liu, Samuel R. Bowman
09 Oct 2020

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, Arnold Overwijk
01 Jul 2020

FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu, Yu Cheng, Zhe Gan, S. Sun, Tom Goldstein, Jingjing Liu
AAML · 25 Sep 2019

Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang
MoE · 25 Sep 2019

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 20 Apr 2018