2002.10345
Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation
Journal of Computer Science and Technology (JCST), 2020
24 February 2020
Yige Xu
Xipeng Qiu
L. Zhou
Xuanjing Huang
Papers citing "Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation"
30 / 30 papers shown
MaxPoolBERT: Enhancing BERT Classification via Layer- and Token-Wise Aggregation
Maike Behrendt
Stefan Sylvius Wagner
Stefan Harmeling
SSeg
625
3
0
21 May 2025
LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing
Neural Information Processing Systems (NeurIPS), 2024
Xiaonan Nie
Qibin Liu
Fangcheng Fu
Shenhan Zhu
Xupeng Miao
Xiaochen Li
Yanzhe Zhang
Shouda Liu
Tengjiao Wang
MoE
252
4
0
13 Nov 2024
Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
Neural Information Processing Systems (NeurIPS), 2024
Yu-Liang Zhan
Zhong-Yi Lu
Hao Sun
Ze-Feng Gao
306
2
0
10 Nov 2024
CleaR: Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yeachan Kim
Junho Kim
SangKeun Lee
NoLa
AAML
412
5
0
31 Oct 2024
Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Donghoon Kim
Gusang Lee
Kyuhong Shim
B. Shim
349
7
0
29 Oct 2024
SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Shivam Adarsh
Kumar Shridhar
Caglar Gulcehre
Nicholas Monath
Mrinmaya Sachan
LRM
242
6
0
24 Oct 2024
KPC-cF: Aspect-Based Sentiment Analysis via Implicit-Feature Alignment with Corpus Filtering
Kibeom Nam
468
0
0
29 Jun 2024
Large Language Models for Relevance Judgment in Product Search
Navid Mehrdad
Hrushikesh Mohapatra
Mossaab Bagdouri
Prijith Chandran
Alessandro Magnani
...
Ajit Puthenputhussery
Sachin Yadav
Tony Lee
Chengxiang Zhai
Ciya Liao
266
11
0
01 Jun 2024
SurreyAI 2023 Submission for the Quality Estimation Shared Task
Conference on Machine Translation (WMT), 2023
Archchana Sindhujan
Helen Treharne
Constantin Orasan
Tharindu Ranasinghe
236
4
0
01 Dec 2023
Speculative Decoding with Big Little Decoder
Neural Information Processing Systems (NeurIPS), 2023
Sehoon Kim
K. Mangalam
Suhong Moon
Jitendra Malik
Michael W. Mahoney
A. Gholami
Kurt Keutzer
MoE
594
176
0
15 Feb 2023
Knowledge Distillation for Federated Learning: a Practical Guide
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Alessio Mora
Irene Tenison
Paolo Bellavista
Irina Rish
FedML
252
52
0
09 Nov 2022
Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney
Jessica Zosa Forde
Jonathan Frankle
Ziliang Zong
Matthew L. Leavitt
VLM
316
4
0
01 Nov 2022
Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Haw-Shiuan Chang
Ruei-Yao Sun
Kathryn Ricci
Andrew McCallum
400
21
0
10 Oct 2022
SEMI-FND: Stacked Ensemble Based Multimodal Inference For Faster Fake News Detection
Expert Systems with Applications (ESWA), 2022
Prabhav Singh
Ridam Srivastava
K. Rana
Vineet Kumar
379
49
0
17 May 2022
Unified Implicit Neural Stylization
European Conference on Computer Vision (ECCV), 2022
Zhiwen Fan
Lezhi Li
Peihao Wang
Xinyu Gong
Dejia Xu
Zinan Lin
549
82
0
05 Apr 2022
Unified and Effective Ensemble Knowledge Distillation
Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang
FedML
180
13
0
01 Apr 2022
Cluster & Tune: Boost Cold Start Performance in Text Classification
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Eyal Shnarch
Ariel Gera
Alon Halfon
Lena Dankin
Leshem Choshen
R. Aharonov
Noam Slonim
250
25
0
20 Mar 2022
BiBERT: Accurate Fully Binarized BERT
International Conference on Learning Representations (ICLR), 2022
Haotong Qin
Yifu Ding
Mingyuan Zhang
Qing Yan
Aishan Liu
Qingqing Dang
Ziwei Liu
Xianglong Liu
MQ
315
120
0
12 Mar 2022
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yoshitomo Matsubara
Luca Soldaini
Eric Lind
Alessandro Moschitti
277
7
0
15 Jan 2022
How Emotionally Stable is ALBERT? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task
Urja Khurana
Eric T. Nalisnick
Antske Fokkens
MoMe
235
6
0
18 Nov 2021
Alternative Input Signals Ease Transfer in Multilingual Machine Translation
Simeng Sun
Angela Fan
James Cross
Vishrav Chaudhary
C. Tran
Philipp Koehn
Francisco Guzman
165
17
0
15 Oct 2021
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models
Qianchu Liu
Fangyu Liu
Nigel Collier
Anna Korhonen
Ivan Vulić
335
24
0
19 Sep 2021
Neighborhood Consensus Contrastive Learning for Backward-Compatible Representation
AAAI Conference on Artificial Intelligence (AAAI), 2021
Shengsen Wu
Liang Chen
Yihang Lou
Yan Bai
Tao Bai
Minghua Deng
Ling-yu Duan
407
8
0
07 Aug 2021
Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach
Benjamin Ampel
Sagar Samtani
Steven Ullman
Hsinchun Chen
256
52
0
03 Aug 2021
Local-Global Knowledge Distillation in Heterogeneous Federated Learning with Non-IID Data
Dezhong Yao
Wanning Pan
Yutong Dai
Yao Wan
Xiaofeng Ding
Hai Jin
Zheng Xu
Lichao Sun
FedML
603
61
0
30 Jun 2021
An Automated Knowledge Mining and Document Classification System with Multi-model Transfer Learning
J. Chong
Zhiyuan Chen
Mei Shin Oh
91
2
0
24 Jun 2021
AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21
Danqing Zhu
Wangli Lin
Yang Zhang
Qiwei Zhong
Guanxiong Zeng
Weilin Wu
Jiayu Tang
276
19
0
11 Jan 2021
Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach
Yue Yu
Simiao Zuo
Haoming Jiang
Wendi Ren
T. Zhao
Chao Zhang
AI4MH
453
133
0
15 Oct 2020
InfoMiner at WNUT-2020 Task 2: Transformer-based Covid-19 Informative Tweet Extraction
Hansi Hettiarachchi
Tharindu Ranasinghe
MedIm
157
21
0
11 Oct 2020
To BAN or not to BAN: Bayesian Attention Networks for Reliable Hate Speech Detection
Cognitive Computation (Cogn Comput), 2020
Kristian Miok
Blaž Škrlj
D. Zaharie
Marko Robnik-Šikonja
470
46
0
10 Jul 2020