1905.09418
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
23 May 2019
Elena Voita
David Talbot
F. Moiseev
Rico Sennrich
Ivan Titov
Papers citing
"Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned"
50 / 743 papers shown
Acceptability Judgements via Examining the Topology of Attention Maps
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
D. Cherniavskii
Eduard Tulchinskii
Vladislav Mikhailov
Irina Proskurina
Laida Kushnareva
Ekaterina Artemova
S. Barannikov
Irina Piontkovskaya
D. Piontkovski
Evgeny Burnaev
972
25
0
19 May 2022
Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Gerard Sant
Gerard I. Gállego
Belen Alastruey
Marta R. Costa-jussá
202
4
0
14 May 2022
A Study of the Attention Abnormality in Trojaned BERTs
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Weimin Lyu
Songzhu Zheng
Teng Ma
Chao Chen
374
67
0
13 May 2022
EigenNoise: A Contrastive Prior to Warm-Start Representations
H. Heidenreich
Jake Williams
135
1
0
09 May 2022
Knowledge Distillation of Russian Language Models with Reduction of Vocabulary
Computational Linguistics and Intellectual Technologies (CLIT), 2022
A. Kolesnikova
Yuri Kuratov
Vasily Konovalov
Andrey Kravchenko
VLM
127
12
0
04 May 2022
Adaptable Adapters
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
N. Moosavi
Quentin Delfosse
Kristian Kersting
Iryna Gurevych
235
20
0
03 May 2022
Visualizing and Explaining Language Models
Adrian M. P. Braşoveanu
Razvan Andonie
MILM
VLM
336
7
0
30 Apr 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
285
308
0
27 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
264
133
0
25 Apr 2022
Merging of neural networks
Neural Processing Letters (NPL), 2022
Martin Pasen
Vladimír Boza
FedML
MoMe
214
3
0
21 Apr 2022
Regularization-based Pruning of Irrelevant Weights in Deep Neural Architectures
Giovanni Bonetta
Matteo Ribero
R. Cancelliere
193
9
0
11 Apr 2022
Paying More Attention to Self-attention: Improving Pre-trained Language Models via Attention Guiding
Shanshan Wang
Zhumin Chen
Zhaochun Ren
Huasheng Liang
Qiang Yan
Sudipta Singha Roy
130
10
0
06 Apr 2022
Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yanyang Li
Fuli Luo
Runxin Xu
Songfang Huang
Fei Huang
Liwei Wang
175
3
0
06 Apr 2022
CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Nishant Kambhatla
Logan Born
Anoop Sarkar
192
18
0
01 Apr 2022
Structured Pruning Learns Compact and Accurate Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
390
221
0
01 Apr 2022
TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ziqing Yang
Yiming Cui
Zhigang Chen
SyDa
VLM
159
15
0
30 Mar 2022
Fine-Grained Visual Entailment
European Conference on Computer Vision (ECCV), 2022
Christopher Thomas
Yipeng Zhang
Shih-Fu Chang
336
7
0
29 Mar 2022
A Fast Post-Training Pruning Framework for Transformers
Neural Information Processing Systems (NeurIPS), 2022
Woosuk Kwon
Sehoon Kim
Michael W. Mahoney
Joseph Hassoun
Kurt Keutzer
A. Gholami
251
208
0
29 Mar 2022
Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Xin Huang
A. Khetan
Rene Bidart
Zohar Karnin
193
21
0
27 Mar 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
228
135
0
24 Mar 2022
Input-specific Attention Subnetworks for Adversarial Detection
Findings (Findings), 2022
Emil Biju
Anirudh Sriram
Pratyush Kumar
Mitesh M Khapra
AAML
171
5
0
23 Mar 2022
Training-free Transformer Architecture Search
Computer Vision and Pattern Recognition (CVPR), 2022
Qinqin Zhou
Kekai Sheng
Xiawu Zheng
Ke Li
Xing Sun
Yonghong Tian
Jie Chen
Rongrong Ji
ViT
195
57
0
23 Mar 2022
Task-guided Disentangled Tuning for Pretrained Language Models
Findings (Findings), 2022
Jiali Zeng
Yu Jiang
Shuangzhi Wu
Yongjing Yin
Mu Li
DRL
367
3
0
22 Mar 2022
Word Order Does Matter (And Shuffled Language Models Know It)
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Vinit Ravishankar
Mostafa Abdou
Artur Kulmizev
Anders Søgaard
209
51
0
21 Mar 2022
Delta Keyword Transformer: Bringing Transformers to the Edge through Dynamically Pruned Multi-Head Self-Attention
Zuzana Jelčicová
Marian Verhelst
308
7
0
20 Mar 2022
Gaussian Multi-head Attention for Simultaneous Machine Translation
Findings (Findings), 2022
Shaolei Zhang
Yang Feng
160
26
0
17 Mar 2022
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Eldar Kurtic
Daniel Fernando Campos
Tuan Nguyen
Elias Frantar
Mark Kurtz
Ben Fineran
Michael Goin
Dan Alistarh
VLM
MQ
MedIm
400
148
0
14 Mar 2022
A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification
Findings (Findings), 2022
Dairui Liu
Derek Greene
Ruihai Dong
320
14
0
14 Mar 2022
Visualizing and Understanding Patch Interactions in Vision Transformer
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Jie Ma
Yalong Bai
Bineng Zhong
Wei Zhang
Ting Yao
Tao Mei
ViT
183
54
0
11 Mar 2022
Data-Efficient Structured Pruning via Submodular Optimization
Neural Information Processing Systems (NeurIPS), 2022
Marwa El Halabi
Suraj Srinivas
Damien Scieur
428
23
0
09 Mar 2022
Understanding microbiome dynamics via interpretable graph representation learning
Scientific Reports (Sci Rep), 2022
K. Melnyk
Kuba Weimann
Tim Conrad
238
7
0
02 Mar 2022
XAI for Transformers: Better Explanations through Conservative Propagation
International Conference on Machine Learning (ICML), 2022
Ameen Ali
Thomas Schnake
Oliver Eberle
G. Montavon
Klaus-Robert Müller
Lior Wolf
FAtt
346
133
0
15 Feb 2022
A Survey on Model Compression and Acceleration for Pretrained Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2022
Canwen Xu
Julian McAuley
365
89
0
15 Feb 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
Neurocomputing (Neurocomputing), 2022
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
230
23
0
11 Feb 2022
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models
International Conference on Learning Representations (ICLR), 2022
Chen Liang
Haoming Jiang
Simiao Zuo
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
T. Zhao
216
17
0
06 Feb 2022
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Dongkuan Xu
Subhabrata Mukherjee
Xiaodong Liu
Debadeepta Dey
Wenhui Wang
Xiang Zhang
Ahmed Hassan Awadallah
Jianfeng Gao
210
5
0
29 Jan 2022
Rethinking Attention-Model Explainability through Faithfulness Violation Test
International Conference on Machine Learning (ICML), 2022
Zichen Liu
Haoliang Li
Yangyang Guo
Chen Kong
Jing Li
Shiqi Wang
FAtt
351
57
0
28 Jan 2022
Can Model Compression Improve NLP Fairness
Guangxuan Xu
Qingyuan Hu
170
30
0
21 Jan 2022
Latency Adjustable Transformer Encoder for Language Understanding
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Sajjad Kachuee
M. Sharifkhani
603
1
0
10 Jan 2022
Intelligent Online Selling Point Extraction for E-Commerce Recommendation
Xiaojie Guo
Shugen Wang
Hanqing Zhao
Shiliang Diao
Jiajia Chen
...
Zhen He
Yun Xiao
Bo Long
Han Yu
Lingfei Wu
149
18
0
16 Dec 2021
Sparse Interventions in Language Models with Differentiable Masking
Nicola De Cao
Leon Schmid
Dieuwke Hupkes
Ivan Titov
250
33
0
13 Dec 2021
On the Compression of Natural Language Models
S. Damadi
119
0
0
13 Dec 2021
Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation
Raymond Li
Wen Xiao
Linzi Xing
Lanjun Wang
Gabriel Murray
Giuseppe Carenini
ViT
322
10
0
10 Dec 2021
Explainable Deep Learning in Healthcare: A Methodological Survey from an Attribution View
WIREs Mechanisms of Disease (WIREs Mech Dis), 2021
Di Jin
Elena Sergeeva
W. Weng
Geeticka Chauhan
Peter Szolovits
OOD
299
77
0
05 Dec 2021
Can depth-adaptive BERT perform better on binary classification tasks
Jing Fan
Xin Zhang
Sheng Zhang
Yan Pan
Lixiang Guo
MQ
190
0
0
22 Nov 2021
Does BERT look at sentiment lexicon?
International Joint Conference on the Analysis of Images, Social Networks and Texts (AISNT), 2021
E. Razova
S. Vychegzhanin
Evgeny Kotelnikov
182
3
0
19 Nov 2021
Local Multi-Head Channel Self-Attention for Facial Expression Recognition
Roberto Pecoraro
Valerio Basile
Viviana Bono
Sara Gallo
ViT
331
63
0
14 Nov 2021
A Survey on Green Deep Learning
Jingjing Xu
Wangchunshu Zhou
Zhiyi Fu
Hao Zhou
Lei Li
VLM
470
98
0
08 Nov 2021
NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework
Xingcheng Yao
Yanan Zheng
Xiaocong Yang
Zhilin Yang
293
50
0
07 Nov 2021
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method
Neural Information Processing Systems (NeurIPS), 2021
Yifan Chen
Qi Zeng
Heng Ji
Yun Yang
261
65
0
29 Oct 2021