2101.09115
The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT
22 January 2021
Madhura Pande
Aakriti Budhraja
Preksha Nema
Pratyush Kumar
Mitesh M. Khapra
Papers citing
"The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT"
4 / 4 papers shown
Title
A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification
Dairui Liu
Derek Greene
Ruihai Dong
25
10
0
14 Mar 2022
Towards Building ASR Systems for the Next Billion Users
Tahir Javed
Sumanth Doddapaneni
A. Raman
Kaushal Bhogale
Gowtham Ramesh
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
41
54
0
06 Nov 2021
On the Prunability of Attention Heads in Multilingual BERT
Aakriti Budhraja
Madhura Pande
Pratyush Kumar
Mitesh M. Khapra
42
4
0
26 Sep 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018