ResearchTrend.AI
The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT

22 January 2021
Madhura Pande
Aakriti Budhraja
Preksha Nema
Pratyush Kumar
Mitesh M. Khapra

Papers citing "The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT"

4 / 4 papers shown
A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification
Dairui Liu
Derek Greene
Ruihai Dong
25
10
0
14 Mar 2022
Towards Building ASR Systems for the Next Billion Users
Tahir Javed
Sumanth Doddapaneni
A. Raman
Kaushal Bhogale
Gowtham Ramesh
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
41
54
0
06 Nov 2021
On the Prunability of Attention Heads in Multilingual BERT
Aakriti Budhraja
Madhura Pande
Pratyush Kumar
Mitesh M. Khapra
42
4
0
26 Sep 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018