ResearchTrend.AI
How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention

2 November 2020
Yue Guan
Jingwen Leng
Chao Li
Quan Chen
M. Guo

Papers citing "How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention"

4 / 4 papers shown
Title
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
Cong Guo
Chen Zhang
Jingwen Leng
Zihan Liu
Fan Yang
Yun-Bo Liu
Minyi Guo
Yuhao Zhu
MQ
20
55
0
30 Aug 2022
Dual-side Sparse Tensor Core
Yang-Feng Wang
Chen Zhang
Zhiqiang Xie
Cong Guo
Yunxin Liu
Jingwen Leng
20
74
0
20 May 2021
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
252
580
0
12 Mar 2020
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
201
882
0
03 May 2018