1906.04341
What Does BERT Look At? An Analysis of BERT's Attention
11 June 2019
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
Papers citing
"What Does BERT Look At? An Analysis of BERT's Attention"
50 / 885 papers shown
RECAST: Enabling User Recourse and Interpretability of Toxicity Detection Models with Interactive Visualization
Austin P. Wright
Omar Shaikh
Haekyu Park
Will Epperson
Muhammed Ahmed
Stephane Pinel
Duen Horng Chau
Diyi Yang
08 Feb 2021
CLiMP: A Benchmark for Chinese Language Model Evaluation
Beilei Xiang
Changbing Yang
Yu Li
Alex Warstadt
Katharina Kann
ALM
26 Jan 2021
"Laughing at you or with you": The Role of Sarcasm in Shaping the Disagreement Space
Debanjan Ghosh
Ritvik Shrivastava
Smaranda Muresan
26 Jan 2021
Attention Can Reflect Syntactic Structure (If You Let It)
Vinit Ravishankar
Artur Kulmizev
Mostafa Abdou
Anders Søgaard
Joakim Nivre
26 Jan 2021
Coloring the Black Box: What Synesthesia Tells Us about Character Embeddings
Katharina Kann
Mauro M. Monsalve-Mercado
26 Jan 2021
The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT
Madhura Pande
Aakriti Budhraja
Preksha Nema
Pratyush Kumar
Mitesh M. Khapra
22 Jan 2021
Classifying Scientific Publications with BERT -- Is Self-Attention a Feature Selection Method?
Andrés García-Silva
José Manuél Gómez-Pérez
20 Jan 2021
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization
Jing Jin
Cai Liang
Tiancheng Wu
Li Zou
Zhiliang Gan
MQ
15 Jan 2021
Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media
Ramya Akula
Ivan I. Garibay
14 Jan 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
12 Jan 2021
Automating the Compilation of Potential Core-Outcomes for Clinical Trials
Shwetha Bharadwaj
M. Laffin
11 Jan 2021
CDLM: Cross-Document Language Modeling
Avi Caciularu
Arman Cohan
Iz Beltagy
Matthew E. Peters
Arie Cattan
Ido Dagan
VLM
02 Jan 2021
On Explaining Your Explanations of BERT: An Empirical Study with Sequence Classification
Zhengxuan Wu
Desmond C. Ong
01 Jan 2021
Coreference Reasoning in Machine Reading Comprehension
Mingzhu Wu
N. Moosavi
Dan Roth
Iryna Gurevych
LRM
31 Dec 2020
Neural Machine Translation: A Review of Methods, Resources, and Tools
Zhixing Tan
Shuo Wang
Zonghan Yang
Gang Chen
Xuancheng Huang
Maosong Sun
Yang Liu
3DV
AI4TS
31 Dec 2020
Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings
Jacob Turton
D. Vinson
Robert Smith
30 Dec 2020
Improving BERT with Syntax-aware Local Attention
Zhongli Li
Qingyu Zhou
Chao Li
Ke Xu
Yunbo Cao
30 Dec 2020
Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva
R. Schuster
Jonathan Berant
Omer Levy
KELM
29 Dec 2020
SG-Net: Syntax Guided Transformer for Language Representation
Zhuosheng Zhang
Yuwei Wu
Junru Zhou
Sufeng Duan
Hai Zhao
Rui-cang Wang
27 Dec 2020
Inserting Information Bottlenecks for Attribution in Transformers
Zhiying Jiang
Raphael Tang
Ji Xin
Jimmy J. Lin
27 Dec 2020
Gender Bias in Multilingual Neural Machine Translation: The Architecture Matters
Marta R. Costa-jussá
Carlos Escolano
Christine Basta
Javier Ferrando
Roser Batlle-Roca
Ksenia Kharitonova
24 Dec 2020
Disentangling semantics in language through VAEs and a certain architectural choice
G. Felhi
Joseph Le Roux
Djamé Seddah
CoGe
DRL
24 Dec 2020
Multi-Head Self-Attention with Role-Guided Masks
Dongsheng Wang
Casper Hansen
Lucas Chaves Lima
Christian B. Hansen
Maria Maistro
J. Simonsen
Christina Lioma
22 Dec 2020
Undivided Attention: Are Intermediate Layers Necessary for BERT?
S. N. Sridhar
Anthony Sarah
22 Dec 2020
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
Armen Aghajanyan
Luke Zettlemoyer
Sonal Gupta
22 Dec 2020
Encoding Syntactic Knowledge in Transformer Encoder for Intent Detection and Slot Filling
Jixuan Wang
Kai Wei
Martin H. Radfar
Weiwei Zhang
Clement Chung
21 Dec 2020
Domain specific BERT representation for Named Entity Recognition of lab protocol
Tejas Vaidhya
Ayush Kaushal
21 Dec 2020
Explaining Black-box Models for Biomedical Text Classification
M. Moradi
Matthias Samwald
20 Dec 2020
Learning from Mistakes: Using Mis-predictions as Harm Alerts in Language Pre-Training
Chen Xing
Wenhao Liu
Caiming Xiong
16 Dec 2020
Mask-Align: Self-Supervised Neural Word Alignment
Chi Chen
Maosong Sun
Yang Liu
13 Dec 2020
Infusing Finetuning with Semantic Dependencies
Zhaofeng Wu
Hao Peng
Noah A. Smith
10 Dec 2020
Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks
Modestas Filipavicius
Matteo Manica
Joris Cadow
María Rodríguez Martínez
05 Dec 2020
Self-Explaining Structures Improve NLP Models
Zijun Sun
Chun Fan
Qinghong Han
Xiaofei Sun
Yuxian Meng
Fei Wu
Jiwei Li
MILM
XAI
LRM
FAtt
03 Dec 2020
Circles are like Ellipses, or Ellipses are like Circles? Measuring the Degree of Asymmetry of Static and Contextual Embeddings and the Implications to Representation Learning
Wei Zhang
Murray Campbell
Yang Yu
Sadhana Kumaravel
03 Dec 2020
Supertagging the Long Tail with Tree-Structured Decoding of Complex Categories
Jakob Prange
Nathan Schneider
Vivek Srikumar
02 Dec 2020
An Investigation of Language Model Interpretability via Sentence Editing
Samuel Stevens
Yu-Chuan Su
LRM
28 Nov 2020
Picking BERT's Brain: Probing for Linguistic Dependencies in Contextualized Embeddings Using Representational Similarity Analysis
Michael A. Lepori
R. Thomas McCoy
24 Nov 2020
Data-Informed Global Sparseness in Attention Mechanisms for Deep Neural Networks
Ileana Rugina
Rumen Dangovski
L. Jing
Preslav Nakov
Marin Soljacic
20 Nov 2020
On the Dynamics of Training Attention Models
Haoye Lu
Yongyi Mao
A. Nayak
19 Nov 2020
E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks
Nikolaos Stylianou
I. Vlahavas
10 Nov 2020
Natural Language Inference in Context -- Investigating Contextual Reasoning over Long Texts
Hanmeng Liu
Leyang Cui
Jian Liu
Yue Zhang
ReLM
LRM
10 Nov 2020
Language Through a Prism: A Spectral Approach for Multiscale Language Representations
Alex Tamkin
Dan Jurafsky
Noah D. Goodman
09 Nov 2020
Positional Artefacts Propagate Through Masked Language Model Embeddings
Ziyang Luo
Artur Kulmizev
Xiaoxi Mao
09 Nov 2020
Understanding Pure Character-Based Neural Machine Translation: The Case of Translating Finnish into English
Gongbo Tang
Rico Sennrich
Joakim Nivre
06 Nov 2020
How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention
Yue Guan
Jingwen Leng
Chao Li
Quan Chen
M. Guo
02 Nov 2020
Influence Patterns for Explaining Information Flow in BERT
Kaiji Lu
Zifan Wang
Piotr (Peter) Mardziel
Anupam Datta
GNN
02 Nov 2020
Understanding Pre-trained BERT for Aspect-based Sentiment Analysis
Hu Xu
Lei Shu
Philip S. Yu
Bing-Quan Liu
SSL
31 Oct 2020
Contextual BERT: Conditioning the Language Model Using a Global State
Timo I. Denk
Ana Peleteiro Ramallo
29 Oct 2020
Fine-grained Information Status Classification Using Discourse Context-Aware BERT
Yufang Hou
26 Oct 2020
A Weakly-Supervised Semantic Segmentation Approach based on the Centroid Loss: Application to Quality Control and Inspection
Kai Yao
A. Ortiz
F. Bonnín-Pascual
26 Oct 2020