What Does BERT Look At? An Analysis of BERT's Attention
arXiv:1906.04341 · 11 June 2019 · MILM
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning

Papers citing "What Does BERT Look At? An Analysis of BERT's Attention" (showing 50 of 885)
- Diagnosing BERT with Retrieval Heuristics · A. Câmara, C. Hauff · 12 Jan 2022 · (12 / 33 / 0)
- An Opinion Mining of Text in COVID-19 Issues along with Comparative Study in ML, BERT & RNN · Md Mahadi Hasan Sany, Mumenunnesa Keya, S. Khushbu, AKM Shahariar Azad Rabby, Abu Kaisar Mohammad Masum · 06 Jan 2022 · (17 / 2 / 0)
- Does Entity Abstraction Help Generative Transformers Reason? · Nicolas Angelard-Gontier, Siva Reddy, C. Pal · 05 Jan 2022 · (31 / 5 / 0)
- Event-based clinical findings extraction from radiology reports with pre-trained language model · Wilson Lau, K. Lybarger, Martin Gunn, Meliha Yetisgen · 27 Dec 2021 · (15 / 5 / 0)
- Block-Skim: Efficient Question Answering for Transformer · Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu · 16 Dec 2021 · (27 / 30 / 0)
- Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation · Raymond Li, Wen Xiao, Linzi Xing, Lanjun Wang, Gabriel Murray, Giuseppe Carenini · ViT · 10 Dec 2021 · (25 / 7 / 0)
- Explainable Deep Learning in Healthcare: A Methodological Survey from an Attribution View · Di Jin, Elena Sergeeva, W. Weng, Geeticka Chauhan, Peter Szolovits · OOD · 05 Dec 2021 · (31 / 55 / 0)
- Inducing Causal Structure for Interpretable Neural Networks · Atticus Geiger, Zhengxuan Wu, Hanson Lu, J. Rozner, Elisa Kreiss, Thomas F. Icard, Noah D. Goodman, Christopher Potts · CML, OOD · 01 Dec 2021 · (29 / 70 / 0)
- Wiki to Automotive: Understanding the Distribution Shift and its impact on Named Entity Recognition · Anmol Nayak, Hariprasad Timmapathini · OOD · 01 Dec 2021 · (10 / 3 / 0)
- What to Learn, and How: Toward Effective Learning from Rationales · Samuel Carton, Surya Kanoria, Chenhao Tan · 30 Nov 2021 · (35 / 22 / 0)
- Exploring Low-Cost Transformer Model Compression for Large-Scale Commercial Reply Suggestions · Vaishnavi Shrivastava, Radhika Gaonkar, Shashank Gupta, Abhishek Jha · 27 Nov 2021 · (6 / 0 / 0)
- NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion · Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan · ViT, VGen · 24 Nov 2021 · (18 / 292 / 0)
- Does BERT look at sentiment lexicon? · E. Razova, S. Vychegzhanin, Evgeny Kotelnikov · 19 Nov 2021 · (19 / 2 / 0)
- LAnoBERT: System Log Anomaly Detection based on BERT Masked Language Model · Yukyung Lee, Jina Kim, Pilsung Kang · 18 Nov 2021 · (9 / 78 / 0)
- Interpreting Language Models Through Knowledge Graph Extraction · Vinitra Swamy, Angelika Romanou, Martin Jaggi · 16 Nov 2021 · (26 / 20 / 0)
- Triggerless Backdoor Attack for NLP Tasks with Clean Labels · Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Yi Yang, Shangwei Guo, Chun Fan · AAML, SILM · 15 Nov 2021 · (27 / 74 / 0)
- Rationale production to support clinical decision-making · Niall Taylor, Lei Sha, Dan W Joyce, Thomas Lukasiewicz, A. Nevado-Holgado, Andrey Kormilitzin · FAtt · 15 Nov 2021 · (14 / 4 / 0)
- Towards Interpretability of Speech Pause in Dementia Detection using Adversarial Learning · Youxiang Zhu, Bang Tran, Xiaohui Liang, J. Batsis, R. Roth · AAML · 14 Nov 2021 · (13 / 6 / 0)
- Counterfactual Explanations for Models of Code · Jürgen Cito, Işıl Dillig, V. Murali, S. Chandra · AAML, LRM · 10 Nov 2021 · (29 / 47 / 0)
- How does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing · Valentin Barrière, Guillaume Jacquet · 07 Nov 2021 · (14 / 1 / 0)
- Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey · Bonan Min, Hayley L Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heinz, Dan Roth · LM&MA, VLM, AI4CE · 01 Nov 2021 · (74 / 1,030 / 0)
- Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework · Lifan Yuan, Yichi Zhang, Yangyi Chen, Wei Wei · AAML · 28 Oct 2021 · (19 / 32 / 0)
- Team Enigma at ArgMining-EMNLP 2021: Leveraging Pre-trained Language Models for Key Point Matching · Chao Fan, Yang Yang, Siba Smarak Panigrahi, Varun Madhavan, Abhilash Nandy · 24 Oct 2021 · (14 / 9 / 0)
- Interpreting Deep Learning Models in Natural Language Processing: A Review · Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard H. Hovy, Jiwei Li · 20 Oct 2021 · (17 / 44 / 0)
- Inductive Biases and Variable Creation in Self-Attention Mechanisms · Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang · 19 Oct 2021 · (27 / 115 / 0)
- Schrödinger's Tree -- On Syntax and Neural Language Models · Artur Kulmizev, Joakim Nivre · 17 Oct 2021 · (30 / 6 / 0)
- Improving Transformers with Probabilistic Attention Keys · Tam Nguyen, T. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher · 16 Oct 2021 · (47 / 32 / 0)
- Breaking Down Multilingual Machine Translation · Ting-Rui Chiang, Yi-Pei Chen, Yi-Ting Yeh, Graham Neubig · 15 Oct 2021 · (11 / 13 / 0)
- Modeling Endorsement for Multi-Document Abstractive Summarization · Logan Lebanoff, Bingqing Wang, Z. Feng, Fei Liu · 15 Oct 2021 · (123 / 4 / 0)
- Identifying and Mitigating Spurious Correlations for Improving Robustness in NLP Models · Tianlu Wang, Rohit Sridhar, Diyi Yang, Xuezhi Wang · AAML · 14 Oct 2021 · (120 / 72 / 0)
- bert2BERT: Towards Reusable Pretrained Language Models · Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu · VLM · 14 Oct 2021 · (24 / 59 / 0)
- Leveraging redundancy in attention with Reuse Transformers · Srinadh Bhojanapalli, Ayan Chakrabarti, Andreas Veit, Michal Lukasik, Himanshu Jain, Frederick Liu, Yin-Wen Chang, Sanjiv Kumar · 13 Oct 2021 · (18 / 23 / 0)
- A Comprehensive Comparison of Word Embeddings in Event & Entity Coreference Resolution · Judicael Poumay, A. Ittoo · 11 Oct 2021 · (6 / 2 / 0)
- Paperswithtopic: Topic Identification from Paper Title Only · Daehyun Cho, C. Wallraven · 09 Oct 2021 · (21 / 0 / 0)
- Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling · Kyuhong Shim, Iksoo Choi, Wonyong Sung, Jungwook Choi · 07 Oct 2021 · (32 / 15 / 0)
- How BPE Affects Memorization in Transformers · Eugene Kharitonov, Marco Baroni, Dieuwke Hupkes · 06 Oct 2021 · (163 / 32 / 0)
- A Survey of Knowledge Enhanced Pre-trained Models · Jian Yang, Xinyu Hu, Gang Xiao, Yulong Shen · KELM · 01 Oct 2021 · (30 / 5 / 0)
- BERT4GCN: Using BERT Intermediate Layers to Augment GCN for Aspect-based Sentiment Classification · Zeguan Xiao, Jiarun Wu, Qingliang Chen, Congjian Deng · 01 Oct 2021 · (19 / 71 / 0)
- Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations · Ekaterina Taktasheva, Vladislav Mikhailov, Ekaterina Artemova · 28 Sep 2021 · (19 / 13 / 0)
- Understanding and Overcoming the Challenges of Efficient Transformer Quantization · Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort · MQ · 27 Sep 2021 · (12 / 133 / 0)
- On the Prunability of Attention Heads in Multilingual BERT · Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra · 26 Sep 2021 · (42 / 4 / 0)
- RuleBERT: Teaching Soft Rules to Pre-trained Language Models · Mohammed Saeed, N. Ahmadi, Preslav Nakov, Paolo Papotti · LRM · 24 Sep 2021 · (247 / 31 / 0)
- FCM: A Fine-grained Comparison Model for Multi-turn Dialogue Reasoning · Xu Wang, Hainan Zhang, Shuai Zhao, Yanyan Zou, Hongshen Chen, Zhuoye Ding, Bo Cheng, Yanyan Lan · AAML · 22 Sep 2021 · (13 / 7 / 0)
- What BERT Based Language Models Learn in Spoken Transcripts: An Empirical Study · Ayush Kumar, Mukuntha Narayanan Sundararaman, Jithendra Vepa · 19 Sep 2021 · (17 / 10 / 0)
- Distilling Linguistic Context for Language Model Compression · Geondo Park, Gyeongman Kim, Eunho Yang · 17 Sep 2021 · (45 / 38 / 0)
- What Vision-Language Models 'See' when they See Scenes · Michele Cafagna, Kees van Deemter, Albert Gatt · VLM · 15 Sep 2021 · (29 / 13 / 0)
- A Relation-Oriented Clustering Method for Open Relation Extraction · Jun Zhao, Tao Gui, Qi Zhang, Yaqian Zhou · 15 Sep 2021 · (39 / 33 / 0)
- Incorporating Residual and Normalization Layers into Analysis of Masked Language Models · Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui · 15 Sep 2021 · (160 / 46 / 0)
- Can Edge Probing Tasks Reveal Linguistic Knowledge in QA Models? · Sagnik Ray Choudhury, Nikita Bhutani, Isabelle Augenstein · 15 Sep 2021 · (19 / 1 / 0)
- The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders · Han He, Jinho D. Choi · 14 Sep 2021 · (43 / 87 / 0)