What Does BERT Look At? An Analysis of BERT's Attention

11 June 2019

Kevin Clark

Urvashi Khandelwal

Omer Levy

Christopher D. Manning

MILM

Papers citing "What Does BERT Look At? An Analysis of BERT's Attention"

33 / 883 papers shown

BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance R. Thomas McCoy Junghyun Min Tal Linzen 16 147 0 07 Nov 2019
How Can BERT Help Lexical Semantics Tasks? Yile Wang Leyang Cui Yue Zhang SSeg 11 11 0 07 Nov 2019
ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations Shizhe Diao Jiaxin Bai Yan Song Tong Zhang Yonggang Wang AI4CE 17 134 0 02 Nov 2019
Inducing brain-relevant bias in natural language processing models Dan Schwartz Mariya Toneva Leila Wehbe 8 78 0 29 Oct 2019
What does BERT Learn from Multiple-Choice Reading Comprehension Datasets? Chenglei Si Shuohang Wang Min-Yen Kan Jing Jiang 29 53 0 28 Oct 2019
HUBERT Untangles BERT to Improve Transfer across NLP Tasks M. Moradshahi Hamid Palangi M. Lam P. Smolensky Jianfeng Gao 21 16 0 25 Oct 2019
Fine-grained Fact Verification with Kernel Graph Attention Network Zhenghao Liu Chenyan Xiong Maosong Sun Zhiyuan Liu 32 219 0 22 Oct 2019
A Neural Entity Coreference Resolution Review Nikolaos Stylianou I. Vlahavas 8 38 0 21 Oct 2019
Whatcha lookin' at? DeepLIFTing BERT's Attention in Question Answering Ekaterina Arkhangelskaia Sourav Dutta AIMat 9 9 0 14 Oct 2019
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models Benjamin Hoover Hendrik Strobelt Sebastian Gehrmann 19 86 0 11 Oct 2019
Is Multilingual BERT Fluent in Language Generation? Samuel Rönnqvist Jenna Kanerva T. Salakoski Filip Ginter 11 71 0 09 Oct 2019
Knowledge Distillation from Internal Representations Gustavo Aguilar Yuan Ling Yu Zhang Benjamin Yao Xing Fan Edward Guo 17 177 0 08 Oct 2019
Analyzing Sentence Fusion in Abstractive Summarization Logan Lebanoff John Muchovej Franck Dernoncourt Doo Soon Kim Seokhwan Kim W. Chang Fei Liu 13 42 0 01 Oct 2019
Interrogating the Explanatory Power of Attention in Neural Machine Translation Pooya Moradi Nishant Kambhatla Anoop Sarkar 13 16 0 30 Sep 2019
Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture Ashok Thillaisundaram Theodosia Togia 8 17 0 26 Sep 2019
Multi-Dimensional Explanation of Target Variables from Documents Diego Antognini C. Musat Boi Faltings 16 2 0 25 Sep 2019
Attention Interpretability Across NLP Tasks Shikhar Vashishth Shyam Upadhyay Gaurav Singh Tomar Manaal Faruqui XAI MILM 23 176 0 24 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding Xiaoqi Jiao Yichun Yin Lifeng Shang Xin Jiang Xiao Chen Linlin Li F. Wang Qun Liu VLM 11 1,813 0 23 Sep 2019
Language models and Automated Essay Scoring Pedro Uría Rodríguez Amir Jafari C. Ormerod 22 82 0 18 Sep 2019
SANVis: Visual Analytics for Understanding Self-Attention Networks Cheonbok Park Inyoup Na Yongjang Jo Sungbok Shin J. Yoo Bum Chul Kwon Jian Zhao Hyungjong Noh Yeonsoo Lee Jaegul Choo HAI 27 38 0 13 Sep 2019
Semantics-aware BERT for Language Understanding Zhuosheng Zhang Yuwei Wu Zhao Hai Z. Li Shuailiang Zhang Xi Zhou Xiang Zhou 19 363 0 05 Sep 2019
Rotate King to get Queen: Word Relationships as Orthogonal Transformations in Embedding Space Kawin Ethayarajh LLMSV 13 13 0 02 Sep 2019
Does BERT agree? Evaluating knowledge of structure dependence through agreement relations Geoff Bacon T. Regier 11 21 0 26 Aug 2019
SG-Net: Syntax-Guided Machine Reading Comprehension Zhuosheng Zhang Yuwei Wu Junru Zhou Sufeng Duan Hai Zhao Rui Wang 25 187 0 14 Aug 2019
Neural Machine Translation with Noisy Lexical Constraints Huayang Li Guoping Huang Deng Cai Lemao Liu 12 12 0 13 Aug 2019
On Identifiability in Transformers Gino Brunner Yang Liu Damian Pascual Oliver Richter Massimiliano Ciaramita Roger Wattenhofer ViT 19 186 0 12 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language Liunian Harold Li Mark Yatskar Da Yin Cho-Jui Hsieh Kai-Wei Chang VLM 35 1,912 0 09 Aug 2019
What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models Allyson Ettinger 8 594 0 31 Jul 2019
Theoretical Limitations of Self-Attention in Neural Sequence Models Michael Hahn 11 259 0 16 Jun 2019
An Attentive Survey of Attention Models S. Chaudhari Varun Mithal Gungor Polatkan R. Ramanath 19 638 0 05 Apr 2019
Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning Weipéng Huáng Xingyi Cheng Kunlong Chen Taifeng Wang Wei Chu 6 60 0 11 Mar 2019
Attention in Natural Language Processing Andrea Galassi Marco Lippi Paolo Torroni GNN 15 467 0 04 Feb 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties Alexis Conneau Germán Kruszewski Guillaume Lample Loïc Barrault Marco Baroni 199 882 0 03 May 2018