1906.04341
Cited By
What Does BERT Look At? An Analysis of BERT's Attention
11 June 2019
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
Papers citing
"What Does BERT Look At? An Analysis of BERT's Attention"
50 / 883 papers shown
On the Importance of Clinical Notes in Multi-modal Learning for EHR Data
Severin Husmann
Hugo Yèche
Gunnar Rätsch
Rita Kuznetsova
HAI
14
10
0
06 Dec 2022
Syntactic Substitutability as Unsupervised Dependency Syntax
Jasper Jian
Siva Reddy
19
3
0
29 Nov 2022
Explanation on Pretraining Bias of Finetuned Vision Transformer
Bumjin Park
Jaesik Choi
ViT
29
1
0
18 Nov 2022
Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations
Linlin Liu
Xingxuan Li
Megh Thakkar
Xin Li
Shafiq R. Joty
Luo Si
Lidong Bing
27
2
0
16 Nov 2022
Introducing Semantics into Speech Encoders
Derek Xu
Shuyan Dong
Changhan Wang
Suyoun Kim
Zhaojiang Lin
...
Alexei Baevski
Guan-Ting Lin
Hung-yi Lee
Yizhou Sun
Wei Wang
SSL
28
3
0
15 Nov 2022
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM
MedIm
28
3
0
14 Nov 2022
Finding Skill Neurons in Pre-trained Transformer-based Language Models
Xiaozhi Wang
Kaiyue Wen
Zhengyan Zhang
Lei Hou
Zhiyuan Liu
Juanzi Li
MILM
MoE
21
50
0
14 Nov 2022
Demystify Self-Attention in Vision Transformers from a Semantic Perspective: Analysis and Application
Leijie Wu
Song Guo
Yaohong Ding
Junxiao Wang
Wenchao Xu
Richard Yi Da Xu
Jiewei Zhang
28
2
0
13 Nov 2022
FPT: Improving Prompt Tuning Efficiency via Progressive Training
Yufei Huang
Yujia Qin
Huadong Wang
Yichun Yin
Maosong Sun
Zhiyuan Liu
Qun Liu
VLM
LRM
27
6
0
13 Nov 2022
The Architectural Bottleneck Principle
Tiago Pimentel
Josef Valvoda
Niklas Stoehr
Ryan Cotterell
25
5
0
11 Nov 2022
Improving word mover's distance by leveraging self-attention matrix
Hiroaki Yamagiwa
Sho Yokoi
Hidetoshi Shimodaira
OT
24
4
0
11 Nov 2022
Understanding Cross-modal Interactions in V&L Models that Generate Scene Descriptions
Michele Cafagna
Kees van Deemter
Albert Gatt
CoGe
10
3
0
09 Nov 2022
miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings
T. Klein
Moin Nabi
SSL
20
15
0
09 Nov 2022
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
Michael Hassid
Hao Peng
Daniel Rotem
Jungo Kasai
Ivan Montero
Noah A. Smith
Roy Schwartz
29
24
0
07 Nov 2022
MPCFormer: fast, performant and private Transformer inference with MPC
Dacheng Li
Rulin Shao
Hongyi Wang
Han Guo
Eric P. Xing
Haotong Zhang
13
79
0
02 Nov 2022
Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks
Rochelle Choenni
Dan Garrette
Ekaterina Shutova
24
2
0
31 Oct 2022
Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost
Sungjun Cho
Seonwoo Min
Jinwoo Kim
Moontae Lee
Honglak Lee
Seunghoon Hong
30
3
0
27 Oct 2022
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models
Bowen Shen
Zheng Lin
Yuanxin Liu
Zhengxiao Liu
Lei Wang
Weiping Wang
VLM
33
4
0
27 Oct 2022
Benchmarking Language Models for Code Syntax Understanding
Da Shen
Xinyun Chen
Chenguang Wang
Koushik Sen
Dawn Song
ELM
22
16
0
26 Oct 2022
Influence Functions for Sequence Tagging Models
Sarthak Jain
Varun Manjunatha
Byron C. Wallace
A. Nenkova
TDI
27
8
0
25 Oct 2022
IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models
Chenguang Wang
Xiao Liu
Dawn Song
VLM
16
2
0
25 Oct 2022
Exploring Self-Attention for Crop-type Classification Explainability
Ivica Obadic
R. Roscher
Dario Augusto Borges Oliveira
Xiao Xiang Zhu
22
7
0
24 Oct 2022
A BERT-based Deep Learning Approach for Reputation Analysis in Social Media
Mohammad Wali Ur Rahman
Sicong Shao
Pratik Satam
Salim Hariri
Chris Padilla
Zoe Taylor
C. Nevarez
15
5
0
23 Oct 2022
SLING: Sino Linguistic Evaluation of Large Language Models
Yixiao Song
Kalpesh Krishna
R. Bhatt
Mohit Iyyer
13
8
0
21 Oct 2022
Enhancing Out-of-Distribution Detection in Natural Language Understanding via Implicit Layer Ensemble
Hyunsoo Cho
Choonghyun Park
Jaewoo Kang
Kang Min Yoo
Taeuk Kim
Sang-goo Lee
OODD
22
8
0
20 Oct 2022
Automatic Document Selection for Efficient Encoder Pretraining
Yukun Feng
Patrick Xia
Benjamin Van Durme
João Sedoc
44
7
0
20 Oct 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
34
155
0
19 Oct 2022
Hierarchical Multi-Interest Co-Network For Coarse-Grained Ranking
Xu Yuan
Chengjun Xu
Qiwei Chen
Tao Zhuang
Hongjie Chen
C. Li
Junfeng Ge
AI4TS
22
0
0
19 Oct 2022
Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling
Kalpa Gunaratna
Vijay Srinivasan
Akhila Yerukola
Hongxia Jin
21
6
0
19 Oct 2022
A Simple and Effective Method to Improve Zero-Shot Cross-Lingual Transfer Learning
Kunbo Ding
Weijie Liu
Yuejian Fang
Weiquan Mao
Zhe Zhao
Tao Zhu
Haoyan Liu
Rong Tian
Yiren Chen
30
8
0
18 Oct 2022
Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion
Jian Song
Di Liang
Rumei Li
Yun Li
Sirui Wang
Minlong Peng
Wei Yu Wu
Yongxin Yu
27
12
0
16 Oct 2022
RecipeMind: Guiding Ingredient Choices from Food Pairing to Recipe Completion using Cascaded Set Transformer
Mogan Gim
Donghee Choi
Kana Maruyama
Jihun Choi
Hajung Kim
Donghyeon Park
Jaewoo Kang
38
5
0
14 Oct 2022
LSG Attention: Extrapolation of pretrained Transformers to long sequences
Charles Condevaux
S. Harispe
30
24
0
13 Oct 2022
On the Explainability of Natural Language Processing Deep Models
Julia El Zini
M. Awad
25
82
0
13 Oct 2022
AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning
Tao Yang
Jinghao Deng
Xiaojun Quan
Qifan Wang
Shaoliang Nie
28
3
0
12 Oct 2022
Shapley Head Pruning: Identifying and Removing Interference in Multilingual Transformers
William B. Held
Diyi Yang
VLM
32
5
0
11 Oct 2022
Towards Structure-aware Paraphrase Identification with Phrase Alignment Using Sentence Encoders
Qiwei Peng
David J. Weir
Julie Weeds
18
3
0
11 Oct 2022
Characterization of anomalous diffusion through convolutional transformers
Nicolás Firbas
Òscar Garibo i Orts
M. Garcia-March
J. A. Conejero
18
18
0
10 Oct 2022
What the DAAM: Interpreting Stable Diffusion Using Cross Attention
Raphael Tang
Linqing Liu
Akshat Pandey
Zhiying Jiang
Gefei Yang
K. Kumar
Pontus Stenetorp
Jimmy J. Lin
Ferhan Ture
26
167
0
10 Oct 2022
Metaphorical Paraphrase Generation: Feeding Metaphorical Language Models with Literal Texts
Giorgio Ottolina
John Pavlopoulos
18
1
0
10 Oct 2022
Parameter-Efficient Tuning with Special Token Adaptation
Xiaocong Yang
James Y. Huang
Wenxuan Zhou
Muhao Chen
26
12
0
10 Oct 2022
Better Pre-Training by Reducing Representation Confusion
Haojie Zhang
Mingfei Liang
Ruobing Xie
Zhen Sun
Bo Zhang
Leyu Lin
19
2
0
09 Oct 2022
Breaking BERT: Evaluating and Optimizing Sparsified Attention
Siddhartha Brahma
Polina Zablotskaia
David M. Mimno
25
1
0
07 Oct 2022
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure
Nuo Chen
Qiushi Sun
Renyu Zhu
Xiang Li
Xuesong Lu
Ming Gao
36
10
0
07 Oct 2022
Every word counts: A multilingual analysis of individual human alignment with model attention
Stephanie Brandl
Nora Hollenstein
32
11
0
05 Oct 2022
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing
L. Nie
Jiu Sun
Yanlin Wang
Lun Du
Lei Hou
Juanzi Li
Shi Han
Dongmei Zhang
Jidong Zhai
29
6
0
04 Oct 2022
Causal Proxy Models for Concept-Based Model Explanations
Zhengxuan Wu
Karel D'Oosterlinck
Atticus Geiger
Amir Zur
Christopher Potts
MILM
75
35
0
28 Sep 2022
Formal Conceptual Views in Neural Networks
Johannes Hirth
Tom Hanika
13
2
0
27 Sep 2022
Towards Faithful Model Explanation in NLP: A Survey
Qing Lyu
Marianna Apidianaki
Chris Callison-Burch
XAI
109
107
0
22 Sep 2022
SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog Understanding and Generation
Wanwei He
Yinpei Dai
Min Yang
Jian Sun
Fei Huang
Luo Si
Yongbin Li
17
60
0
14 Sep 2022