1906.05714
A Multiscale Visualization of Attention in the Transformer Model
12 June 2019
Jesse Vig
ViT
Papers citing
"A Multiscale Visualization of Attention in the Transformer Model"
50 / 82 papers shown
Title
What's Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models
Jan Kapar
Niklas Koenen
Martin Jullum
64
0
0
29 Apr 2025
Discovering Influential Neuron Path in Vision Transformers
Yifan Wang
Yifei Liu
Yingdong Shi
Chong Li
Anqi Pang
Sibei Yang
Jingyi Yu
Kan Ren
ViT
69
0
0
12 Mar 2025
Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention
Zhenyu Guo
Wenguang Chen
46
0
0
01 Jan 2025
On the Role of Attention Heads in Large Language Model Safety
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Kun Wang
Yang Liu
Fan Zhang
Yongbin Li
59
5
0
17 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
40
10
0
27 Jul 2024
Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
Chengzhengxu Li
Xiaoming Liu
Zhaohan Zhang
Yichen Wang
Chen Liu
Y. Lan
Chao Shen
57
2
0
15 Jun 2024
Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics
Fangru Lin
Daniel Altshuler
J. Pierrehumbert
38
1
0
04 Apr 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
Carlo Nicolini
Jacopo Staiano
Bruno Lepri
Raffaele Marino
MoE
26
1
0
13 Mar 2024
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis
Shashank Kotyan
Tatsuya Ueda
Danilo Vasconcellos Vargas
27
1
0
07 Dec 2023
On the Importance of Step-wise Embeddings for Heterogeneous Clinical Time-Series
Rita Kuznetsova
Alizée Pace
Manuel Burger
Hugo Yèche
Gunnar Rätsch
AI4TS
34
5
0
15 Nov 2023
AMPLIFY: Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer
Leixin Yang
Yu Xiang
23
0
0
22 Sep 2023
Nebula: Self-Attention for Dynamic Malware Analysis
Dmitrijs Trizna
Luca Demetrio
Battista Biggio
Fabio Roli
24
13
0
19 Sep 2023
Attention Visualizer Package: Revealing Word Importance for Deeper Insight into Encoder-Only Transformer Models
A. A. Falaki
R. Gras
ViT
26
7
0
28 Aug 2023
Zero-Shot Text Classification via Self-Supervised Tuning
Chaoqun Liu
Wenxuan Zhang
Guizhen Chen
Xiaobao Wu
A. Luu
Chip Hong Chang
Lidong Bing
VLM
37
11
0
19 May 2023
A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification
Yunlong Feng
Bohan Li
Libo Qin
Xiao Xu
Wanxiang Che
6
3
0
18 Apr 2023
UKP-SQuARE v3: A Platform for Multi-Agent QA Research
Haritz Puerto
Tim Baumgärtner
Rachneet Sachdeva
Haishuo Fang
Haotian Zhang
Sewin Tariverdian
Kexin Wang
Iryna Gurevych
26
2
0
31 Mar 2023
Evaluating self-attention interpretability through human-grounded experimental protocol
Milan Bhan
Nina Achache
Victor Legrand
A. Blangero
N. Chesneau
26
9
0
27 Mar 2023
How Does Attention Work in Vision Transformers? A Visual Analytics Attempt
Yiran Li
Junpeng Wang
Xin Dai
Liang Wang
Chin-Chia Michael Yeh
Yan Zheng
Wei Zhang
Kwan-Liu Ma
ViT
20
23
0
24 Mar 2023
SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings
Jan Engler
Sandipan Sikdar
Marlene Lutz
M. Strohmaier
24
7
0
11 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference
R. Brath
Daniel A. Keim
Johannes Knittel
Shimei Pan
Pia Sommerauer
Hendrik Strobelt
19
11
0
11 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
18
25
0
05 Jan 2023
Black-box language model explanation by context length probing
Ondřej Cífka
Antoine Liutkus
MILM
LRM
16
6
0
30 Dec 2022
PCRED: Zero-shot Relation Triplet Extraction with Potential Candidate Relation Selection and Entity Boundary Detection
Yuquan Lan
Dongxu Li
Yunqi Zhang
Hui Zhao
Gang Zhao
27
4
0
26 Nov 2022
Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT
Siyuan Lu
Chenchen Zhou
Keli Xie
Jun Lin
Zhongfeng Wang
14
1
0
16 Nov 2022
Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations
A. Bhat
Ashutosh Modi
27
9
0
07 Nov 2022
MOFormer: Self-Supervised Transformer model for Metal-Organic Framework Property Prediction
Zhonglin Cao
Rishikesh Magar
Yuyang Wang
A. Farimani
AI4CE
23
88
0
25 Oct 2022
Hierarchical Multi-Interest Co-Network For Coarse-Grained Ranking
Xu Yuan
Chengjun Xu
Qiwei Chen
Tao Zhuang
Hongjie Chen
Chong Li
Junfeng Ge
AI4TS
25
0
0
19 Oct 2022
Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling
Kalpa Gunaratna
Vijay Srinivasan
Akhila Yerukola
Hongxia Jin
23
6
0
19 Oct 2022
A Transformer-based deep neural network model for SSVEP classification
Jianbo Chen
Yangsong Zhang
Yudong Pan
Peng Xu
Cuntai Guan
22
50
0
09 Oct 2022
polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics
Christopher Kuenneth
R. Ramprasad
26
101
0
29 Sep 2022
Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts
Timo Spinde
Manuel Plank
Jan-David Krieger
Terry Ruas
Bela Gipp
Akiko Aizawa
27
67
0
29 Sep 2022
Visual Comparison of Language Model Adaptation
Rita Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
41
16
0
17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
Hendrik Strobelt
Albert Webson
Victor Sanh
Benjamin Hoover
Johanna Beyer
Hanspeter Pfister
Alexander M. Rush
VLM
30
135
0
16 Aug 2022
Fine-Tuning BERT for Automatic ADME Semantic Labeling in FDA Drug Labeling to Enhance Product-Specific Guidance Assessment
Yiwen Shi
Jing Wang
Ping Ren
Taha ValizadehAslani
Yi Zhang
Meng Hu
Hualou Liang
AI4MH
AAML
22
16
0
25 Jul 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
47
6
0
09 Jul 2022
Astroconformer: Inferring Surface Gravity of Stars from Stellar Light Curves with Transformer
Jiashu Pan
Y. Ting 丁
Jie Yu
11
3
0
06 Jul 2022
Attention Flows for General Transformers
Niklas Metzger
Christopher Hahn
Julian Siber
Frederik Schmitt
Bernd Finkbeiner
34
0
0
30 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
20
8
0
08 May 2022
An Exploratory Study on Code Attention in BERT
Rishab Sharma
Fuxiang Chen
Fatemeh H. Fard
David Lo
19
25
0
05 Apr 2022
Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary
N. Chatterjee
S. K. Saha
FAtt
34
10
0
31 Mar 2022
Scientometric Review of Artificial Intelligence for Operations & Maintenance of Wind Turbines: The Past, Present and Future
Joyjit Chatterjee
Nina Dethlefs
26
83
0
30 Mar 2022
GRS: Combining Generation and Revision in Unsupervised Sentence Simplification
Mohammad Dehghan
Dhruv Kumar
Lukasz Golab
23
12
0
18 Mar 2022
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark
Yunhe Gao
Mu Zhou
Ding Liu
Zhennan Yan
Shaoting Zhang
Dimitris N. Metaxas
ViT
MedIm
20
68
0
28 Feb 2022
Do Transformers know symbolic rules, and would we know if they did?
Tommi Gröndahl
Yu-Wen Guo
Nirmal Asokan
25
0
0
19 Feb 2022
Punctuation restoration in Swedish through fine-tuned KB-BERT
J. Nilsson
13
0
0
14 Feb 2022
Pre-Trained Language Models for Interactive Decision-Making
Shuang Li
Xavier Puig
Chris Paxton
Yilun Du
Clinton Jia Wang
...
Anima Anandkumar
Jacob Andreas
Igor Mordatch
Antonio Torralba
Yuke Zhu
LM&Ro
34
246
0
03 Feb 2022
A Survey on Gender Bias in Natural Language Processing
Karolina Stańczak
Isabelle Augenstein
30
109
0
28 Dec 2021
Is "My Favorite New Movie" My Favorite Movie? Probing the Understanding of Recursive Noun Phrases
Qing Lyu
Hua Zheng
Daoxin Li
Li Zhang
Marianna Apidianaki
Chris Callison-Burch
24
4
0
15 Dec 2021
Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models
Jaromír Šavelka
Kevin D. Ashley
ELM
AILaw
29
10
0
14 Dec 2021
LMdiff: A Visual Diff Tool to Compare Language Models
Hendrik Strobelt
Benjamin Hoover
Arvind Satyanarayan
Sebastian Gehrmann
VLM
29
19
0
02 Nov 2021