A Multiscale Visualization of Attention in the Transformer Model

12 June 2019

Papers citing "A Multiscale Visualization of Attention in the Transformer Model"

50 / 83 papers shown

What's Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models Jan Kapar Niklas Koenen Martin Jullum 64 0 0 29 Apr 2025
Discovering Influential Neuron Path in Vision Transformers Yifan Wang Yifei Liu Yingdong Shi Chong Li Anqi Pang Sibei Yang Jingyi Yu Kan Ren ViT 69 0 0 12 Mar 2025
Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention Zhenyu Guo Wenguang Chen 46 0 0 01 Jan 2025
On the Role of Attention Heads in Large Language Model Safety Zhenhong Zhou Haiyang Yu Xinghua Zhang Rongwu Xu Fei Huang Kun Wang Yang Liu Fan Zhang Yongbin Li 59 5 0 17 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs Nitay Calderon Roi Reichart 40 10 0 27 Jul 2024
Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models Chengzhengxu Li Xiaoming Liu Zhaohan Zhang Yichen Wang Chen Liu Y. Lan Chao Shen 57 2 0 15 Jun 2024
Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics Fangru Lin Daniel Altshuler J. Pierrehumbert 38 1 0 04 Apr 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models Carlo Nicolini Jacopo Staiano Bruno Lepri Raffaele Marino MoE 26 1 0 13 Mar 2024
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis Shashank Kotyan Tatsuya Ueda Danilo Vasconcellos Vargas 27 1 0 07 Dec 2023
On the Importance of Step-wise Embeddings for Heterogeneous Clinical Time-Series Rita Kuznetsova Alizée Pace Manuel Burger Hugo Yèche Gunnar Rätsch AI4TS 36 5 0 15 Nov 2023
AMPLIFY:Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer Leixin Yang Yu Xiang 23 0 0 22 Sep 2023
Nebula: Self-Attention for Dynamic Malware Analysis Dmitrijs Trizna Luca Demetrio Battista Biggio Fabio Roli 24 13 0 19 Sep 2023
Attention Visualizer Package: Revealing Word Importance for Deeper Insight into Encoder-Only Transformer Models A. A. Falaki R. Gras ViT 26 7 0 28 Aug 2023
Zero-Shot Text Classification via Self-Supervised Tuning Chaoqun Liu Wenxuan Zhang Guizhen Chen Xiaobao Wu A. Luu Chip Hong Chang Lidong Bing VLM 37 11 0 19 May 2023
A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification Yunlong Feng Bohan Li Libo Qin Xiao Xu Wanxiang Che 6 3 0 18 Apr 2023
UKP-SQuARE v3: A Platform for Multi-Agent QA Research Haritz Puerto Tim Baumgärtner Rachneet Sachdeva Haishuo Fang Haotian Zhang Sewin Tariverdian Kexin Wang Iryna Gurevych 26 2 0 31 Mar 2023
Evaluating self-attention interpretability through human-grounded experimental protocol Milan Bhan Nina Achache Victor Legrand A. Blangero N. Chesneau 26 9 0 27 Mar 2023
How Does Attention Work in Vision Transformers? A Visual Analytics Attempt Yiran Li Junpeng Wang Xin Dai Liang Wang Chin-Chia Michael Yeh Yan Zheng Wei Zhang Kwan-Liu Ma ViT 20 23 0 24 Mar 2023
SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings Jan Engler Sandipan Sikdar Marlene Lutz M. Strohmaier 26 7 0 11 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference R. Brath Daniel A. Keim Johannes Knittel Shimei Pan Pia Sommerauer Hendrik Strobelt 19 11 0 11 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention Shashanka Venkataramanan Amir Ghodrati Yuki M. Asano Fatih Porikli A. Habibian ViT 18 25 0 05 Jan 2023
Black-box language model explanation by context length probing Ondřej Cífka Antoine Liutkus MILM LRM 16 6 0 30 Dec 2022
PCRED: Zero-shot Relation Triplet Extraction with Potential Candidate Relation Selection and Entity Boundary Detection Yuquan Lan Dongxu Li Yunqi Zhang Hui Zhao Gang Zhao 27 4 0 26 Nov 2022
Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT Siyuan Lu Chenchen Zhou Keli Xie Jun Lin Zhongfeng Wang 14 1 0 16 Nov 2022
Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations A. Bhat Ashutosh Modi 27 9 0 07 Nov 2022
MOFormer: Self-Supervised Transformer model for Metal-Organic Framework Property Prediction Zhonglin Cao Rishikesh Magar Yuyang Wang A. Farimani AI4CE 23 88 0 25 Oct 2022
Hierarchical Multi-Interest Co-Network For Coarse-Grained Ranking Xu Yuan Chengjun Xu Qiwei Chen Tao Zhuang Hongjie Chen Chong Li Junfeng Ge AI4TS 25 0 0 19 Oct 2022
Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling Kalpa Gunaratna Vijay Srinivasan Akhila Yerukola Hongxia Jin 23 6 0 19 Oct 2022
A Transformer-based deep neural network model for SSVEP classification Jianbo Chen Yangsong Zhang Yudong Pan Peng Xu Cuntai Guan 22 50 0 09 Oct 2022
polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics Christopher Kuenneth R. Ramprasad 26 101 0 29 Sep 2022
Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts Timo Spinde Manuel Plank Jan-David Krieger Terry Ruas Bela Gipp Akiko Aizawa 27 67 0 29 Sep 2022
Visual Comparison of Language Model Adaptation Rita Sevastjanova E. Cakmak Shauli Ravfogel Ryan Cotterell Mennatallah El-Assady VLM 41 16 0 17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models Hendrik Strobelt Albert Webson Victor Sanh Benjamin Hoover Johanna Beyer Hanspeter Pfister Alexander M. Rush VLM 30 135 0 16 Aug 2022
Fine-Tuning BERT for Automatic ADME Semantic Labeling in FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Yiwen Shi Jing Wang Ping Ren Taha ValizadehAslani Yi Zhang Meng Hu Hualou Liang AI4MH AAML 22 16 0 25 Jul 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval Wenqiao Zhang Jiannan Guo Meng Li Haochen Shi Shengyu Zhang Juncheng Li Siliang Tang Yueting Zhuang 49 6 0 09 Jul 2022
Astroconformer: Inferring Surface Gravity of Stars from Stellar Light Curves with Transformer Jiashu Pan Y. Ting 丁 Jie Yu 13 3 0 06 Jul 2022
Attention Flows for General Transformers Niklas Metzger Christopher Hahn Julian Siber Frederik Schmitt Bernd Finkbeiner 34 0 0 30 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information Chiyu Feng Po-Chun Hsu Hung-yi Lee SSL 22 8 0 08 May 2022
An Exploratory Study on Code Attention in BERT Rishab Sharma Fuxiang Chen Fatemeh H. Fard David Lo 19 25 0 05 Apr 2022
Interpretation of Black Box NLP Models: A Survey Shivani Choudhary N. Chatterjee S. K. Saha FAtt 34 10 0 31 Mar 2022
Scientometric Review of Artificial Intelligence for Operations & Maintenance of Wind Turbines: The Past, Present and Future Joyjit Chatterjee Nina Dethlefs 26 83 0 30 Mar 2022
GRS: Combining Generation and Revision in Unsupervised Sentence Simplification Mohammad Dehghan Dhruv Kumar Lukasz Golab 23 12 0 18 Mar 2022
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark Yunhe Gao Mu Zhou Ding Liu Zhennan Yan Shaoting Zhang Dimitris N. Metaxas ViT MedIm 20 68 0 28 Feb 2022
Do Transformers know symbolic rules, and would we know if they did? Tommi Gröndahl Yu-Wen Guo Nirmal Asokan 25 0 0 19 Feb 2022
Punctuation restoration in Swedish through fine-tuned KB-BERT J. Nilsson 13 0 0 14 Feb 2022
Pre-Trained Language Models for Interactive Decision-Making Shuang Li Xavier Puig Chris Paxton Yilun Du Clinton Jia Wang ... Anima Anandkumar Jacob Andreas Igor Mordatch Antonio Torralba Yuke Zhu LM&Ro 34 246 0 03 Feb 2022
A Survey on Gender Bias in Natural Language Processing Karolina Stańczak Isabelle Augenstein 30 109 0 28 Dec 2021
Is "My Favorite New Movie" My Favorite Movie? Probing the Understanding of Recursive Noun Phrases Qing Lyu Hua Zheng Daoxin Li Li Zhang Marianna Apidianaki Chris Callison-Burch 24 4 0 15 Dec 2021
Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models Jaromír Šavelka Kevin D. Ashley ELM AILaw 29 10 0 14 Dec 2021
LMdiff: A Visual Diff Tool to Compare Language Models Hendrik Strobelt Benjamin Hoover Arvind Satyanarayan Sebastian Gehrmann VLM 29 19 0 02 Nov 2021