Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Yaru Hao, Li Dong, Furu Wei, Ke Xu
arXiv:2004.11207 · 23 April 2020
Papers citing "Self-Attention Attribution: Interpreting Information Interactions Inside Transformer" (20 of 20 shown)
| Title | Authors | Topics | Citations | Date |
|---|---|---|---|---|
| Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers | Tobias Leemann, Alina Fastowski, Felix Pfeiffer, Gjergji Kasneci | | 4 | 10 Jan 2025 |
| Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models | Yanwen Huang, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao | | 0 | 02 Jan 2025 |
| Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks | Samuele Poppi, Zheng-Xin Yong, Yifei He, Bobbie Chern, Han Zhao, Aobo Yang, Jianfeng Chi | AAML | 12 | 23 Oct 2024 |
| Interpreting and Exploiting Functional Specialization in Multi-Head Attention under Multi-task Learning | Chong Li, Shaonan Wang, Yunhao Zhang, Jiajun Zhang, Chengqing Zong | | 4 | 16 Oct 2023 |
| Concise and Organized Perception Facilitates Reasoning in Large Language Models | Junjie Liu, Shaotian Yan, Chen Shen, Zhengdong Xiao, Wenxiao Wang, Jieping Ye | LRM | 1 | 05 Oct 2023 |
| Interpretability-Aware Vision Transformer | Yao Qiang, Chengyin Li, Prashant Khanduri, D. Zhu | ViT | 7 | 14 Sep 2023 |
| Instruction Position Matters in Sequence Generation with Large Language Models | Yanjun Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou | LRM | 8 | 23 Aug 2023 |
| Causal Intersectionality and Dual Form of Gradient Descent for Multimodal Analysis: a Case Study on Hateful Memes | Yosuke Miyanishi, M. Nguyen | | 2 | 19 Aug 2023 |
| LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias | Mario Almagro, Emilio Almazán, Diego Ortego, David Jiménez | | 3 | 06 Jul 2023 |
| Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers | Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurélien Lucchi, Thomas Hofmann | | 53 | 25 May 2023 |
| Interpretability in Activation Space Analysis of Transformers: A Focused Survey | Soniya Vijayakumar | AI4CE | 3 | 22 Jan 2023 |
| Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling | Kalpa Gunaratna, Vijay Srinivasan, Akhila Yerukola, Hongxia Jin | | 6 | 19 Oct 2022 |
| AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning | Tao Yang, Jinghao Deng, Xiaojun Quan, Qifan Wang, Shaoliang Nie | | 3 | 12 Oct 2022 |
| What does Transformer learn about source code? | Kechi Zhang, Ge Li, Zhi Jin | ViT | 8 | 18 Jul 2022 |
| EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm | Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao | ViT | 32 | 19 Jun 2022 |
| Kformer: Knowledge Injection in Transformer Feed-Forward Layers | Yunzhi Yao, Shaohan Huang, Li Dong, Furu Wei, Huajun Chen, Ningyu Zhang | KELM, MedIm | 42 | 15 Jan 2022 |
| Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs | Philipp Benz, Soomin Ham, Chaoning Zhang, Adil Karjauv, In So Kweon | AAML, ViT | 78 | 06 Oct 2021 |
| Attributing Fair Decisions with Attention Interventions | Ninareh Mehrabi, Umang Gupta, Fred Morstatter, Greg Ver Steeg, Aram Galstyan | | 21 | 08 Sep 2021 |
| On the Robustness of Vision Transformers to Adversarial Examples | Kaleel Mahmood, Rigel Mahmood, Marten van Dijk | ViT | 217 | 31 Mar 2021 |
| GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 6,943 | 20 Apr 2018 |