Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers

29 March 2021

Papers citing "Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers"

50 / 64 papers shown

Title
ABE: A Unified Framework for Robust and Faithful Attribution-Based Explainability Zhiyu Zhu Jiayu Zhang Zhibo Jin Fang Chen Jianlong Zhou FAtt 24 0 0 03 May 2025
Interpretable graph-based models on multimodal biomedical data integration: A technical review and benchmarking Alireza Sadeghi F. Hajati A. Argha Nigel H Lovell Min Yang Hamid Alinejad-Rokny 31 0 0 03 May 2025
Disentangling Visual Transformers: Patch-level Interpretability for Image Classification Guillaume Jeanneret Loïc Simon F. Jurie ViT 44 0 0 24 Feb 2025
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval Guanqi Zhan Yuanpei Liu Kai Han Weidi Xie Andrew Zisserman VLM 150 0 0 21 Feb 2025
Narrowing Information Bottleneck Theory for Multimodal Image-Text Representations Interpretability Zhiyu Zhu Zhibo Jin Jiayu Zhang Nan Yang Jiahao Huang Jianlong Zhou Fang Chen 39 0 0 16 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning Yibo Yan Shen Wang Jiahao Huo Jingheng Ye Zhendong Chu Xuming Hu Philip S. Yu Carla P. Gomes B. Selman Qingsong Wen LRM 121 9 0 05 Feb 2025
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable Shreyash Arya Sukrut Rao Moritz Bohle Bernt Schiele 68 2 0 28 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers Yunshan Zhong Yuyao Zhou Yuxin Zhang Shen Li Yong Li Fei Chao Zhanpeng Zeng Rongrong Ji MQ 94 0 0 31 Dec 2024
Where am I? Cross-View Geo-localization with Natural Language Descriptions Junyan Ye Honglin Lin Leyan Ou Dairong Chen Zihao Wang Conghui He Weijia Li 76 0 0 22 Dec 2024
Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model's Prediction Rationality Qitong Wang Tang Li Kien X. Nguyen Xi Peng 82 0 0 17 Dec 2024
Expanding Event Modality Applications through a Robust CLIP-Based Encoder SungHeon Jeong Hanning Chen Sanggeon Yun Suhyeon Cho Wenjun Huang Xiangjian Liu Mohsen Imani 98 1 0 04 Dec 2024
Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens Zhangqi Jiang Junkai Chen Beier Zhu Tingjin Luo Yankun Shen Xu Yang 100 4 0 23 Nov 2024
Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning John Wu David Wu Jimeng Sun 44 1 0 31 Oct 2024
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations Nick Jiang Anish Kachinthaya Suzie Petryk Yossi Gandelsman VLM 32 14 0 03 Oct 2024
Explainable Artificial Intelligence: A Survey of Needs, Techniques, Applications, and Future Direction Melkamu Mersha Khang Lam Joseph Wood Ali AlShami Jugal Kalita XAI AI4TS 67 28 0 30 Aug 2024
MePT: Multi-Representation Guided Prompt Tuning for Vision-Language Model Xinyang Wang Yi Yang Minfeng Zhu Kecheng Zheng Shi Liu Wei Chen VPVLM MLLM VLM 47 1 0 19 Aug 2024
Visual Agents as Fast and Slow Thinkers Guangyan Sun Mingyu Jin Zhenting Wang Cheng-Long Wang Siqi Ma Qifan Wang Ying Nian Wu Dongfang Liu LLMAG LRM 77 13 0 16 Aug 2024
Human-inspired Explanations for Vision Transformers and Convolutional Neural Networks Mahadev Prasad Panda Matteo Tiezzi Martina Vilas Gemma Roig Bjoern M. Eskofier Dario Zanca ViT AAML 29 1 0 04 Aug 2024
Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning Lu Yu Hesong Li Ying Fu J. Weijer Changsheng Xu CLL 47 1 0 02 Aug 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models Ming-Kuan Wu Xinyue Cai Jiayi Ji Jiale Li Oucheng Huang Gen Luo Hao Fei Xiaoshuai Sun Rongrong Ji MLLM 42 7 0 31 Jul 2024
Inpainting the Gaps: A Novel Framework for Evaluating Explanation Methods in Vision Transformers Lokesh Badisa Sumohana S. Channappayya 40 0 0 17 Jun 2024
Concept-skill Transferability-based Data Selection for Large Vision-Language Models Jaewoo Lee Boyang Li Sung Ju Hwang VLM 35 8 0 16 Jun 2024
MambaLRP: Explaining Selective State Space Sequence Models F. Jafari G. Montavon Klaus-Robert Müller Oliver Eberle Mamba 54 9 0 11 Jun 2024
F-LMM: Grounding Frozen Large Multimodal Models Size Wu Sheng Jin Wenwei Zhang Lumin Xu Wentao Liu Wei Li Chen Change Loy MLLM 73 12 0 09 Jun 2024
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions Sanjay Kariyappa Freddy Lecue Saumitra Mishra Christopher Pond Daniele Magazzeni Manuela Veloso 37 1 0 03 Jun 2024
Explaining Text Similarity in Transformer Models Alexandros Vasileiou Oliver Eberle 43 7 0 10 May 2024
Exposing Text-Image Inconsistency Using Diffusion Models Mingzhen Huang Shan Jia Zhou Zhou Yan Ju Jialing Cai Siwei Lyu 38 7 0 28 Apr 2024
Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection Jiaqi Zhu Shaofeng Cai Fang Deng Junran Wu 50 15 0 15 Apr 2024
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model Musashi Hinck M. L. Olson David Cobbley Shao-Yen Tseng Vasudev Lal VLM 32 10 0 29 Mar 2024
B-Cos Aligned Transformers Learn Human-Interpretable Features Manuel Tran Amal Lahiani Yashin Dicente Cid Melanie Boxberg Peter Lienemann C. Matek S. J. Wagner Fabian J. Theis Eldad Klaiman Tingying Peng MedIm ViT 13 2 0 16 Jan 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability Nir Yellinek Leonid Karlinsky Raja Giryes CoGe VLM 49 4 0 28 Dec 2023
Explainable Multi-Camera 3D Object Detection with Transformer-Based Saliency Maps Till Beemelmanns Wassim Zahr Lutz Eckstein 27 0 0 22 Dec 2023
Inspecting Explainability of Transformer Models with Additional Statistical Information Hoang C. Nguyen Haeil Lee Junmo Kim ViT 18 3 0 19 Nov 2023
Zero-shot Translation of Attention Patterns in VQA Models to Natural Language Leonard Salewski A. Sophia Koepke Hendrik P. A. Lensch Zeynep Akata 29 2 0 08 Nov 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models Yifan Hou Jiaoda Li Yu Fei Alessandro Stolfo Wangchunshu Zhou Guangtao Zeng Antoine Bosselut Mrinmaya Sachan LRM 30 39 0 23 Oct 2023
FLIP: Cross-domain Face Anti-spoofing with Language Guidance K. Srivatsan Muzammal Naseer Karthik Nandakumar CVBM 42 44 0 28 Sep 2023
PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords Yong-Hyeok Lee Namhyun Cho 24 18 0 31 Aug 2023
Bootstrap Fine-Grained Vision-Language Alignment for Unified Zero-Shot Anomaly Localization Hanqiu Deng Zhaoxiang Zhang Jinan Bao Xingyu Li VLM 25 4 0 30 Aug 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion Isha Rawal Alexander Matyasko Shantanu Jaiswal Basura Fernando Cheston Tan 21 1 0 15 Jun 2023
Towards Evaluating Explanations of Vision Transformers for Medical Imaging Piotr Komorowski Hubert Baniecki P. Biecek MedIm 31 27 0 12 Apr 2023
Weakly-Supervised Text-driven Contrastive Learning for Facial Behavior Understanding Xiang Zhang Taoyue Wang Xiaotian Li Huiyuan Yang L. Yin 42 9 0 31 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation Yangqiaoyu Zhou Kai-Lang Yao Wusuo Li MedIm 11 1 0 17 Mar 2023
Data Roaming and Quality Assessment for Composed Image Retrieval Matan Levy Rami Ben-Ari N. Darshan Dani Lischinski 35 23 0 16 Mar 2023
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models Hila Chefer Yuval Alaluf Yael Vinker Lior Wolf Daniel Cohen-Or DiffM 53 498 0 31 Jan 2023
Context-Aware Robust Fine-Tuning Xiaofeng Mao YueFeng Chen Xiaojun Jia Rong Zhang Hui Xue Zhao Li VLM CLIP 20 23 0 29 Nov 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation Omri Avrahami Thomas Hayes Oran Gafni Sonal Gupta Yaniv Taigman Devi Parikh Dani Lischinski Ohad Fried Xiaoyue Yin DiffM 32 203 0 25 Nov 2022
ViT-CX: Causal Explanation of Vision Transformers Weiyan Xie Xiao-hui Li Caleb Chen Cao Nevin L. Zhang ViT 24 17 0 06 Nov 2022
Multi-Scale Wavelet Transformer for Face Forgery Detection Jie Liu Jingjing Wang Peng Zhang Chunmao Wang Di Xie Shiliang Pu ViT CVBM 28 8 0 08 Oct 2022
Quantitative Metrics for Evaluating Explanations of Video DeepFake Detectors Federico Baldassarre Quentin Debard Gonzalo Fiz Pontiveros Tri Kurniawan Wijaya 36 4 0 07 Oct 2022
Minimalistic Unsupervised Learning with the Sparse Manifold Transform Yubei Chen Zeyu Yun Y. Ma Bruno A. Olshausen Yann LeCun 42 8 0 30 Sep 2022