2103.15679
Cited By
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
29 March 2021
Hila Chefer
Shir Gur
Lior Wolf
ViT
Papers citing
"Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers"
50 / 64 papers shown
Title
ABE: A Unified Framework for Robust and Faithful Attribution-Based Explainability
Zhiyu Zhu
Jiayu Zhang
Zhibo Jin
Fang Chen
Jianlong Zhou
FAtt
24
0
0
03 May 2025
Interpretable graph-based models on multimodal biomedical data integration: A technical review and benchmarking
Alireza Sadeghi
F. Hajati
A. Argha
Nigel H Lovell
Min Yang
Hamid Alinejad-Rokny
31
0
0
03 May 2025
Disentangling Visual Transformers: Patch-level Interpretability for Image Classification
Guillaume Jeanneret
Loïc Simon
F. Jurie
ViT
44
0
0
24 Feb 2025
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval
Guanqi Zhan
Yuanpei Liu
Kai Han
Weidi Xie
Andrew Zisserman
VLM
150
0
0
21 Feb 2025
Narrowing Information Bottleneck Theory for Multimodal Image-Text Representations Interpretability
Zhiyu Zhu
Zhibo Jin
Jiayu Zhang
Nan Yang
Jiahao Huang
Jianlong Zhou
Fang Chen
39
0
0
16 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Carla P. Gomes
B. Selman
Qingsong Wen
LRM
121
9
0
05 Feb 2025
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
Shreyash Arya
Sukrut Rao
Moritz Böhle
Bernt Schiele
68
2
0
28 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
94
0
0
31 Dec 2024
Where am I? Cross-View Geo-localization with Natural Language Descriptions
Junyan Ye
Honglin Lin
Leyan Ou
Dairong Chen
Zihao Wang
Conghui He
Weijia Li
76
0
0
22 Dec 2024
Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model's Prediction Rationality
Qitong Wang
Tang Li
Kien X. Nguyen
Xi Peng
82
0
0
17 Dec 2024
Expanding Event Modality Applications through a Robust CLIP-Based Encoder
SungHeon Jeong
Hanning Chen
Sanggeon Yun
Suhyeon Cho
Wenjun Huang
Xiangjian Liu
Mohsen Imani
98
1
0
04 Dec 2024
Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens
Zhangqi Jiang
Junkai Chen
Beier Zhu
Tingjin Luo
Yankun Shen
Xu Yang
100
4
0
23 Nov 2024
Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning
John Wu
David Wu
Jimeng Sun
44
1
0
31 Oct 2024
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Nick Jiang
Anish Kachinthaya
Suzie Petryk
Yossi Gandelsman
VLM
32
14
0
03 Oct 2024
Explainable Artificial Intelligence: A Survey of Needs, Techniques, Applications, and Future Direction
Melkamu Mersha
Khang Lam
Joseph Wood
Ali AlShami
Jugal Kalita
XAI
AI4TS
67
28
0
30 Aug 2024
MePT: Multi-Representation Guided Prompt Tuning for Vision-Language Model
Xinyang Wang
Yi Yang
Minfeng Zhu
Kecheng Zheng
Shi Liu
Wei Chen
VPVLM
MLLM
VLM
47
1
0
19 Aug 2024
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Dongfang Liu
LLMAG
LRM
77
13
0
16 Aug 2024
Human-inspired Explanations for Vision Transformers and Convolutional Neural Networks
Mahadev Prasad Panda
Matteo Tiezzi
Martina Vilas
Gemma Roig
Bjoern M. Eskofier
Dario Zanca
ViT
AAML
29
1
0
04 Aug 2024
Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning
Lu Yu
Hesong Li
Ying Fu
J. Weijer
Changsheng Xu
CLL
47
1
0
02 Aug 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
Ming-Kuan Wu
Xinyue Cai
Jiayi Ji
Jiale Li
Oucheng Huang
Gen Luo
Hao Fei
Xiaoshuai Sun
Rongrong Ji
MLLM
42
7
0
31 Jul 2024
Inpainting the Gaps: A Novel Framework for Evaluating Explanation Methods in Vision Transformers
Lokesh Badisa
Sumohana S. Channappayya
40
0
0
17 Jun 2024
Concept-skill Transferability-based Data Selection for Large Vision-Language Models
Jaewoo Lee
Boyang Li
Sung Ju Hwang
VLM
35
8
0
16 Jun 2024
MambaLRP: Explaining Selective State Space Sequence Models
F. Jafari
G. Montavon
Klaus-Robert Müller
Oliver Eberle
Mamba
54
9
0
11 Jun 2024
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu
Sheng Jin
Wenwei Zhang
Lumin Xu
Wentao Liu
Wei Li
Chen Change Loy
MLLM
73
12
0
09 Jun 2024
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions
Sanjay Kariyappa
Freddy Lecue
Saumitra Mishra
Christopher Pond
Daniele Magazzeni
Manuela Veloso
37
1
0
03 Jun 2024
Explaining Text Similarity in Transformer Models
Alexandros Vasileiou
Oliver Eberle
43
7
0
10 May 2024
Exposing Text-Image Inconsistency Using Diffusion Models
Mingzhen Huang
Shan Jia
Zhou Zhou
Yan Ju
Jialing Cai
Siwei Lyu
38
7
0
28 Apr 2024
Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection
Jiaqi Zhu
Shaofeng Cai
Fang Deng
Junran Wu
50
15
0
15 Apr 2024
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model
Musashi Hinck
M. L. Olson
David Cobbley
Shao-Yen Tseng
Vasudev Lal
VLM
32
10
0
29 Mar 2024
B-Cos Aligned Transformers Learn Human-Interpretable Features
Manuel Tran
Amal Lahiani
Yashin Dicente Cid
Melanie Boxberg
Peter Lienemann
C. Matek
S. J. Wagner
Fabian J. Theis
Eldad Klaiman
Tingying Peng
MedIm
ViT
13
2
0
16 Jan 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe
VLM
49
4
0
28 Dec 2023
Explainable Multi-Camera 3D Object Detection with Transformer-Based Saliency Maps
Till Beemelmanns
Wassim Zahr
Lutz Eckstein
27
0
0
22 Dec 2023
Inspecting Explainability of Transformer Models with Additional Statistical Information
Hoang C. Nguyen
Haeil Lee
Junmo Kim
ViT
18
3
0
19 Nov 2023
Zero-shot Translation of Attention Patterns in VQA Models to Natural Language
Leonard Salewski
A. Sophia Koepke
Hendrik P. A. Lensch
Zeynep Akata
29
2
0
08 Nov 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Yifan Hou
Jiaoda Li
Yu Fei
Alessandro Stolfo
Wangchunshu Zhou
Guangtao Zeng
Antoine Bosselut
Mrinmaya Sachan
LRM
30
39
0
23 Oct 2023
FLIP: Cross-domain Face Anti-spoofing with Language Guidance
K. Srivatsan
Muzammal Naseer
Karthik Nandakumar
CVBM
42
44
0
28 Sep 2023
PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords
Yong-Hyeok Lee
Namhyun Cho
24
18
0
31 Aug 2023
Bootstrap Fine-Grained Vision-Language Alignment for Unified Zero-Shot Anomaly Localization
Hanqiu Deng
Zhaoxiang Zhang
Jinan Bao
Xingyu Li
VLM
25
4
0
30 Aug 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
21
1
0
15 Jun 2023
Towards Evaluating Explanations of Vision Transformers for Medical Imaging
Piotr Komorowski
Hubert Baniecki
P. Biecek
MedIm
31
27
0
12 Apr 2023
Weakly-Supervised Text-driven Contrastive Learning for Facial Behavior Understanding
Xiang Zhang
Taoyue Wang
Xiaotian Li
Huiyuan Yang
L. Yin
42
9
0
31 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
11
1
0
17 Mar 2023
Data Roaming and Quality Assessment for Composed Image Retrieval
Matan Levy
Rami Ben-Ari
N. Darshan
Dani Lischinski
35
23
0
16 Mar 2023
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
Hila Chefer
Yuval Alaluf
Yael Vinker
Lior Wolf
Daniel Cohen-Or
DiffM
53
498
0
31 Jan 2023
Context-Aware Robust Fine-Tuning
Xiaofeng Mao
YueFeng Chen
Xiaojun Jia
Rong Zhang
Hui Xue
Zhao Li
VLM
CLIP
20
23
0
29 Nov 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami
Thomas Hayes
Oran Gafni
Sonal Gupta
Yaniv Taigman
Devi Parikh
Dani Lischinski
Ohad Fried
Xiaoyue Yin
DiffM
32
203
0
25 Nov 2022
ViT-CX: Causal Explanation of Vision Transformers
Weiyan Xie
Xiao-hui Li
Caleb Chen Cao
Nevin L. Zhang
ViT
24
17
0
06 Nov 2022
Multi-Scale Wavelet Transformer for Face Forgery Detection
Jie Liu
Jingjing Wang
Peng Zhang
Chunmao Wang
Di Xie
Shiliang Pu
ViT
CVBM
28
8
0
08 Oct 2022
Quantitative Metrics for Evaluating Explanations of Video DeepFake Detectors
Federico Baldassarre
Quentin Debard
Gonzalo Fiz Pontiveros
Tri Kurniawan Wijaya
36
4
0
07 Oct 2022
Minimalistic Unsupervised Learning with the Sparse Manifold Transform
Yubei Chen
Zeyu Yun
Y. Ma
Bruno A. Olshausen
Yann LeCun
42
8
0
30 Sep 2022