VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers
30 March 2022
Estelle Aflalo, Meng Du, Shao-Yen Tseng, Yongfei Liu, Chenfei Wu, Nan Duan, Vasudev Lal
arXiv: 2203.17247
Papers citing "VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers" (29 of 29 papers shown)
Exploring the Effectiveness and Interpretability of Texts in LLM-based Time Series Models
Zhengke Sun, Hangwei Qian, Ivor Tsang · AI4TS · 09 Apr 2025

Quantifying Interpretability in CLIP Models with Concept Consistency
Avinash Madasu, Vasudev Lal, Phillip Howard · VLM · 14 Mar 2025

See What You Are Told: Visual Attention Sink in Large Multimodal Models
Seil Kang, Jinyeong Kim, Junhyeok Kim, Seong Jae Hwang · VLM · 05 Mar 2025

Hierarchical Banzhaf Interaction for General Video-Language Representation Learning
Peng Jin, H. Li, Li Yuan, Shuicheng Yan, Jie Chen · 31 Dec 2024

ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling
William Jongwon Han, Chaojing Duan, M. Rosenberg, Emerson Liu, Ding Zhao · 18 Dec 2024

A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun, Wenbin An, Feng Tian, Fang Nan, Qidong Liu, J. Liu, N. Shah, Ping Chen · 18 Dec 2024

Quantifying and Enabling the Interpretability of CLIP-like Models
Avinash Madasu, Yossi Gandelsman, Vasudev Lal, Phillip Howard · VLM · 10 Sep 2024

HistoGym: A Reinforcement Learning Environment for Histopathological Image Analysis
Zhi-Bo Liu, Xiaobo Pang, Jizhao Wang, Shuai Liu, Chen Li · 16 Aug 2024

MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model
Jiahao Huo, Yibo Yan, Boren Hu, Yutao Yue, Xuming Hu · LRM, MLLM · 17 Jun 2024

DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Linli Yao, Lei Li, Shuhuai Ren, Lean Wang, Yuanxin Liu, Xu Sun, Lu Hou · 31 May 2024

Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska, E. Gavves · AI4CE · 22 Apr 2024

LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models
Gabriela Ben-Melech Stan, Estelle Aflalo, R. Y. Rohekar, Anahita Bhiwandiwalla, Shao-Yen Tseng, M. L. Olson, Yaniv Gurwicz, Chenfei Wu, Nan Duan, Vasudev Lal · 03 Apr 2024

Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking
Ivana Beňová, Jana Kosecka, Michal Gregor, Martin Tamajka, Marcel Veselý, Marián Simko · 29 Jan 2024

Demonstration of an Adversarial Attack Against a Multimodal Vision Language Model for Pathology Imaging
Poojitha Thota, Jai Prakash Veerla, Partha Sai Guttikonda, M. Nasr, Shirin Nilizadeh, Jacob M. Luber · AAML · 04 Jan 2024

SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology
S. Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, Maria Vakalopoulou, Joel H. Saltz, Dimitris Samaras, Rajarsi R. Gupta, Prateek Prasanna · 22 Dec 2023

Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning
Yiran Li, Junpeng Wang, Prince Osei Aboagye, Michael Yeh, Yan Zheng, Liang Wang, Wei Zhang, Kwan-Liu Ma · 02 Nov 2023

Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?
Yichi Zhang, Jiayi Pan, Yuchen Zhou, Rui Pan, Joyce Chai · VLM · 31 Oct 2023

Explainable Techniques for Analyzing Flow Cytometry Cell Transformers
Florian Kowarsch, Lisa Weijler, Florian Kleber, Matthias Wödlinger, Michael Reiter, Margarita Maurer-Granofszky, Michael N. Dworzak · MedIm · 27 Jul 2023

Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal, Alexander Matyasko, Shantanu Jaiswal, Basura Fernando, Cheston Tan · 15 Jun 2023

TG-VQA: Ternary Game of Video Question Answering
Hao Li, Peng Jin, Ze-Long Cheng, Songyang Zhang, Kai-xiang Chen, Zhennan Wang, Chang-rui Liu, Jie Chen · 17 May 2023

Semantic Composition in Visually Grounded Language Models
Rohan Pandey · CoGe · 15 May 2023

AttentionViz: A Global View of Transformer Attention
Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg · ViT · 04 May 2023

Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
Guillaume Jaume, Anurag J. Vaidya, Richard J. Chen, Drew F. K. Williamson, Paul Pu Liang, Faisal Mahmood · 13 Apr 2023

How Does Attention Work in Vision Transformers? A Visual Analytics Attempt
Yiran Li, Junpeng Wang, Xin Dai, Liang Wang, Chin-Chia Michael Yeh, Yan Zheng, Wei Zhang, Kwan-Liu Ma · ViT · 24 Mar 2023

Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment
Rohan Pandey, Rulin Shao, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency · 20 Dec 2022

Deep Learning based Computer Vision Methods for Complex Traffic Environments Perception: A Review
Talha Azfar, Jinlong Li, Hongkai Yu, R. Cheu, Yisheng Lv, Ruimin Ke · 09 Nov 2022

Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency · 07 Sep 2022

Multimodal Learning with Transformers: A Survey
P. Xu, Xiatian Zhu, David A. Clifton · ViT · 13 Jun 2022

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation
Yongfei Liu, Chenfei Wu, Shao-Yen Tseng, Vasudev Lal, Xuming He, Nan Duan · CLIP, VLM · 22 Sep 2021