Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

30 November 2017
Been Kim
Martin Wattenberg
Justin Gilmer
Carrie J. Cai
James Wexler
F. Viégas
Rory Sayres
    FAtt

Papers citing "Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)"

50 / 1,045 papers shown
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
42
0
0
28 Feb 2025
Obtaining Example-Based Explanations from Deep Neural Networks
Genghua Dong
Henrik Boström
Michalis Vazirgiannis
Roman Bresson
TDI
FAtt
XAI
98
0
0
27 Feb 2025
Interpreting CLIP with Hierarchical Sparse Autoencoders
Vladimir Zaigrajew
Hubert Baniecki
P. Biecek
49
0
0
27 Feb 2025
QPM: Discrete Optimization for Globally Interpretable Image Classification
Thomas Norrenbrock
T. Kaiser
Sovan Biswas
R. Manuvinakurike
Bodo Rosenhahn
55
0
0
27 Feb 2025
Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models
Itay Benou
Tammy Riklin-Raviv
67
0
0
27 Feb 2025
BarkXAI: A Lightweight Post-Hoc Explainable Method for Tree Species Classification with Quantifiable Concepts
Yunmei Huang
Songlin Hou
Zachary Nelson Horve
Songlin Fei
69
0
0
26 Feb 2025
Can LLMs Explain Themselves Counterfactually?
Zahra Dehghanighobadi
Asja Fischer
Muhammad Bilal Zafar
LRM
38
0
0
25 Feb 2025
NeurFlow: Interpreting Neural Networks through Neuron Groups and Functional Interactions
Tue Cao
Nhat X. Hoang
Hieu H. Pham
P. Nguyen
My T. Thai
83
0
0
22 Feb 2025
Language Models Can Predict Their Own Behavior
Dhananjay Ashok
Jonathan May
ReLM
AI4TS
LRM
58
0
0
18 Feb 2025
Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships
Angie Boggust
Hyemin Bang
Hendrik Strobelt
Arvind Satyanarayan
65
1
0
17 Feb 2025
SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models
Z. He
Haiyan Zhao
Yiran Qiao
Fan Yang
Ali Payani
Jing Ma
Mengnan Du
LLMSV
68
2
0
17 Feb 2025
From Text to Trust: Empowering AI-assisted Decision Making with Adaptive LLM-powered Analysis
Zhuoyan Li
Hangxiao Zhu
Zhuoran Lu
Ziang Xiao
Ming Yin
47
0
0
17 Feb 2025
Suboptimal Shapley Value Explanations
Xiaolei Lu
FAtt
65
0
0
17 Feb 2025
Sparse Autoencoders for Scientifically Rigorous Interpretation of Vision Models
Samuel Stevens
Wei-Lun Chao
T. Berger-Wolf
Yu-Chuan Su
VLM
72
2
0
10 Feb 2025
Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions
H. Fokkema
T. Erven
Sara Magliacane
67
1
0
10 Feb 2025
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan
Julian Forsyth
Thomas Fel
M. Kowal
Konstantinos G. Derpanis
111
7
0
06 Feb 2025
CoRPA: Adversarial Image Generation for Chest X-rays Using Concept Vector Perturbations and Generative Models
Amy Rafferty
Rishi Ramaesh
Ajitha Rajan
MedIm
AAML
56
0
0
04 Feb 2025
Compositional Concept-Based Neuron-Level Interpretability for Deep Reinforcement Learning
Zeyu Jiang
Hai Huang
Xingquan Zuo
OffRL
55
0
0
02 Feb 2025
Efficient and Interpretable Neural Networks Using Complex Lehmer Transform
M. Ataei
Xiaogang Wang
34
0
0
28 Jan 2025
Faithful Counterfactual Visual Explanations (FCVE)
Bismillah Khan
Syed Ali Tariq
Tehseen Zia
Muhammad Ahsan
David Windridge
44
0
0
12 Jan 2025
Towards Counterfactual and Contrastive Explainability and Transparency of DCNN Image Classifiers
Syed Ali Tariq
Tehseen Zia
Mubeen Ghafoor
AAML
62
7
0
12 Jan 2025
COMIX: Compositional Explanations using Prototypes
S. Sivaprasad
D. Kangin
Plamen Angelov
Mario Fritz
139
0
0
10 Jan 2025
ConSim: Measuring Concept-Based Explanations' Effectiveness with Automated Simulatability
Antonin Poché
Alon Jacovi
Agustin Picard
Victor Boutin
Fanny Jourdan
37
2
0
10 Jan 2025
Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios
Marko Tuononen
Dani Korpi
Ville Hautamäki
FAtt
31
1
0
10 Jan 2025
Explaining the Behavior of Black-Box Prediction Algorithms with Causal Learning
Numair Sani
Daniel Malinsky
I. Shpitser
CML
76
15
0
10 Jan 2025
Analyzing Fine-tuning Representation Shift for Multimodal LLMs Steering alignment
Pegah Khayatan
Mustafa Shukor
Jayneel Parekh
Matthieu Cord
LLMSV
41
1
0
06 Jan 2025
Label-free Concept Based Multiple Instance Learning for Gigapixel Histopathology
Susu Sun
Leslie Tessier
Frédérique Meeuwsen
Clément Grisi
Dominique van Midden
G. Litjens
Christian F. Baumgartner
24
2
0
06 Jan 2025
Accurate Explanation Model for Image Classifiers using Class Association Embedding
Ruitao Xie
Jingbang Chen
Limai Jiang
Rui Xiao
Yi-Lun Pan
Yunpeng Cai
57
4
0
31 Dec 2024
Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
Konstantin Donhauser
Kristina Ulicna
Gemma Elyse Moran
Aditya Ravuri
Kian Kenyon-Dean
Cian Eastwood
Jason Hartford
76
0
0
20 Dec 2024
Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts
Jihye Choi
Jayaram Raghuram
Yixuan Li
Somesh Jha
108
4
0
18 Dec 2024
Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing
Keltin Grimes
Marco Christiani
David Shriver
Marissa Connor
KELM
80
1
0
17 Dec 2024
Concept Learning in the Wild: Towards Algorithmic Understanding of Neural Networks
Elad Shoham
Hadar Cohen
Khalil Wattad
Havana Rika
Dan Vilenchik
70
1
0
15 Dec 2024
UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval
Haoyu Jiang
Zhi-Qi Cheng
Gabriel Moreira
Jiawen Zhu
Jingdong Sun
Bukun Ren
Jun-Yan He
Qi Dai
Xian-Sheng Hua
VLM
90
0
0
14 Dec 2024
OMENN: One Matrix to Explain Neural Networks
Adam Wróbel
Mikołaj Janusz
Bartosz Zieliński
Dawid Rymarczyk
FAtt
AAML
75
0
0
03 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
101
14
0
03 Dec 2024
Explaining the Impact of Training on Vision Models via Activation Clustering
Ahcène Boubekki
Samuel G. Fadel
Sebastian Mair
89
0
0
29 Nov 2024
Revisiting Marr in Face: The Building of 2D--2.5D--3D Representations in Deep Neural Networks
Xiangyu Zhu
Chang Yu
Jiankuo Zhao
Zhaoxiang Zhang
Stan Z. Li
Zhen Lei
3DV
82
0
0
25 Nov 2024
FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Generation
Trong-Thang Pham
Ngoc-Vuong Ho
Nhat-Tan Bui
T. Phan
Patel Brijesh
...
Gianfranco Doretto
Anh Nguyen
Carol C. Wu
Hien Nguyen
Ngan Le
92
2
0
23 Nov 2024
GIFT: A Framework for Global Interpretable Faithful Textual Explanations of Vision Classifiers
Éloi Zablocki
Valentin Gerard
Amaia Cardiel
Eric Gaussier
Matthieu Cord
Eduardo Valle
79
0
0
23 Nov 2024
DEBUG-HD: Debugging TinyML models on-device using Hyper-Dimensional computing
Nikhil P Ghanathe
Steven J E Wilton
28
0
0
16 Nov 2024
Explainable Artificial Intelligence for Medical Applications: A Review
Qiyang Sun
Alican Akman
Björn Schuller
81
6
0
15 Nov 2024
Towards Utilising a Range of Neural Activations for Comprehending Representational Associations
Laura O'Mahony
Nikola S. Nikolov
David JP O'Sullivan
28
0
0
15 Nov 2024
Classification with Conceptual Safeguards
Hailey Joren
Charles Marx
Berk Ustun
37
2
0
07 Nov 2024
Local vs distributed representations: What is the right basis for interpretability?
Julien Colin
L. Goetschalckx
Thomas Fel
Victor Boutin
Jay Gopal
Thomas Serre
Nuria Oliver
HAI
34
2
0
06 Nov 2024
Decision Trees for Interpretable Clusters in Mixture Models and Deep Representations
Maximilian Fleissner
Maedeh Zarvandi
D. Ghoshdastidar
29
1
0
03 Nov 2024
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
Aashiq Muhamed
Mona Diab
Virginia Smith
38
2
0
01 Nov 2024
Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales
Tang Li
Mengmeng Ma
Xi Peng
37
2
0
31 Oct 2024
All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling
Emanuele Marconato
Sébastien Lachapelle
Sebastian Weichwald
Luigi Gresele
66
3
0
30 Oct 2024
Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Transformers
Shaobo Wang
Hongxuan Tang
Mingyang Wang
H. Zhang
Xuyang Liu
Weiya Li
Xuming Hu
Linfeng Zhang
17
0
0
29 Oct 2024
Bridging the Gap between Expert and Language Models: Concept-guided Chess Commentary Generation and Evaluation
Jaechang Kim
Jinmin Goh
Inseok Hwang
Jaewoong Cho
Jungseul Ok
ELM
28
1
0
28 Oct 2024