1908.04211
On Identifiability in Transformers
International Conference on Learning Representations (ICLR), 2019
12 August 2019
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
Papers citing "On Identifiability in Transformers"
50 / 128 papers shown
Order-Level Attention Similarity Across Language Models: A Latent Commonality
Jinglin Liang
Jin Zhong
Shuangping Huang
Yunqing Hu
Huiyuan Zhang
Huifang Li
Lixin Fan
Hanlin Gu
147
0
0
07 Nov 2025
Towards Transparent AI: A Survey on Explainable Language Models
Avash Palikhe
Sribala Vidyadhari Chinta
Zhipeng Yin
Rui Guo
Qiang Duan
Jie Yang
Wenbin Zhang
238
3
0
25 Sep 2025
SAEs Are Good for Steering -- If You Select the Right Features
Dana Arad
Aaron Mueller
Yonatan Belinkov
LLMSV
473
24
0
26 May 2025
LiDDA: Data Driven Attribution at LinkedIn
John Bencina
Erkut Aykutlug
Yue Chen
Zerui Zhang
Stephanie Sorenson
Shao Tang
Changshuai Wei
274
2
0
14 May 2025
Beyond Black-Box Predictions: Identifying Marginal Feature Effects in Tabular Transformer Networks
Anton Thielmann
Arik Reuter
Benjamin Saefken
LMTD
529
1
0
11 Apr 2025
Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models
Guy Kaplan
Michael Toker
Yuval Reif
Yonatan Belinkov
Roy Schwartz
DiffM
430
2
0
01 Apr 2025
Enabling Global, Human-Centered Explanations for LLMs: From Tokens to Interpretable Code and Test Generation
David Nader-Palacio
Dipin Khati
Daniel Rodríguez-Cárdenas
Alejandro Velasco
Denys Poshyvanyk
LRM
375
5
0
21 Mar 2025
Interpretable High-order Knowledge Graph Neural Network for Predicting Synthetic Lethality in Human Cancers
Xuexin Chen
Ruichu Cai
Zhengting Huang
Zijian Li
Jie Zheng
Min Wu
427
1
0
08 Mar 2025
Large Language Models Are Human-Like Internally
Tatsuki Kuribayashi
Yohei Oseki
Souhaib Ben Taieb
Kentaro Inui
Timothy Baldwin
659
18
0
03 Feb 2025
Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers
Tobias Leemann
Alina Fastowski
Felix Pfeiffer
Gjergji Kasneci
481
8
0
10 Jan 2025
Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models
Paweł Mąka
Yusuf Can Semerci
Jan Scholtes
Gerasimos Spanakis
294
1
0
15 Dec 2024
LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions
Computer Vision and Pattern Recognition (CVPR), 2024
Faridoun Mehri
Mahdieh Soleymani Baghshah
Mohammad Taher Pilehvar
406
3
0
24 Nov 2024
Unveiling Transformer Perception by Exploring Input Manifolds
A. Benfenati
Alfio Ferrara
A. Marta
Davide Riva
Elisabetta Rocchetti
405
0
0
08 Oct 2024
Mechanistic?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024
Naomi Saphra
Sarah Wiegreffe
AI4CE
318
39
0
07 Oct 2024
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
International Conference on Learning Representations (ICLR), 2024
Hadas Orgad
Michael Toker
Zorik Gekhman
Roi Reichart
Idan Szpektor
Hadas Kotek
Yonatan Belinkov
HILM
AIFin
818
147
0
03 Oct 2024
Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights
International Conference on Applications of Natural Language to Data Bases (NLDB), 2024
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
246
5
0
19 Sep 2024
Stable Language Model Pre-training by Reducing Embedding Variability
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Woojin Chung
Jiwoo Hong
Na Min An
James Thorne
Se-Young Yun
209
6
0
12 Sep 2024
SRViT: Vision Transformers for Estimating Radar Reflectivity from Satellite Observations at Scale
Jason Stock
Kyle Hilburn
Imme Ebert-Uphoff
Charles Anderson
364
5
0
20 Jun 2024
Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
Junyi Wu
Bin Duan
Weitai Kang
Hao Tang
Yan Yan
242
19
0
21 Mar 2024
From Explainable to Interpretable Deep Learning for Natural Language Processing in Healthcare: How Far from Reality?
Computational and Structural Biotechnology Journal (CSBJ), 2024
Guangming Huang
Yingya Li
Shoaib Jameel
Yunfei Long
G. Papanastasiou
344
44
0
18 Mar 2024
SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning
Zhihao Wen
Jie Zhang
Yuan Fang
MoE
207
2
0
19 Feb 2024
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
International Conference on Machine Learning (ICML), 2024
Gianluigi Lopardo
F. Precioso
Damien Garreau
280
15
0
05 Feb 2024
eXplainable Bayesian Multi-Perspective Generative Retrieval
EuiYul Song
Philhoon Oh
Sangryul Kim
Hyunjung Shim
BDL
266
0
0
04 Feb 2024
Polynomial-based Self-Attention for Table Representation learning
International Conference on Machine Learning (ICML), 2023
Jayoung Kim
Yehjin Shin
Jeongwhan Choi
Hyowon Wi
Noseong Park
LMTD
213
3
0
12 Dec 2023
Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars
Neural Information Processing Systems (NeurIPS), 2023
Kaiyue Wen
Yuchen Li
Bing Liu
Andrej Risteski
314
28
0
03 Dec 2023
Attention for Causal Relationship Discovery from Biological Neural Dynamics
Ziyu Lu
Anika Tabassum
Shruti R. Kulkarni
Lu Mi
J. Nathan Kutz
Eric Shea-Brown
Seung-Hwan Lim
CML
276
4
0
12 Nov 2023
Analyzing Vision Transformers for Image Classification in Class Embedding Space
Neural Information Processing Systems (NeurIPS), 2023
Martina G. Vilas
Timothy Schaumlöffel
Gemma Roig
ViT
257
34
0
29 Oct 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Buse Giledereli
Jiaoda Li
Yu Fei
Alessandro Stolfo
Wangchunshu Zhou
Guangtao Zeng
Antoine Bosselut
Mrinmaya Sachan
LRM
429
65
0
23 Oct 2023
Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making
Yanrui Du
Sendong Zhao
Hao Wang
Yuhan Chen
Rui Bai
Zewen Qiang
Muzhen Cai
Bing Qin
181
1
0
20 Oct 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Stefan Arnold
Nils Kemmerzell
Annika Schreiner
264
0
0
17 Oct 2023
Interpreting and Exploiting Functional Specialization in Multi-Head Attention under Multi-task Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Chong Li
Shaonan Wang
Yunhao Zhang
Jiajun Zhang
Chengqing Zong
272
10
0
16 Oct 2023
Breaking through the learning plateaus of in-context learning in Transformer
International Conference on Machine Learning (ICML), 2023
Jingwen Fu
Tao Yang
Yuwang Wang
Yan Lu
Nanning Zheng
389
5
0
12 Sep 2023
Explainability for Large Language Models: A Survey
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
D. Yin
Jundong Li
LRM
535
762
0
02 Sep 2023
CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Jiazheng Li
Zhaoyue Sun
Bin Liang
Lin Gui
Yulan He
282
2
0
06 Jun 2023
DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ali Modarressi
Mohsen Fayyaz
Ehsan Aghazadeh
Yadollah Yaghoobzadeh
Mohammad Taher Pilehvar
337
37
0
05 Jun 2023
Explaining How Transformers Use Context to Build Predictions
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Javier Ferrando
Gerard I. Gállego
Ioannis Tsiamas
Marta R. Costa-jussá
187
53
0
21 May 2023
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Siyue Wu
Hongzhan Chen
Xiaojun Quan
Qifan Wang
Rui Wang
VLM
413
31
0
17 May 2023
COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Fanny Jourdan
Agustin Picard
Thomas Fel
Laurent Risser
Jean-Michel Loubes
Nicholas M. Asher
259
16
0
11 May 2023
Computational modeling of semantic change
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Nina Tahmasebi
Haim Dubossarsky
344
8
0
13 Apr 2023
Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps
International Conference on Learning Representations (ICLR), 2023
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
537
31
0
01 Feb 2023
Quantifying Context Mixing in Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Hosein Mohebbi
Willem H. Zuidema
Grzegorz Chrupała
Afra Alishahi
507
39
0
30 Jan 2023
Topics in Contextualised Attention Embeddings
European Conference on Information Retrieval (ECIR), 2023
Mozhgan Talebpour
A. G. S. D. Herrera
Shoaib Jameel
236
3
0
11 Jan 2023
On the Explainability of Natural Language Processing Deep Models
ACM Computing Surveys (ACM CSUR), 2022
Julia El Zini
M. Awad
264
114
0
13 Oct 2022
AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning
Neural Information Processing Systems (NeurIPS), 2022
Tao Yang
Jinghao Deng
Xiaojun Quan
Qifan Wang
Shaoliang Nie
202
7
0
12 Oct 2022
Automatic Evaluation and Analysis of Idioms in Neural Machine Translation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Christos Baziotis
Prashant Mathur
Eva Hasler
182
15
0
10 Oct 2022
Interpreting County Level COVID-19 Infection and Feature Sensitivity using Deep Learning Time Series Models
Md. Khairul Islam
Di Zhu
Yingzheng Liu
Andrej Erkelens
Nick Daniello
Judy Fox
179
2
0
06 Oct 2022
Trigger-free Event Detection via Derangement Reading Comprehension
Jiachen Zhao
Haiqing Yang
178
2
0
20 Aug 2022
What does Transformer learn about source code?
Kechi Zhang
Ge Li
Zhi Jin
ViT
207
11
0
18 Jul 2022
eX-ViT: A Novel eXplainable Vision Transformer for Weakly Supervised Semantic Segmentation
Pattern Recognition (Pattern Recogn.), 2022
Lu Yu
Wei Xiang
Juan Fang
Yi-Ping Phoebe Chen
Lianhua Chi
ViT
230
32
0
12 Jul 2022
A Unified Understanding of Deep NLP Models for Text Classification
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2022
Zhuguo Li
Xiting Wang
Weikai Yang
Jing Wu
Zhengyan Zhang
Zhiyuan Liu
Maosong Sun
Hui Zhang
Shixia Liu
VLM
215
40
0
19 Jun 2022