On Identifiability in Transformers
12 August 2019 · arXiv:1908.04211
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
Papers citing "On Identifiability in Transformers" (32 papers)
Interpretable High-order Knowledge Graph Neural Network for Predicting Synthetic Lethality in Human Cancers
Xuexin Chen
Ruichu Cai
Zhengting Huang
Zijian Li
Jie Zheng
Min Wu
08 Mar 2025
Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers
Tobias Leemann
Alina Fastowski
Felix Pfeiffer
Gjergji Kasneci
10 Jan 2025
Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
19 Sep 2024
Polynomial-based Self-Attention for Table Representation learning
Jayoung Kim
Yehjin Shin
Jeongwhan Choi
Hyowon Wi
Noseong Park
LMTD
12 Dec 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Yifan Hou
Jiaoda Li
Yu Fei
Alessandro Stolfo
Wangchunshu Zhou
Guangtao Zeng
Antoine Bosselut
Mrinmaya Sachan
LRM
23 Oct 2023
Interpreting and Exploiting Functional Specialization in Multi-Head Attention under Multi-task Learning
Chong Li
Shaonan Wang
Yunhao Zhang
Jiajun Zhang
Chengqing Zong
16 Oct 2023
Explaining How Transformers Use Context to Build Predictions
Javier Ferrando
Gerard I. Gállego
Ioannis Tsiamas
Marta R. Costa-jussà
21 May 2023
Computational modeling of semantic change
Nina Tahmasebi
Haim Dubossarsky
13 Apr 2023
On the Explainability of Natural Language Processing Deep Models
Julia El Zini
M. Awad
13 Oct 2022
AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning
Tao Yang
Jinghao Deng
Xiaojun Quan
Qifan Wang
Shaoliang Nie
12 Oct 2022
Interpreting County Level COVID-19 Infection and Feature Sensitivity using Deep Learning Time Series Models
Md. Khairul Islam
Di Zhu
Yingzheng Liu
Andrej Erkelens
Nick Daniello
Judy Fox
06 Oct 2022
What does Transformer learn about source code?
Kechi Zhang
Ge Li
Zhi Jin
ViT
18 Jul 2022
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Javier Ferrando
Gerard I. Gállego
Belen Alastruey
Carlos Escolano
Marta R. Costa-jussà
23 May 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
27 Apr 2022
Measuring the Mixing of Contextual Information in the Transformer
Javier Ferrando
Gerard I. Gállego
Marta R. Costa-jussà
08 Mar 2022
Vision Checklist: Towards Testable Error Analysis of Image Models to Help System Designers Interrogate Model Capabilities
Xin Du
Bénédicte Legastelois
B. Ganesh
A. Rajan
Hana Chockler
Vaishak Belle
Stuart Anderson
S. Ramamoorthy
AAML
27 Jan 2022
Explainable Deep Learning in Healthcare: A Methodological Survey from an Attribution View
Di Jin
Elena Sergeeva
W. Weng
Geeticka Chauhan
Peter Szolovits
OOD
05 Dec 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review
Xiaofei Sun
Diyi Yang
Xiaoya Li
Tianwei Zhang
Yuxian Meng
Han Qiu
Guoyin Wang
Eduard H. Hovy
Jiwei Li
20 Oct 2021
Relative Molecule Self-Attention Transformer
Lukasz Maziarka
Dawid Majchrowski
Tomasz Danel
Piotr Gaiński
Jacek Tabor
Igor T. Podolak
Pawel M. Morkisz
Stanislaw Jastrzebski
MedIm
12 Oct 2021
Incorporating Residual and Normalization Layers into Analysis of Masked Language Models
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
15 Sep 2021
Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
G. Chrysostomou
Nikolaos Aletras
31 Aug 2021
Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Noam Wies
Yoav Levine
Daniel Jannai
Amnon Shashua
09 May 2021
TabTransformer: Tabular Data Modeling Using Contextual Embeddings
Xin Huang
A. Khetan
Milan Cvitkovic
Zohar S. Karnin
ViT
LMTD
11 Dec 2020
Generating Plausible Counterfactual Explanations for Deep Transformers in Financial Text Classification
Linyi Yang
Eoin M. Kenny
T. L. J. Ng
Yi Yang
Barry Smyth
Ruihai Dong
23 Oct 2020
Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models
Joseph F. DeRose
Jiayao Wang
M. Berger
03 Sep 2020
ConvBERT: Improving BERT with Span-based Dynamic Convolution
Zihang Jiang
Weihao Yu
Daquan Zhou
Yunpeng Chen
Jiashi Feng
Shuicheng Yan
06 Aug 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
22 Jun 2020
Accurate Word Alignment Induction from Neural Machine Translation
Yun Chen
Yang Liu
Guanhua Chen
Xin Jiang
Qun Liu
30 Apr 2020
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato
Yves Scherrer
Jörg Tiedemann
24 Feb 2020
Explaining Explanations: Axiomatic Feature Interactions for Deep Networks
Joseph D. Janizek
Pascal Sturmfels
Su-In Lee
FAtt
10 Feb 2020
What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
Timothee Mickus
Denis Paperno
Mathieu Constant
Kees van Deemter
13 Nov 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
20 Apr 2018