
A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny

Abstract

In this reproduction study, we revisit recent claims that self-attention implements kernel principal component analysis (KPCA) (Teo et al., 2024), positing that (i) value vectors $V$ capture the eigenvectors of the Gram matrix of the keys, and (ii) self-attention projects queries onto the principal component axes of the key matrix $K$ in a feature space. Our analysis reveals three critical inconsistencies: (1) No alignment exists between learned self-attention value vectors and what the KPCA perspective proposes, with average similarity metrics (optimal cosine similarity $\leq 0.32$, linear Centered Kernel Alignment (CKA) $\leq 0.11$, kernel CKA $\leq 0.32$) indicating negligible correspondence; (2) Reported decreases in the reconstruction loss $J_\text{proj}$, put forward to justify the claim that self-attention minimizes the projection error of KPCA, are misinterpreted, as the quantities involved differ by orders of magnitude ($\sim 10^3$); (3) Gram matrix eigenvalue statistics, introduced to justify that $V$ captures the eigenvectors of the Gram matrix, are irreproducible without undocumented implementation-specific adjustments. Based on evidence across 10 transformer architectures, we conclude that the KPCA interpretation of self-attention lacks empirical support.
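The abstract quantifies (mis)alignment with linear CKA, among other metrics. For reference, a minimal NumPy sketch of the feature-space form of linear CKA is shown below; the function name and random test data are illustrative, not taken from the paper's code:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representations X (n, d1) and Y (n, d2),
    computed in feature space: ||Y^T X||_F^2 / (||X^T X||_F ||Y^T Y||_F)
    after column-centering. Returns a value in [0, 1]."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # random orthogonal matrix

# CKA equals 1 for identical representations and is invariant to
# isotropic scaling and orthogonal transformations.
print(linear_cka(X, X))           # -> 1.0
print(linear_cka(X, 3.0 * X @ Q)) # -> 1.0
```

These invariances are why CKA scores near 0.1, as reported above, indicate essentially no correspondence between the compared representations.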

@article{sarıtaş2025_2505.07908,
  title={A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny},
  author={Karahan Sarıtaş and Çağatay Yıldız},
  journal={arXiv preprint arXiv:2505.07908},
  year={2025}
}