A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny

In this reproduction study, we revisit recent claims that self-attention implements kernel principal component analysis (KPCA) (Teo et al., 2024), which posit that (i) value vectors capture the eigenvectors of the Gram matrix of the keys, and (ii) self-attention projects queries onto the principal component axes of the key matrix in a feature space. Our analysis reveals three critical inconsistencies: (1) no alignment exists between learned self-attention value vectors and those proposed by the KPCA perspective, with average similarity metrics (optimal cosine similarity, linear CKA (Centered Kernel Alignment), and kernel CKA) indicating negligible correspondence; (2) the reported decreases in reconstruction loss, cited as evidence that self-attention minimizes the projection error of KPCA, are misinterpreted, as the quantities involved differ by orders of magnitude; (3) the Gram matrix eigenvalue statistics, introduced to justify that the value vectors capture the eigenvectors of the Gram matrix, are irreproducible without undocumented implementation-specific adjustments. Across 10 transformer architectures, we conclude that the KPCA interpretation of self-attention lacks empirical support.
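To make the alignment metrics in point (1) concrete, the following is a minimal sketch, not the authors' evaluation code, of linear CKA and kernel CKA applied to a single attention head. The random key/value matrices, the head dimension, and the pairing of value vectors with the leading eigenvectors of the keys' Gram matrix are illustrative assumptions only.

```python
import numpy as np

def _center_gram(G):
    # Double-center a Gram matrix: HGH with H = I - (1/n) 11^T.
    n = G.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ G @ H

def linear_cka(X, Y):
    # Linear CKA between feature matrices X (n x p) and Y (n x q); rows are samples.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic_xy = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic_xy / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

def rbf_gram(X, sigma=None):
    # RBF Gram matrix with a median-heuristic bandwidth.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(X**2, axis=1)[None, :] - 2 * X @ X.T
    sq = np.maximum(sq, 0.0)
    if sigma is None:
        sigma = np.sqrt(np.median(sq[sq > 0]) / 2)
    return np.exp(-sq / (2 * sigma**2))

def kernel_cka(X, Y):
    # Kernel CKA: normalized HSIC between centered RBF Gram matrices.
    Kx, Ky = _center_gram(rbf_gram(X)), _center_gram(rbf_gram(Y))
    return np.sum(Kx * Ky) / np.sqrt(np.sum(Kx * Kx) * np.sum(Ky * Ky))

# Illustrative comparison for one head (sizes are hypothetical).
rng = np.random.default_rng(0)
n, d = 64, 32                       # sequence length, head dimension
K = rng.standard_normal((n, d))     # key matrix of one attention head
V = rng.standard_normal((n, d))     # value matrix of the same head

G = K @ K.T                                  # Gram matrix of the keys (linear kernel)
eigvals, eigvecs = np.linalg.eigh(G)         # ascending eigenvalues
top = eigvecs[:, ::-1][:, :d]                # leading d eigenvectors, one per column

print("linear CKA:", linear_cka(V, top))
print("kernel CKA:", kernel_cka(V, top))
```

Values near 0 on both metrics would indicate negligible correspondence between the learned value vectors and the Gram-matrix eigenvectors, which is the pattern the study reports.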
@article{sarıtaş2025_2505.07908,
  title={A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny},
  author={Karahan Sarıtaş and Çağatay Yıldız},
  journal={arXiv preprint arXiv:2505.07908},
  year={2025}
}