A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny

In this reproduction study, we revisit recent claims that self-attention implements kernel principal component analysis (KPCA) (Teo et al., 2024), which posit that (i) value vectors capture the eigenvectors of the Gram matrix of the keys, and (ii) self-attention projects queries onto the principal component axes of the key matrix in a feature space. Our analysis reveals three critical inconsistencies: (1) no alignment exists between learned self-attention value vectors and those proposed by the KPCA perspective, with average similarity metrics (optimal cosine similarity, linear CKA (Centered Kernel Alignment), and kernel CKA) indicating negligible correspondence; (2) the reported decreases in reconstruction loss, cited as evidence that self-attention minimizes the projection error of KPCA, are misinterpreted, as the quantities involved differ by orders of magnitude; (3) the Gram matrix eigenvalue statistics, introduced to justify that the value vectors capture the eigenvectors of the Gram matrix, are irreproducible without undocumented implementation-specific adjustments. Across 10 transformer architectures, we conclude that the KPCA interpretation of self-attention lacks empirical support.
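To make the alignment metrics in point (1) concrete, the following is a minimal sketch, not the authors' evaluation code, of linear CKA and kernel CKA applied to a single attention head. The random key/value matrices, the head dimension, and the pairing of value vectors with the leading eigenvectors of the keys' Gram matrix are illustrative assumptions only.

```python
import numpy as np

def _center_gram(G):
    # Double-center a Gram matrix: HGH with H = I - (1/n) 11^T.
    n = G.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ G @ H

def linear_cka(X, Y):
    # Linear CKA between feature matrices X (n x p) and Y (n x q); rows are samples.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic_xy = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic_xy / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

def rbf_gram(X, sigma=None):
    # RBF Gram matrix with a median-heuristic bandwidth.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(X**2, axis=1)[None, :] - 2 * X @ X.T
    sq = np.maximum(sq, 0.0)
    if sigma is None:
        sigma = np.sqrt(np.median(sq[sq > 0]) / 2)
    return np.exp(-sq / (2 * sigma**2))

def kernel_cka(X, Y):
    # Kernel CKA: normalized HSIC between centered RBF Gram matrices.
    Kx, Ky = _center_gram(rbf_gram(X)), _center_gram(rbf_gram(Y))
    return np.sum(Kx * Ky) / np.sqrt(np.sum(Kx * Kx) * np.sum(Ky * Ky))

# Illustrative comparison for one head (sizes are hypothetical).
rng = np.random.default_rng(0)
n, d = 64, 32                       # sequence length, head dimension
K = rng.standard_normal((n, d))     # key matrix of one attention head
V = rng.standard_normal((n, d))     # value matrix of the same head

G = K @ K.T                                  # Gram matrix of the keys (linear kernel)
eigvals, eigvecs = np.linalg.eigh(G)         # ascending eigenvalues
top = eigvecs[:, ::-1][:, :d]                # leading d eigenvectors, one per column

print("linear CKA:", linear_cka(V, top))
print("kernel CKA:", kernel_cka(V, top))
```

Values near 0 on both metrics would indicate negligible correspondence between the learned value vectors and the Gram-matrix eigenvectors, which is the pattern the study reports.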
@article{sarıtaş2025_2505.07908,
  title={A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny},
  author={Karahan Sarıtaş and Çağatay Yıldız},
  journal={arXiv preprint arXiv:2505.07908},
  year={2025}
}