Transparent Semantic Spaces: A Categorical Approach to Explainable Word Embeddings

28 August 2025

Main:24 Pages

5 Figures

Bibliography:1 Pages

Abstract

The paper introduces a novel framework based on category theory to enhance the explainability of artificial intelligence systems, particularly focusing on word embeddings. Key topics include the construction of categories $ Ł_{T} $ and $ ¶_{T} $, providing schematic representations of the semantics of a text $ T $, and reframing the selection of the element with maximum probability as a categorical notion. Additionally, the monoidal category $ ¶_{T} $ is constructed to visualize various methods of extracting semantic information from $ T $, offering a dimension-agnostic definition of semantic spaces reliant solely on information within the text.Furthermore, the paper defines the categories of configurations $ \Conf $ and word embeddings $ \Emb $, accompanied by the concept of divergence as a decoration on $ \Emb $. It establishes a mathematically precise method for comparing word embeddings, demonstrating the equivalence between the GloVe and Word2Vec algorithms and the metric MDS algorithm, transitioning from neural network algorithms (black box) to a transparent framework. Finally, the paper presents a mathematical approach to computing biases before embedding and offers insights on mitigating biases at the semantic space level, advancing the field of explainable artificial intelligence.

View on arXiv

Comments on this paper