A Language and Its Dimensions: Intrinsic Dimensions of Language Fractal Structures

16 November 2023
Vasilii A. Gromov
N. S. Borodin
A. S. Yerbolova
Abstract

The present paper introduces a novel object of study: a language fractal structure. We hypothesize that the set of embeddings of all n-grams of a natural language constitutes a representative sample of this fractal set. (We use the term Hailonakea to refer to the sum total of all language fractal structures, over all n.) The paper estimates the intrinsic (genuine) dimensions of language fractal structures for Russian and English. To this end, we employ methods based on (1) topological data analysis and (2) a minimum spanning tree of the data graph built on the point cloud under consideration (Steele's theorem). For both languages and for all n, the intrinsic dimensions turn out to be non-integer values (typical of fractal sets), close to 9.
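
To illustrate the minimum-spanning-tree approach mentioned in the abstract, here is a minimal sketch, not the authors' implementation. It relies on the scaling behind Steele's theorem: for n points sampled from a d-dimensional set (d > 1), the total Euclidean length of the minimum spanning tree grows roughly as n^((d-1)/d). Fitting the slope a of log(length) against log(n) over subsamples of different sizes then gives the estimate d = 1 / (1 - a). The function names, the subsample sizes, and the use of Prim's algorithm are assumptions made for this example; the abstract does not specify them.

```python
import math
import random


def mst_length(points):
    """Total Euclidean edge length of the minimum spanning tree (Prim, O(n^2))."""
    n = len(points)
    if n < 2:
        return 0.0
    # best[i] = distance from point i to the closest point already in the tree
    best = [math.dist(points[0], p) for p in points]
    in_tree = [False] * n
    in_tree[0] = True
    total = 0.0
    for _ in range(n - 1):
        # Attach the outside point that is closest to the current tree.
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        total += best[j]
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                d = math.dist(points[j], points[i])
                if d < best[i]:
                    best[i] = d
    return total


def estimate_dimension(points, sizes, rng, repeats=5):
    """Fit log L(n) = a * log n + b on random subsamples; return d = 1 / (1 - a)."""
    xs, ys = [], []
    for n in sizes:
        # Average the MST length over several subsamples of the same size.
        mean_len = sum(mst_length(rng.sample(points, n)) for _ in range(repeats)) / repeats
        xs.append(math.log(n))
        ys.append(math.log(mean_len))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Ordinary least-squares slope of log-length against log-size.
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    if slope >= 1.0:
        raise ValueError("slope >= 1: the scaling law does not apply to this sample")
    return 1.0 / (1.0 - slope)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical test data: a 2-dimensional plane embedded linearly in 5 dimensions.
    cloud = []
    for _ in range(400):
        u, v = rng.random(), rng.random()
        cloud.append((u, v, u + v, u - v, 0.5 * u))
    print(estimate_dimension(cloud, [50, 100, 200], rng))
```

On the synthetic plane the estimate should land near 2 even though the points live in 5 coordinates, which is the sense in which the dimension is "intrinsic". For n-gram embeddings the same fit would be run on the embedding vectors; the quadratic-time Prim loop here is chosen for readability and would need a faster graph routine for large samples.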
