A City of Millions: Mapping Literary Social Networks At Scale

26 February 2025

Sil Hamilton, Rebecca M. M. Hicke, David Mimno, Matthew Wilkens

Abstract

We release 70,509 high-quality social networks extracted from multilingual fiction and nonfiction narratives. We additionally provide metadata for ~30,000 of these texts (73% nonfiction and 27% fiction) written between 1800 and 1999 in 58 languages. This dataset provides information on historical social worlds at an unprecedented scale, including data for 2,510,021 individuals in 2,805,482 pairwise relationships annotated for affinity and relationship type. We achieve this scale by automating previously manual methods of extracting social networks; specifically, we adapt an existing annotation task as a language model prompt, ensuring consistency at scale with the use of structured output. This dataset serves as a unique resource for humanities and social science research by providing data on cognitive models of social realities.
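
To illustrate what "structured output" can buy in a pipeline like this, here is a minimal Python sketch of validating one model response before it is added to a network. Everything in it is an assumption made for illustration: the field names (`characters`, `relationships`, `source`, `target`, `affinity`, `relation_type`), the affinity labels, and the sample record are hypothetical and are not taken from the released dataset or the authors' prompt.

```python
import json
from dataclasses import dataclass

# Hypothetical affinity labels; the dataset's real label set may differ.
AFFINITIES = {"positive", "neutral", "negative"}


@dataclass(frozen=True)
class Relationship:
    """One annotated pairwise relationship (hypothetical schema)."""
    source: str
    target: str
    affinity: str
    relation_type: str


def parse_network(raw: str) -> list[Relationship]:
    """Parse one JSON response and reject anything that breaks the schema.

    Raises ValueError for malformed JSON, missing fields, unknown
    characters, self-relationships, or unrecognized affinity labels.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    try:
        known = set(data["characters"])
        items = data["relationships"]
        edges = [
            Relationship(
                item["source"],
                item["target"],
                item["affinity"],
                item["relation_type"],
            )
            for item in items
        ]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"missing or malformed field: {exc}") from exc

    for rel in edges:
        if rel.source not in known or rel.target not in known:
            raise ValueError(f"relationship names an unlisted character: {rel}")
        if rel.source == rel.target:
            raise ValueError(f"self-relationship is not allowed: {rel}")
        if rel.affinity not in AFFINITIES:
            raise ValueError(f"unknown affinity label: {rel.affinity!r}")
    return edges


# Made-up response, used only to exercise the validator.
example = json.dumps({
    "characters": ["Alice", "Bob"],
    "relationships": [
        {"source": "Alice", "target": "Bob",
         "affinity": "positive", "relation_type": "sibling"},
    ],
})
network = parse_network(example)
```

Rejecting a malformed response outright, rather than repairing it, is one way a fixed schema could keep annotations comparable across tens of thousands of texts; whether the authors handle failures this way is not stated in the abstract.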

@article{hamilton2025_2502.19590,
  title={A City of Millions: Mapping Literary Social Networks At Scale},
  author={Sil Hamilton and Rebecca M. M. Hicke and David Mimno and Matthew Wilkens},
  journal={arXiv preprint arXiv:2502.19590},
  year={2025}
}
