Memory Mosaics at scale
Main: 10 pages · Appendix: 9 pages · Bibliography: 2 pages · 18 figures · 14 tables
Abstract
Memory Mosaics [Zhang et al., 2025], networks of associative memories, have demonstrated appealing compositional and in-context learning capabilities on medium-size networks (GPT-2 scale) and small synthetic datasets. This work shows that these favorable properties persist when Memory Mosaics are scaled to large-language-model sizes (Llama-8B scale) and trained on real-world datasets.
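For readers unfamiliar with the building block, the sketch below illustrates one associative memory unit: it stores key-value pairs and answers a query with a kernel-weighted average of the stored values, the elementary operation that Memory Mosaics compose into a network. This is a minimal illustration, not the paper's implementation; the function name, shapes, and the bandwidth parameter `beta` are assumptions for the example, and the Gaussian-kernel retrieval follows the formulation of the original Memory Mosaics paper (for unit-norm keys it reduces to a softmax over inner products).

```python
import numpy as np

def associative_memory_retrieve(keys, values, query, beta=1.0):
    """Kernel-weighted retrieval from one associative memory unit.

    keys:   (n, d_k) stored keys (assumed unit-norm here)
    values: (n, d_v) stored values
    query:  (d_k,)   query key
    beta:   kernel bandwidth (illustrative parameter name)
    """
    # Similarity of the query to every stored key.
    scores = beta * keys @ query                       # shape (n,)
    # Normalized Gaussian-kernel weights (softmax, numerically stable).
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted combination of the stored values.
    return weights @ values                            # shape (d_v,)

# Toy usage: 5 stored pairs, 4-dim keys, 3-dim values.
rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 4))
keys /= np.linalg.norm(keys, axis=1, keepdims=True)
values = rng.normal(size=(5, 3))
query = keys[2] + 0.05 * rng.normal(size=4)  # noisy copy of stored key 2
print(associative_memory_retrieve(keys, values, query, beta=8.0))
```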
