
Language Models May Verbatim Complete Text They Were Not Explicitly Trained On

Abstract

An important question today is whether a given text was used to train a large language model (LLM). A completion test is often employed: check whether the LLM completes a sufficiently complex text. This, however, requires a ground-truth definition of membership; most commonly, a text is considered a member if it has n-gram overlap with some text in the training dataset. In this work, we demonstrate that this n-gram based membership definition can be effectively gamed. We study scenarios where sequences are non-members for a given n, and we find that completion tests still succeed. We find many natural cases of this phenomenon by retraining LLMs from scratch after removing all training samples that were completed; these cases include exact duplicates, near-duplicates, and even short overlaps, and they show that it is difficult to find a single viable choice of n for membership definitions. Using these insights, we design adversarial datasets that cause a given target sequence to be completed without containing it, for any reasonable choice of n. Our findings highlight the inadequacy of n-gram membership and suggest that membership definitions fail to account for auxiliary information available to the training algorithm.
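To make the setup concrete, the sketch below illustrates the two ingredients the abstract refers to: an n-gram overlap membership check and a verbatim completion test. This is a minimal illustration, not the authors' implementation; the tokenization, the choice of n, the prefix/suffix split, and the model_generate stand-in are all assumptions introduced here.

def ngrams(tokens, n):
    # All contiguous n-grams of a token sequence (illustrative helper).
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_ngram_member(target_tokens, training_token_seqs, n):
    # One common variant of the n-gram membership definition: the target is a
    # member if any length-n span of it also appears in some training sequence.
    target_grams = ngrams(target_tokens, n)
    return any(target_grams & ngrams(seq, n) for seq in training_token_seqs)

def completion_test(model_generate, target_tokens, prefix_len):
    # Prompt the LLM with a prefix of the target and check whether it
    # reproduces the remaining suffix verbatim. `model_generate` is a
    # hypothetical stand-in for greedy decoding with the model under test.
    prefix = target_tokens[:prefix_len]
    suffix = target_tokens[prefix_len:]
    output = model_generate(prefix, max_new_tokens=len(suffix))
    return list(output[:len(suffix)]) == list(suffix)

The paper's observation, in these terms, is that completion_test can succeed on a target for which is_ngram_member returns False for every reasonable n.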

@article{liu2025_2503.17514,
  title={Language Models May Verbatim Complete Text They Were Not Explicitly Trained On},
  author={Ken Ziyu Liu and Christopher A. Choquette-Choo and Matthew Jagielski and Peter Kairouz and Sanmi Koyejo and Percy Liang and Nicolas Papernot},
  journal={arXiv preprint arXiv:2503.17514},
  year={2025}
}