LLM Watermarking Using Mixtures and Statistical-to-Computational Gaps
- WaLM
Main: 18 pages
Bibliography: 3 pages
Abstract
Given a text, can we determine whether it was generated by a large language model (LLM) or written by a human? A widely studied approach to this problem is watermarking. In the closed setting, we propose an elementary watermarking scheme that is undetectable. In the harder open setting, where the adversary has access to most of the model, we propose a watermarking scheme that is unremovable.
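The abstract does not spell out the paper's construction, so as a generic illustration of how statistical watermark detection works, here is a minimal sketch of a keyed "green-list" test in the style of Kirchenbauer et al. (not the scheme proposed in this paper). All names, parameters, and the hash-based partition are hypothetical: a secret key pseudorandomly splits the vocabulary at each step, generation biases sampling toward green tokens, and detection runs a z-test on the observed green-token fraction.

```python
import hashlib
import math

def green_list(prev_token: int, key: str, vocab_size: int, gamma: float = 0.5) -> set:
    """Pseudorandomly partition the vocabulary into a 'green' subset of
    expected size gamma * vocab_size, seeded by the key and previous token."""
    green = set()
    for t in range(vocab_size):
        h = hashlib.sha256(f"{key}:{prev_token}:{t}".encode()).digest()
        if h[0] < int(256 * gamma):  # token is green with probability ~gamma
            green.add(t)
    return green

def detect(tokens: list, key: str, vocab_size: int, gamma: float = 0.5) -> float:
    """Z-score of the observed green-token count against the null hypothesis
    that the text is unwatermarked (each token green with probability gamma)."""
    n = len(tokens) - 1
    hits = sum(
        1 for prev, cur in zip(tokens, tokens[1:])
        if cur in green_list(prev, key, vocab_size, gamma)
    )
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

A watermarked generator that consistently favors green tokens yields a z-score growing like the square root of the text length, while human text stays near zero, allowing detection at a chosen false-positive rate.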
