Variation is the Key: A Variation-Based Framework for LLM-Generated Text Detection

Xuecong Li
Xiaohong Li
Qiang Hu
Yao Zhang
Junjie Wang
Main: 6 Pages
8 Figures
Bibliography: 3 Pages
4 Tables
Abstract

Detecting text generated by large language models (LLMs) is crucial but challenging. Existing detectors either depend on impractical assumptions, such as white-box access to the generating model, or rely solely on text-level features, which leads to imprecise detection. In this paper, we propose VaryBalance, a simple, effective, and practical method for detecting LLM-generated text. The core observation behind VaryBalance is that human texts differ more from their LLM-rewritten versions than LLM-generated texts do. VaryBalance quantifies this difference using the mean standard deviation and uses it to distinguish human texts from LLM-generated texts. Comprehensive experiments demonstrate that VaryBalance outperforms the state-of-the-art detector, Binoculars, by up to 34.3% in terms of AUROC, and remains robust across multiple generating models and languages.
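
The observation above can be illustrated with a minimal sketch. It is not the VaryBalance implementation: the abstract does not define the exact statistic or how rewrites are obtained, so everything here is an assumption made for illustration. The rewrites are supplied by hand rather than produced by an LLM, the word-level dissimilarity from Python's `difflib` stands in for whatever difference measure the paper uses, and the helper names (`dissimilarity`, `variation_stats`, `looks_human`) and the `threshold` value are hypothetical.

```python
from difflib import SequenceMatcher
from statistics import mean, pstdev


def dissimilarity(a: str, b: str) -> float:
    """Word-level dissimilarity in [0, 1]; 0 means identical word sequences.

    Assumption: a stand-in difference measure, not the paper's.
    """
    return 1.0 - SequenceMatcher(None, a.split(), b.split()).ratio()


def variation_stats(original: str, rewrites: list[str]) -> tuple[float, float]:
    """Mean and population standard deviation of the dissimilarity
    between a text and each of its rewrites."""
    if not rewrites:
        raise ValueError("need at least one rewrite")
    diffs = [dissimilarity(original, r) for r in rewrites]
    return mean(diffs), pstdev(diffs)


def looks_human(original: str, rewrites: list[str], threshold: float = 0.5) -> bool:
    """Flag a text as human-written when its rewrites drift far from it.

    The threshold is a hypothetical value chosen for this toy example.
    """
    avg, _ = variation_stats(original, rewrites)
    return avg > threshold


# Hand-written toy data; in practice the rewrites would come from an LLM.
human_text = "honestly i kinda figured the bus wouldnt show so i just walked lol"
human_rewrites = [
    "I assumed the bus would not arrive, so I decided to walk.",
    "Since I doubted the bus would come, I walked instead.",
]
llm_text = "The bus did not arrive on time, so I decided to walk to work."
llm_rewrites = [
    "The bus did not arrive on time, so I chose to walk to work.",
    "The bus did not arrive on schedule, so I decided to walk to work.",
]

human_avg, human_sd = variation_stats(human_text, human_rewrites)
llm_avg, llm_sd = variation_stats(llm_text, llm_rewrites)
print(f"human: mean={human_avg:.3f} sd={human_sd:.3f}")
print(f"llm:   mean={llm_avg:.3f} sd={llm_sd:.3f}")
```

In this toy data the informal text changes far more under rewriting than the already polished one, which is the direction of the effect the abstract describes. A real detector would need an actual rewriting model and a calibrated decision threshold.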
