Comparing LLM-generated and human-authored news text using formal syntactic theory

2 June 2025

Main: 8 Pages

11 Figures

Bibliography: 3 Pages

13 Tables

Appendix: 9 Pages

Abstract

This study provides the first comprehensive comparison of New York Times (NYT)-style text generated by six large language models (LLMs) against real, human-authored NYT writing. The comparison is grounded in a formal syntactic theory: we use Head-driven Phrase Structure Grammar (HPSG) to analyze the grammatical structure of the texts. We then investigate and illustrate differences in the distributions of HPSG grammar types, revealing systematic distinctions between human and LLM-generated writing. These findings contribute to a deeper understanding of the syntactic behavior of both LLMs and humans within the NYT genre.
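
To make the idea of comparing distributions of grammar types concrete, the following is a minimal sketch and not the authors' method. It assumes each corpus has already been parsed and reduced to a flat list of grammar type labels. It then computes relative frequencies and the Jensen-Shannon divergence between the two distributions. The function names and the type labels (`subj-head`, `head-comp`, `adj-head`) are hypothetical placeholders, not actual HPSG type names, and the divergence measure is one illustrative choice among many.

```python
from collections import Counter
from math import log2


def type_distribution(type_labels):
    """Turn a list of grammar type labels into relative frequencies."""
    counts = Counter(type_labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    p and q map labels to probabilities; a label missing from one
    distribution is treated as having probability 0 there.
    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions with no labels in common.
    """
    support = set(p) | set(q)
    # Mixture distribution: the pointwise average of p and q.
    m = {label: 0.5 * (p.get(label, 0.0) + q.get(label, 0.0)) for label in support}

    def kl_to_mixture(dist):
        # KL(dist || m); zero-probability terms contribute nothing.
        return sum(prob * log2(prob / m[label]) for label, prob in dist.items() if prob > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical placeholder labels, standing in for real parser output.
human_types = ["subj-head", "head-comp", "head-comp", "adj-head"]
llm_types = ["subj-head", "head-comp", "adj-head", "adj-head"]

human_dist = type_distribution(human_types)
llm_dist = type_distribution(llm_types)
divergence = jensen_shannon_divergence(human_dist, llm_dist)
print(f"JSD between human and LLM type distributions: {divergence:.4f}")
```

A value near 0 would indicate that the two corpora use the labeled constructions in similar proportions. Larger values would indicate that the distributions diverge. Any real analysis would also need to account for corpus size and sampling variation.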


@article{zamaraeva2025_2506.01407,
  title={Comparing LLM-generated and human-authored news text using formal syntactic theory},
  author={Olga Zamaraeva and Dan Flickinger and Francis Bond and Carlos Gómez-Rodríguez},
  journal={arXiv preprint arXiv:2506.01407},
  year={2025}
}
