
For GPT-4 as with Humans: Information Structure Predicts Acceptability of Long-Distance Dependencies

Abstract

It remains debated how well any LM understands natural language or generates reliable metalinguistic judgments. Moreover, relatively little work has demonstrated that LMs can represent and respect the subtle relationships between form and function proposed by linguists. Here we focus on one such relationship established in recent work: English speakers' judgments about the information structure of canonical sentences predict independently collected acceptability ratings on corresponding 'long-distance dependency' [LDD] constructions, across a wide array of base constructions and multiple types of LDDs. To determine whether any LM captures this relationship, we probe GPT-4 on the same tasks used with humans and on new items. Results reveal reliable metalinguistic skill on the information structure and acceptability tasks, replicating a striking interaction between the two, despite the zero-shot, explicit nature of the tasks and little to no chance of contamination [Studies 1a, 1b]. Study 2 manipulates the information structure of base sentences and confirms a causal relationship: increasing the prominence of a constituent in a context sentence increases the subsequent acceptability ratings on an LDD construction. The findings suggest a tight relationship between natural and GPT-4-generated English, and between information structure and syntax, which invites further exploration.
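
The zero-shot, explicit probing described above can be illustrated with a minimal sketch using the OpenAI Python SDK. The prompt wording, the 1–7 rating scale, and the example sentence below are illustrative assumptions, not the authors' actual materials or prompts.

```python
# Minimal sketch of zero-shot acceptability elicitation from GPT-4.
# Prompt wording, rating scale, and example item are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def rate_acceptability(sentence: str) -> str:
    """Ask GPT-4 for an explicit acceptability judgment on one sentence."""
    prompt = (
        "On a scale from 1 (completely unacceptable) to 7 (completely "
        "acceptable), how acceptable is the following English sentence? "
        "Answer with a single number.\n\n"
        f"Sentence: {sentence}"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keep judgments as deterministic as possible
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()


# Example long-distance dependency construction (hypothetical item):
print(rate_acceptability(
    "What did the teacher mumble that the students should read?"
))
```

The same pattern could be repeated over base and LDD items to compare the model's information-structure and acceptability ratings, as the studies do with human participants.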

View on arXiv
@article{cuneo2025_2505.09005,
  title={For GPT-4 as with Humans: Information Structure Predicts Acceptability of Long-Distance Dependencies},
  author={Nicole Cuneo and Eleanor Graves and Supantho Rakshit and Adele E. Goldberg},
  journal={arXiv preprint arXiv:2505.09005},
  year={2025}
}