Un-considering Contextual Information: Assessing LLMs' Understanding of Indexical Elements

1 June 2025

Main: 4 Pages

2 Figures

Bibliography: 2 Pages

12 Tables

Appendix: 12 Pages

Abstract

Large Language Models (LLMs) have demonstrated impressive performance on tasks related to coreference resolution. However, previous studies mostly assessed LLM performance on coreference resolution with nouns and third-person pronouns. This study evaluates LLM performance on coreference resolution with indexicals such as I, you, here, and tomorrow, which pose unique challenges due to their linguistic properties. We present the first study examining how LLMs interpret indexicals in English, releasing the English Indexical Dataset with 1600 multiple-choice questions. We evaluate pioneering LLMs, including GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, and DeepSeek V3. Our results reveal that LLMs perform impressively with some indexicals (I) while struggling with others (you, here, tomorrow), and that syntactic cues (e.g., quotation) improve LLM performance with some indexicals while reducing it with others. Code and data are available at: this https URL.


@article{oguz2025_2506.01089,
  title={Un-considering Contextual Information: Assessing LLMs' Understanding of Indexical Elements},
  author={Metehan Oguz and Yavuz Bakman and Duygu Nur Yaldiz},
  journal={arXiv preprint arXiv:2506.01089},
  year={2025}
}
