Rational Retrieval Acts: Leveraging Pragmatic Reasoning to Improve Sparse Retrieval

Current sparse neural information retrieval (IR) methods, and to a lesser extent more traditional models such as BM25, do not take into account the document collection and the complex interplay between different term weights when representing a single document. In this paper, we show how Rational Speech Acts (RSA), a linguistics framework used to minimize the number of features to be communicated when identifying an object in a set, can be adapted to the IR case, and in particular to its high number of potential features (here, tokens). RSA dynamically modulates token-document interactions by considering the influence of other documents in the dataset, which better contrasts document representations. Experiments show that incorporating RSA consistently improves multiple sparse retrieval models and achieves state-of-the-art performance on out-of-domain datasets from the BEIR benchmark.
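To make the idea of contrasting a document against the rest of the collection concrete, here is a minimal sketch of one round of the standard RSA recursion (literal listener, then pragmatic speaker) applied to a toy token-by-document weight matrix. It is an illustration under stated assumptions, not the paper's formulation: the function names `normalize` and `rsa_reweight`, the rationality parameter `alpha`, the uniform prior over documents, and the toy weights are all hypothetical.

```python
def normalize(values):
    """Scale non-negative values so they sum to 1 (all zeros stay zeros)."""
    total = sum(values)
    return [v / total if total > 0 else 0.0 for v in values]


def rsa_reweight(weights, alpha=1.0):
    """One round of standard RSA on a token-by-document weight matrix.

    weights[t][d] is a non-negative weight of token t in document d.
    Returns a matrix where entry [t][d] is the pragmatic-speaker
    probability of choosing token t to describe document d.
    Assumes a uniform prior over documents (an illustrative choice).
    """
    n_tokens = len(weights)
    n_docs = len(weights[0])
    # Literal listener: for each token, a distribution over documents.
    listener = [normalize(row) for row in weights]
    # Pragmatic speaker: for each document, a distribution over tokens
    # that favours tokens singling this document out from the others.
    speaker = [[0.0] * n_docs for _ in range(n_tokens)]
    for d in range(n_docs):
        column = normalize([listener[t][d] ** alpha for t in range(n_tokens)])
        for t in range(n_tokens):
            speaker[t][d] = column[t]
    return speaker


# Toy collection: rows are tokens, columns are two documents.
tokens = ["the", "retrieval", "pragmatics"]
weights = [
    [3.0, 3.0],  # "the": heavy in both documents, so not distinctive
    [2.0, 0.0],  # "retrieval": only in document 0
    [0.0, 2.0],  # "pragmatics": only in document 1
]
speaker = rsa_reweight(weights)
for token, row in zip(tokens, speaker):
    print(token, [round(p, 3) for p in row])
```

In the raw weights, "the" is the heaviest token of document 0. After the RSA step it falls below "retrieval", because a listener hearing "the" cannot tell the two documents apart. This is the sense in which other documents in the collection modulate a token's weight.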
@article{satouf2025_2505.03676,
  title={Rational Retrieval Acts: Leveraging Pragmatic Reasoning to Improve Sparse Retrieval},
  author={Arthur Satouf and Gabriel Ben Zenou and Benjamin Piwowarski and Habiboulaye Amadou Boubacar and Pablo Piantanida},
  journal={arXiv preprint arXiv:2505.03676},
  year={2025}
}