PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities

13 January 2024

Tankala Pavan Kalyan

Pushpak Bhattacharyya

Papers citing "PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities"

Re-evaluating Theory of Mind evaluation in large language models Jennifer Hu Felix Sosa T. Ullman 45 0 0 28 Feb 2025
Non-literal Understanding of Number Words by Language Models Polina Tsvilodub Kanishk Gandhi Haoran Zhao Jan-Philipp Fränken Michael Franke Noah D. Goodman ReLM 83 0 0 10 Feb 2025
QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs Mohammad Aflah Khan Neemesh Yadav Sarah Masud Md. Shad Akhtar 74 0 0 16 Dec 2024
Are LLMs good pragmatic speakers? Mingyue Jian Siddharth Narayanaswamy 23 1 0 03 Nov 2024
MultiPragEval: Multilingual Pragmatic Evaluation of Large Language Models Dojun Park Jiwoo Lee Seohyun Park Hyeyun Jeong Youngeun Koo Soonha Hwang Seonwoo Park Sungeun Lee ELM 26 1 0 11 Jun 2024
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions Polina Tsvilodub Paul Marty Sonia Ramotowska Jacopo Romoli Michael Franke 32 0 0 09 May 2024
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness? Ryan Liu T. Sumers Ishita Dasgupta Thomas L. Griffiths LLMAG 40 13 0 11 Feb 2024
Evaluating large language models' ability to understand metaphor and sarcasm using a screening test for Asperger syndrome Hiromu Yakura AI4MH 24 0 0 19 Sep 2023
Leveraging Large Language Models for Multiple Choice Question Answering Joshua Robinson Christopher Rytting David Wingate ELM 143 186 0 22 Oct 2022
Fine-Grained Detection of Solidarity for Women and Migrants in 155 Years of German Parliamentary Debates Aida Kostikova Benjamin Paassen Dominik Beese Ole Putz Gregor Wiedemann Steffen Eger 35 3 0 09 Oct 2022
FLUTE: Figurative Language Understanding through Textual Explanations Tuhin Chakrabarty Arkadiy Saakyan Debanjan Ghosh Smaranda Muresan 46 66 0 24 May 2022
NOPE: A Corpus of Naturally-Occurring Presuppositions in English Alicia Parrish Sebastian Schuster Alex Warstadt Omar Agha Soo-hwan Lee Zhuoye Zhao Sam Bowman Tal Linzen LRM 34 23 0 14 Sep 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies Mor Geva Daniel Khashabi Elad Segal Tushar Khot Dan Roth Jonathan Berant RALM 250 673 0 06 Jan 2021
Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering Najoung Kim Ellie Pavlick Burcu Karagol Ayan Deepak Ramachandran 70 43 0 02 Jan 2021