Language models are not naysayers: An analysis of language models on negation benchmarks

14 June 2023

Karin Verspoor

Papers citing "Language models are not naysayers: An analysis of language models on negation benchmarks"

50 / 54 papers shown

Title
Music Recommendation with Large Language Models: Challenges, Opportunities, and Evaluation Elena V. Epure Yashar Deldjoo Bruno Sguerra Markus Schedl Manuel Moussallam 96 0 0 20 Nov 2025
Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives Kentaro Ozeki Risako Ando Takanobu Morishita Hirohiko Abe K. Mineshima Mitsuhiro Okada LRM 98 0 0 30 Oct 2025
Beyond Understanding: Evaluating the Pragmatic Gap in LLMs' Cultural Processing of Figurative Language Mena Attia Aashiq Muhamed Mai AlKhamissi Thamar Solorio Mona Diab 57 0 0 27 Oct 2025
Learning "Partner-Aware" Collaborators in Multi-Party Collaboration Abhijnan Nath Nikhil Krishnaswamy 94 0 0 26 Oct 2025
The Impact of Negated Text on Hallucination with Large Language Models Jaehyung Seo Hyeonseok Moon Heuiseok Lim 92 0 0 23 Oct 2025
FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025 Debarpan Bhattacharya Apoorva Kulkarni Sriram Ganapathy 225 0 0 20 Sep 2025
SinhalaMMLU: A Comprehensive Benchmark for Evaluating Multitask Language Understanding in Sinhala Ashmari Pramodya Nirasha Nelki Heshan Shalinda Chamila Liyanage Yusuke Sakai Randil Pushpananda Ruvan Weerasinghe Hidetaka Kamigaito Taro Watanabe LRM 79 0 0 03 Sep 2025
QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting Nicole Cho William Watson Alec Koppel Sumitra Ganesh Manuela Veloso AAML 120 0 0 22 Aug 2025
Large Language Models Do Not Simulate Human Psychology Sarah Schröder Thekla Morgenroth Ulrike Kuhl Valerie Vaquet Benjamin Paaßen 112 7 0 09 Aug 2025
Thunder-NUBench: A Benchmark for LLMs' Sentence-Level Negation Understanding Yeonkyoung So Gyuseong Lee Sungmok Jung Joonhak Lee JiA Kang Sangho Kim Jaejin Lee 171 1 0 17 Jun 2025
Reasoning Models Are More Easily Gaslighted Than You Think B. Zhu Hailong Yin Yue Yu Yu Jiang LRM 196 2 0 11 Jun 2025
TNG-CLIP: Training-Time Negation Data Generation for Negation Awareness of CLIP Yuliang Cai Jesse Thomason Mohammad Rostami VLM 183 0 0 24 May 2025
Reasoning Capabilities and Invariability of Large Language Models Alessandro Raganato Rafael Peñaloza Marco Viviani G. Pasi ReLM LRM 318 0 0 01 May 2025
Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs Pengkun Jiao Bin Zhu Yue Yu Chong-Wah Ngo Yu Jiang 237 3 0 13 Apr 2025
Negation: A Pink Elephant in the Large Language Models' Room? Tereza Vrabcová Marek Kadlcík Petr Sojka Michal Štefánik Michal Spiegel 407 2 0 28 Mar 2025
From No to Know: Taxonomy, Challenges, and Opportunities for Negation Understanding in Multimodal Foundation Models Mayank Vatsa Aparna Bharati S. Mittal Richa Singh 304 0 0 10 Feb 2025
Benchmarking Gaslighting Negation Attacks Against Multimodal Large Language Models Bin Zhu Hui yan Qi Yinxuan Gui Yue Yu Chong-Wah Ngo Ee-Peng Lim 987 5 0 31 Jan 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025 Elie Antoine Frédéric Béchet Géraldine Damnati Philippe Langlais 308 1 0 29 Jan 2025
Generating Diverse Negations from Affirmative Sentences Darian Rodriguez Vasquez Afroditi Papadaki 197 0 0 30 Oct 2024
Is artificial intelligence still intelligence? LLMs generalize to novel adjective-noun pairs, but don't mimic the full human distribution Hayley Ross Kathryn Davidson Najoung Kim 170 4 0 23 Oct 2024
Are LLMs Models of Distributional Semantics? A Case Study on Quantifiers Zhang Enyan Zewei Wang Michael A. Lepori Ellie Pavlick Helena Aparicio 252 3 0 17 Oct 2024
On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Xiaonan Jing Srinivas Billa Danny Godbout HILM 287 3 0 16 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 Thomas Palmeira Ferraz Kartik Mehta Yu-Hsiang Lin Haw-Shiuan Chang Shereen Oraby Sijia Liu Vivek Subramanian Tagyoung Chung Mohit Bansal Nanyun Peng 223 23 0 09 Oct 2024
Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 Pritika Ramu Koustava Goswami Apoorv Saxena Balaji Vasan Srinivasan 217 10 0 25 Sep 2024
Controlled LLM-based Reasoning for Clinical Trial Retrieval Mael Jullien Alex Bogatu Harriet Unsworth André Freitas LRM 185 3 0 19 Sep 2024
NeIn: Telling What You Don't Want Nhat-Tan Bui Dinh-Hieu Hoang Quoc-Huy Trinh Minh-Triet Tran Truong Nguyen Susan Gauch 324 2 0 09 Sep 2024
Animate, or Inanimate, That is the Question for Large Language Models Leonardo Ranaldi Giulia Pucci Fabio Massimo Zanzotto 167 0 0 12 Aug 2024
Can LLMs Replace Manual Annotation of Software Engineering Artifacts? IEEE Working Conference on Mining Software Repositories (MSR), 2024 Toufique Ahmed Premkumar Devanbu Christoph Treude Michael Pradel 287 43 0 10 Aug 2024
How and where does CLIP process negation? Vincent Quantmeyer Pablo Mosteiro Albert Gatt CoGe 178 10 0 15 Jul 2024
Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain Davide Mazzaccara A. Testoni Raffaella Bernardi 175 14 0 25 Jun 2024
Is this a bad table? A Closer Look at the Evaluation of Table Generation from Text Pritika Ramu Aparna Garimella Sambaran Bandyopadhyay LMTD 170 8 0 21 Jun 2024
Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination Jongyoon Song Sangwon Yu Sungroh Yoon HILM 97 6 0 20 Jun 2024
Bag of Lies: Robustness in Continuous Pre-training BERT I. Gevers Walter Daelemans 186 1 0 14 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding MohammadHossein Rezaei Eduardo Blanco 158 4 0 11 Jun 2024
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation Neeraj Varshney Satyam Raj Venkatesh Mishra Agneet Chatterjee Ritika Sarkar Amir Saeidi Chitta Baral LRM 213 18 0 08 Jun 2024
Large Language Models Lack Understanding of Character Composition of Words Andrew Shin Kunitake Kaneko 337 17 0 18 May 2024
Challenges and Opportunities in Text Generation Explainability Kenza Amara Rita Sevastjanova Mennatallah El-Assady SILM 159 3 0 14 May 2024
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions Polina Tsvilodub Paul Marty Sonia Ramotowska Jacopo Romoli Michael Franke 143 4 0 09 May 2024
Language in Vivo vs. in Silico: Size Matters but Larger Language Models Still Do Not Comprehend Language on a Par with Humans Due to Impenetrable Semantic Reference Vittoria Dentella Fritz Guenther Evelina Leivada ELM 302 5 0 23 Apr 2024
Revisiting subword tokenization: A case study on affixal negation in large language models North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Thinh Hung Truong Yulia Otmakhova Karin Verspoor Trevor Cohn Timothy Baldwin 178 3 0 03 Apr 2024
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey Philipp Mondorf Barbara Plank ELM LRM LM&MA 288 85 0 02 Apr 2024
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning Philipp Mondorf Barbara Plank LRM 283 14 0 20 Feb 2024
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic Fajri Koto Jinyan Su Sara Shatnawi Jad Doughman Abdelrahman Boda Sadallah ... Neha Sengupta Shady Shehata Farah E. Shamout Preslav Nakov Timothy Baldwin ELM LRM 269 71 0 20 Feb 2024
Strong hallucinations from negation and how to fix them Nicholas Asher Swarnadeep Bhar ReLM LRM 136 9 0 16 Feb 2024
SyntaxShap: Syntax-aware Explainability Method for Text Generation Kenza Amara Rita Sevastjanova Mennatallah El-Assady 148 5 0 14 Feb 2024
Exploring Group and Symmetry Principles in Large Language Models Shima Imani Hamid Palangi LRM 168 1 0 09 Feb 2024
Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics Conference on Computational Natural Language Learning (CoNLL), 2023 Yuhan Zhang Edward Gibson Forrest Davis 260 7 0 02 Nov 2023
Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Mengyu Ye Tatsuki Kuribayashi Jun Suzuki Goro Kobayashi Hiroaki Funayama LRM 193 10 0 23 Oct 2023
Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges Thilo Spinner Rebecca Kehlbeck Rita Sevastjanova Tobias Stähle Daniel A. Keim Oliver Deussen Andreas Spitz Mennatallah El-Assady 172 3 0 17 Oct 2023
Trustworthy Formal Natural Language Specifications SIGPLAN symposium on New ideas, new paradigms, and reflections on programming and software (Onward!), 2023 Colin S. Gordon Sergey Matskevich HILM 129 3 0 05 Oct 2023