Language models are not naysayers: An analysis of language models on negation benchmarks
Thinh Hung Truong, Timothy Baldwin, Karin Verspoor, Trevor Cohn
14 June 2023 (arXiv:2306.08189)

Papers citing "Language models are not naysayers: An analysis of language models on negation benchmarks"

50 of 51 citing papers shown; topic tags appear in brackets.
  • Reasoning Capabilities and Invariability of Large Language Models. Alessandro Raganato, Rafael Peñaloza, Marco Viviani, G. Pasi. 01 May 2025. [ReLM, LRM]
  • Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs. Pengkun Jiao, Bin Zhu, Jingjing Chen, Chong-Wah Ngo, Yu Jiang. 13 Apr 2025.
  • Negation: A Pink Elephant in the Large Language Models' Room? Tereza Vrabcová, Marek Kadlcík, Petr Sojka, Michal Štefánik, Michal Spiegel. 28 Mar 2025.
  • From No to Know: Taxonomy, Challenges, and Opportunities for Negation Understanding in Multimodal Foundation Models. Mayank Vatsa, Aparna Bharati, S. Mittal, Richa Singh. 10 Feb 2025.
  • Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation. Bin Zhu, Hui yan Qi, Yinxuan Gui, Jingjing Chen, Chong-Wah Ngo, Ee-Peng Lim. 31 Jan 2025.
  • A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks. Elie Antoine, Frédéric Béchet, Géraldine Damnati, Philippe Langlais. 29 Jan 2025.
  • Generating Diverse Negations from Affirmative Sentences. Darian Rodriguez Vasquez, Afroditi Papadaki. 30 Oct 2024.
  • Is artificial intelligence still intelligence? LLMs generalize to novel adjective-noun pairs, but don't mimic the full human distribution. Hayley Ross, Kathryn Davidson, Najoung Kim. 23 Oct 2024.
  • Are LLMs Models of Distributional Semantics? A Case Study on Quantifiers. Zhang Enyan, Zewei Wang, Michael A. Lepori, Ellie Pavlick, Helena Aparicio. 17 Oct 2024.
  • On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation. Xiaonan Jing, Srinivas Billa, Danny Godbout. 16 Oct 2024. [HILM]
  • LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints. Thomas Palmeira Ferraz, Kartik Mehta, Yu-Hsiang Lin, Haw-Shiuan Chang, Shereen Oraby, Sijia Liu, Vivek Subramanian, Tagyoung Chung, Mohit Bansal, Nanyun Peng. 09 Oct 2024.
  • Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition. Pritika Ramu, Koustava Goswami, Apoorv Saxena, Balaji Vasan Srinivavsan. 25 Sep 2024.
  • Controlled LLM-based Reasoning for Clinical Trial Retrieval. Mael Jullien, Alex Bogatu, Harriet Unsworth, André Freitas. 19 Sep 2024. [LRM]
  • NeIn: Telling What You Don't Want. Nhat-Tan Bui, Dinh-Hieu Hoang, Quoc-Huy Trinh, Minh-Triet Tran, Truong Nguyen, Susan Gauch. 09 Sep 2024.
  • Animate, or Inanimate, That is the Question for Large Language Models. Leonardo Ranaldi, Giulia Pucci, Fabio Massimo Zanzotto. 12 Aug 2024.
  • Can LLMs Replace Manual Annotation of Software Engineering Artifacts? Toufique Ahmed, Premkumar Devanbu, Christoph Treude, Michael Pradel. 10 Aug 2024.
  • How and where does CLIP process negation? Vincent Quantmeyer, Pablo Mosteiro, Albert Gatt. 15 Jul 2024. [CoGe]
  • Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain. Davide Mazzaccara, A. Testoni, Raffaella Bernardi. 25 Jun 2024.
  • Is this a bad table? A Closer Look at the Evaluation of Table Generation from Text. Pritika Ramu, Aparna Garimella, Sambaran Bandyopadhyay. 21 Jun 2024. [LMTD]
  • Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination. Jongyoon Song, Sangwon Yu, Sungroh Yoon. 20 Jun 2024. [HILM]
  • Bag of Lies: Robustness in Continuous Pre-training BERT. I. Gevers, Walter Daelemans. 14 Jun 2024.
  • Paraphrasing in Affirmative Terms Improves Negation Understanding. MohammadHossein Rezaei, Eduardo Blanco. 11 Jun 2024.
  • Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation. Neeraj Varshney, Satyam Raj, Venkatesh Mishra, Agneet Chatterjee, Ritika Sarkar, Amir Saeidi, Chitta Baral. 08 Jun 2024. [LRM]
  • Large Language Models Lack Understanding of Character Composition of Words. Andrew Shin, Kunitake Kaneko. 18 May 2024.
  • Challenges and Opportunities in Text Generation Explainability. Kenza Amara, R. Sevastjanova, Mennatallah El-Assady. 14 May 2024. [SILM]
  • Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions. Polina Tsvilodub, Paul Marty, Sonia Ramotowska, Jacopo Romoli, Michael Franke. 09 May 2024.
  • Language in Vivo vs. in Silico: Size Matters but Larger Language Models Still Do Not Comprehend Language on a Par with Humans. Vittoria Dentella, Fritz Guenther, Evelina Leivada. 23 Apr 2024. [ELM]
  • Revisiting subword tokenization: A case study on affixal negation in large language models. Thinh Hung Truong, Yulia Otmakhova, Karin Verspoor, Trevor Cohn, Timothy Baldwin. 03 Apr 2024.
  • Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey. Philipp Mondorf, Barbara Plank. 02 Apr 2024. [ELM, LRM, LM&MA]
  • Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning. Philipp Mondorf, Barbara Plank. 20 Feb 2024. [LRM]
  • ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic. Fajri Koto, Haonan Li, Sara Shatnawi, Jad Doughman, Abdelrahman Boda Sadallah, ..., Neha Sengupta, Shady Shehata, Nizar Habash, Preslav Nakov, Timothy Baldwin. 20 Feb 2024. [ELM, LRM]
  • Strong hallucinations from negation and how to fix them. Nicholas Asher, Swarnadeep Bhar. 16 Feb 2024. [ReLM, LRM]
  • SyntaxShap: Syntax-aware Explainability Method for Text Generation. Kenza Amara, R. Sevastjanova, Mennatallah El-Assady. 14 Feb 2024.
  • Exploring Group and Symmetry Principles in Large Language Models. Shima Imani, Hamid Palangi. 09 Feb 2024. [LRM]
  • Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics. Yuhan Zhang, Edward Gibson, Forrest Davis. 02 Nov 2023.
  • Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism. Mengyu Ye, Tatsuki Kuribayashi, Jun Suzuki, Goro Kobayashi, Hiroaki Funayama. 23 Oct 2023. [LRM]
  • Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges. Thilo Spinner, Rebecca Kehlbeck, R. Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Andreas Spitz, Mennatallah El-Assady. 17 Oct 2023.
  • Trustworthy Formal Natural Language Specifications. Colin S. Gordon, Sergey Matskevich. 05 Oct 2023. [HILM]
  • Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings. Chen Cecilia Liu, Fajri Koto, Timothy Baldwin, Iryna Gurevych. 15 Sep 2023. [LRM]
  • Representation Synthesis by Probabilistic Many-Valued Logic Operation in Self-Supervised Learning. Hiroki Nakamura, Masashi Okada, T. Taniguchi. 08 Sep 2023. [SSL, NAI]
  • Not wacky vs. definitely wacky: A study of scalar adverbs in pretrained language models. Isabelle Lorge, J. Pierrehumbert. 25 May 2023.
  • Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds. Victoria Basmov, Yoav Goldberg, Reut Tsarfaty. 24 May 2023. [ReLM, LRM]
  • Leveraging Large Language Models for Multiple Choice Question Answering. Joshua Robinson, Christopher Rytting, David Wingate. 22 Oct 2022. [ELM]
  • Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation. Thinh Hung Truong, Yulia Otmakhova, Tim Baldwin, Trevor Cohn, Jey Han Lau, Karin Verspoor. 06 Oct 2022.
  • Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts. Joel Jang, Seonghyeon Ye, Minjoon Seo. 26 Sep 2022. [ELM, LRM]
  • Life after BERT: What do Other Muppets Understand about Language? Vladislav Lialin, Kevin Zhao, Namrata Shivagunde, Anna Rumshisky. 21 May 2022.
  • Training language models to follow instructions with human feedback. Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe. 04 Mar 2022. [OSLM, ALM]
  • The Power of Scale for Parameter-Efficient Prompt Tuning. Brian Lester, Rami Al-Rfou, Noah Constant. 18 Apr 2021. [VPVLM]
  • The Pile: An 800GB Dataset of Diverse Text for Language Modeling. Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy. 31 Dec 2020. [AIMat]
  • Language Models as Knowledge Bases? Fabio Petroni, Tim Rocktaschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel. 03 Sep 2019. [KELM, AI4MH]