2306.08189
Language models are not naysayers: An analysis of language models on negation benchmarks
14 June 2023
Thinh Hung Truong
Timothy Baldwin
Karin Verspoor
Trevor Cohn
Papers citing "Language models are not naysayers: An analysis of language models on negation benchmarks"
50 / 54 papers shown
Title
Music Recommendation with Large Language Models: Challenges, Opportunities, and Evaluation
Elena V. Epure
Yashar Deldjoo
Bruno Sguerra
Markus Schedl
Manuel Moussallam
96
0
0
20 Nov 2025
Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives
Kentaro Ozeki
Risako Ando
Takanobu Morishita
Hirohiko Abe
K. Mineshima
Mitsuhiro Okada
LRM
98
0
0
30 Oct 2025
Beyond Understanding: Evaluating the Pragmatic Gap in LLMs' Cultural Processing of Figurative Language
Mena Attia
Aashiq Muhamed
Mai AlKhamissi
Thamar Solorio
Mona Diab
57
0
0
27 Oct 2025
Learning "Partner-Aware" Collaborators in Multi-Party Collaboration
Abhijnan Nath
Nikhil Krishnaswamy
94
0
0
26 Oct 2025
The Impact of Negated Text on Hallucination with Large Language Models
Jaehyung Seo
Hyeonseok Moon
Heuiseok Lim
92
0
0
23 Oct 2025
FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Debarpan Bhattacharya
Apoorva Kulkarni
Sriram Ganapathy
225
0
0
20 Sep 2025
SinhalaMMLU: A Comprehensive Benchmark for Evaluating Multitask Language Understanding in Sinhala
Ashmari Pramodya
Nirasha Nelki
Heshan Shalinda
Chamila Liyanage
Yusuke Sakai
Randil Pushpananda
Ruvan Weerasinghe
Hidetaka Kamigaito
Taro Watanabe
LRM
79
0
0
03 Sep 2025
QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting
Nicole Cho
William Watson
Alec Koppel
Sumitra Ganesh
Manuela Veloso
AAML
120
0
0
22 Aug 2025
Large Language Models Do Not Simulate Human Psychology
Sarah Schröder
Thekla Morgenroth
Ulrike Kuhl
Valerie Vaquet
Benjamin Paaßen
112
7
0
09 Aug 2025
Thunder-NUBench: A Benchmark for LLMs' Sentence-Level Negation Understanding
Yeonkyoung So
Gyuseong Lee
Sungmok Jung
Joonhak Lee
JiA Kang
Sangho Kim
Jaejin Lee
171
1
0
17 Jun 2025
Reasoning Models Are More Easily Gaslighted Than You Think
B. Zhu
Hailong Yin
Yue Yu
Yu Jiang
LRM
196
2
0
11 Jun 2025
TNG-CLIP: Training-Time Negation Data Generation for Negation Awareness of CLIP
Yuliang Cai
Jesse Thomason
Mohammad Rostami
VLM
183
0
0
24 May 2025
Reasoning Capabilities and Invariability of Large Language Models
Alessandro Raganato
Rafael Peñaloza
Marco Viviani
G. Pasi
ReLM
LRM
318
0
0
01 May 2025
Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs
Pengkun Jiao
Bin Zhu
Yue Yu
Chong-Wah Ngo
Yu Jiang
237
3
0
13 Apr 2025
Negation: A Pink Elephant in the Large Language Models' Room?
Tereza Vrabcová
Marek Kadlčík
Petr Sojka
Michal Štefánik
Michal Spiegel
407
2
0
28 Mar 2025
From No to Know: Taxonomy, Challenges, and Opportunities for Negation Understanding in Multimodal Foundation Models
Mayank Vatsa
Aparna Bharati
S. Mittal
Richa Singh
304
0
0
10 Feb 2025
Benchmarking Gaslighting Negation Attacks Against Multimodal Large Language Models
Bin Zhu
Hui yan Qi
Yinxuan Gui
Yue Yu
Chong-Wah Ngo
Ee-Peng Lim
987
5
0
31 Jan 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Elie Antoine
Frédéric Béchet
Géraldine Damnati
Philippe Langlais
308
1
0
29 Jan 2025
Generating Diverse Negations from Affirmative Sentences
Darian Rodriguez Vasquez
Afroditi Papadaki
197
0
0
30 Oct 2024
Is artificial intelligence still intelligence? LLMs generalize to novel adjective-noun pairs, but don't mimic the full human distribution
Hayley Ross
Kathryn Davidson
Najoung Kim
170
4
0
23 Oct 2024
Are LLMs Models of Distributional Semantics? A Case Study on Quantifiers
Zhang Enyan
Zewei Wang
Michael A. Lepori
Ellie Pavlick
Helena Aparicio
252
3
0
17 Oct 2024
On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Xiaonan Jing
Srinivas Billa
Danny Godbout
HILM
287
3
0
16 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Thomas Palmeira Ferraz
Kartik Mehta
Yu-Hsiang Lin
Haw-Shiuan Chang
Shereen Oraby
Sijia Liu
Vivek Subramanian
Tagyoung Chung
Mohit Bansal
Nanyun Peng
223
23
0
09 Oct 2024
Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Pritika Ramu
Koustava Goswami
Apoorv Saxena
Balaji Vasan Srinivasan
217
10
0
25 Sep 2024
Controlled LLM-based Reasoning for Clinical Trial Retrieval
Mael Jullien
Alex Bogatu
Harriet Unsworth
André Freitas
LRM
185
3
0
19 Sep 2024
NeIn: Telling What You Don't Want
Nhat-Tan Bui
Dinh-Hieu Hoang
Quoc-Huy Trinh
Minh-Triet Tran
Truong Nguyen
Susan Gauch
324
2
0
09 Sep 2024
Animate, or Inanimate, That is the Question for Large Language Models
Leonardo Ranaldi
Giulia Pucci
Fabio Massimo Zanzotto
167
0
0
12 Aug 2024
Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
IEEE Working Conference on Mining Software Repositories (MSR), 2024
Toufique Ahmed
Premkumar Devanbu
Christoph Treude
Michael Pradel
287
43
0
10 Aug 2024
How and where does CLIP process negation?
Vincent Quantmeyer
Pablo Mosteiro
Albert Gatt
CoGe
178
10
0
15 Jul 2024
Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain
Davide Mazzaccara
A. Testoni
Raffaella Bernardi
175
14
0
25 Jun 2024
Is this a bad table? A Closer Look at the Evaluation of Table Generation from Text
Pritika Ramu
Aparna Garimella
Sambaran Bandyopadhyay
LMTD
170
8
0
21 Jun 2024
Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination
Jongyoon Song
Sangwon Yu
Sungroh Yoon
HILM
97
6
0
20 Jun 2024
Bag of Lies: Robustness in Continuous Pre-training BERT
I. Gevers
Walter Daelemans
186
1
0
14 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding
MohammadHossein Rezaei
Eduardo Blanco
158
4
0
11 Jun 2024
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation
Neeraj Varshney
Satyam Raj
Venkatesh Mishra
Agneet Chatterjee
Ritika Sarkar
Amir Saeidi
Chitta Baral
LRM
213
18
0
08 Jun 2024
Large Language Models Lack Understanding of Character Composition of Words
Andrew Shin
Kunitake Kaneko
337
17
0
18 May 2024
Challenges and Opportunities in Text Generation Explainability
Kenza Amara
Rita Sevastjanova
Mennatallah El-Assady
SILM
159
3
0
14 May 2024
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions
Polina Tsvilodub
Paul Marty
Sonia Ramotowska
Jacopo Romoli
Michael Franke
143
4
0
09 May 2024
Language in Vivo vs. in Silico: Size Matters but Larger Language Models Still Do Not Comprehend Language on a Par with Humans Due to Impenetrable Semantic Reference
Vittoria Dentella
Fritz Guenther
Evelina Leivada
ELM
302
5
0
23 Apr 2024
Revisiting subword tokenization: A case study on affixal negation in large language models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Thinh Hung Truong
Yulia Otmakhova
Karin Verspoor
Trevor Cohn
Timothy Baldwin
178
3
0
03 Apr 2024
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
Philipp Mondorf
Barbara Plank
ELM
LRM
LM&MA
288
85
0
02 Apr 2024
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
Philipp Mondorf
Barbara Plank
LRM
283
14
0
20 Feb 2024
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic
Fajri Koto
Jinyan Su
Sara Shatnawi
Jad Doughman
Abdelrahman Boda Sadallah
...
Neha Sengupta
Shady Shehata
Farah E. Shamout
Preslav Nakov
Timothy Baldwin
ELM
LRM
269
71
0
20 Feb 2024
Strong hallucinations from negation and how to fix them
Nicholas Asher
Swarnadeep Bhar
ReLM
LRM
136
9
0
16 Feb 2024
SyntaxShap: Syntax-aware Explainability Method for Text Generation
Kenza Amara
Rita Sevastjanova
Mennatallah El-Assady
148
5
0
14 Feb 2024
Exploring Group and Symmetry Principles in Large Language Models
Shima Imani
Hamid Palangi
LRM
168
1
0
09 Feb 2024
Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics
Conference on Computational Natural Language Learning (CoNLL), 2023
Yuhan Zhang
Edward Gibson
Forrest Davis
260
7
0
02 Nov 2023
Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mengyu Ye
Tatsuki Kuribayashi
Jun Suzuki
Goro Kobayashi
Hiroaki Funayama
LRM
193
10
0
23 Oct 2023
Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges
Thilo Spinner
Rebecca Kehlbeck
Rita Sevastjanova
Tobias Stähle
Daniel A. Keim
Oliver Deussen
Andreas Spitz
Mennatallah El-Assady
172
3
0
17 Oct 2023
Trustworthy Formal Natural Language Specifications
SIGPLAN symposium on New ideas, new paradigms, and reflections on programming and software (Onward!), 2023
Colin S. Gordon
Sergey Matskevich
HILM
129
3
0
05 Oct 2023