CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation

1 November 2022

Papers citing "CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation"


Title
FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation | Yulia Otmakhova, Hung Thinh Truong, Rahmad Mahendra, Zenan Zhai, Rongxin Zhu, Daniel Beck, Jey Han Lau | ELM | 59 0 0 | 24 Apr 2025
Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing | Jihyun Janice Ahn, Wenpeng Yin | SILM LRM | 58 1 0 | 02 Apr 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks | Elie Antoine, Frédéric Béchet, Géraldine Damnati, Philippe Langlais | | 51 1 0 | 29 Jan 2025
WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case | Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher, Dietrich Klakow | | 46 3 0 | 09 Sep 2024
NeIn: Telling What You Don't Want | Nhat-Tan Bui, Dinh-Hieu Hoang, Quoc-Huy Trinh, Minh-Triet Tran, Truong Nguyen, Susan Gauch | | 36 2 0 | 09 Sep 2024
Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain | Davide Mazzaccara, A. Testoni, Raffaella Bernardi | | 21 2 0 | 25 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding | MohammadHossein Rezaei, Eduardo Blanco | | 37 1 0 | 11 Jun 2024
Natural Language Processing RELIES on Linguistics | Juri Opitz, Shira Wein, Nathan Schneider | AI4CE | 44 7 0 | 09 May 2024
Interpreting Answers to Yes-No Questions in Dialogues from Multiple Domains | Zijie Wang, Farzana Rashid, Eduardo Blanco | | 27 0 0 | 25 Apr 2024
Characterizing LLM Abstention Behavior in Science QA with Context Perturbations | Bingbing Wen, Bill Howe, Lucy Lu Wang | | 23 8 0 | 18 Apr 2024
Revisiting subword tokenization: A case study on affixal negation in large language models | Thinh Hung Truong, Yulia Otmakhova, Karin Verspoor, Trevor Cohn, Timothy Baldwin | | 29 2 0 | 03 Apr 2024
Revealing Trends in Datasets from the 2022 ACL and EMNLP Conferences | Jesse Atuhurra, Hidetaka Kamigaito | | 36 0 0 | 31 Mar 2024
Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding | Ha-Thanh Nguyen, Ken Satoh | | 42 2 0 | 02 Mar 2024
Multi-dimensional Evaluation of Empathetic Dialog Responses | Zhichao Xu, Jiepu Jiang | | 26 3 0 | 18 Feb 2024
Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension | Akira Kawabata, Saku Sugawara | ELM | 17 5 0 | 30 Nov 2023
Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness | Ashim Gupta, Rishanth Rajendhran, Nathan Stringham, Vivek Srikumar, Ana Marasović | AAML | 31 3 0 | 16 Nov 2023
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning | Nishant Balepur, Shramay Palta, Rachel Rudinger | LRM | 10 7 0 | 13 Nov 2023
Can You Follow Me? Testing Situational Understanding in ChatGPT | Chenghao Yang, Allyson Ettinger | LRM LLMAG ELM | 107 4 0 | 24 Oct 2023
How Much Consistency Is Your Accuracy Worth? | Jacob K. Johnson, Ana Marasović | | 9 0 0 | 20 Oct 2023
Evaluating Paraphrastic Robustness in Textual Entailment Models | Dhruv Verma, Yash Kumar Lal, Shreyashee Sinha, Benjamin Van Durme, Adam Poliak | | 17 5 0 | 29 Jun 2023
Language models are not naysayers: An analysis of language models on negation benchmarks | Thinh Hung Truong, Timothy Baldwin, Karin Verspoor, Trevor Cohn | | 22 54 0 | 14 Jun 2023
Beyond Positive Scaling: How Negation Impacts Scaling Trends of Language Models | Yuhui Zhang, Michihiro Yasunaga, Zhengping Zhou, Jeff Z. HaoChen, James Y. Zou, Percy Liang, Serena Yeung | | 30 7 0 | 27 May 2023
NevIR: Negation in Neural Information Retrieval | Orion Weller, Dawn J Lawrie, Benjamin Van Durme | MU | 54 18 0 | 12 May 2023
Say What You Mean! Large Language Models Speak Too Positively about Negative Commonsense Knowledge | Jiangjie Chen, Wei Shi, Ziquan Fu, Sijie Cheng, Lei Li, Yanghua Xiao | | 21 46 0 | 10 May 2023
Natural Language Reasoning, A Survey | Fei Yu, Hongbo Zhang, Prayag Tiwari, Benyou Wang | ReLM LRM | 28 49 0 | 26 Mar 2023
Large Language Models Can Be Easily Distracted by Irrelevant Context | Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed H. Chi, Nathanael Scharli, Denny Zhou | ReLM RALM LRM | 28 526 0 | 31 Jan 2023
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning | Shayne Longpre, Le Hou, Tu Vu, Albert Webson, Hyung Won Chung, ..., Denny Zhou, Quoc V. Le, Barret Zoph, Jason W. Wei, Adam Roberts | ALM | 22 621 0 | 31 Jan 2023
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM ALM | 303 11,881 0 | 04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou | LM&Ro LRM AI4CE ReLM | 315 8,402 0 | 28 Jan 2022
FLEX: Unifying Evaluation for Few-Shot NLP | Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy | | 197 104 0 | 15 Jul 2021
Multi-task Learning of Negation and Speculation for Targeted Sentiment Classification | Andrew Moore, Jeremy Barnes | | 33 9 0 | 16 Oct 2020