Benchmarks for Automated Commonsense Reasoning: A Survey

9 February 2023

Papers citing "Benchmarks for Automated Commonsense Reasoning: A Survey"


Title | Authors | Tags | Counts | Date
DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts | Tobias Braun, Mark Rothermel, Marcus Rohrbach, Anna Rohrbach | - | 83 1 0 | 13 Dec 2024
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents | Zhe Su, Xuhui Zhou, Sanketh Rangreji, Anubha Kabra, Julia Mendelsohn, Faeze Brahman, Maarten Sap | LLMAG | 95 2 0 | 13 Sep 2024
ACCORD: Closing the Commonsense Measurability Gap | François Roewer-Després, Jinyue Feng, Zining Zhu, Frank Rudzicz | LRM | 34 0 0 | 04 Jun 2024
An Overview Of Temporal Commonsense Reasoning and Acquisition | Georg Wenzel, Adam Jatowt | ReLM LRM | 16 8 0 | 28 Jul 2023
Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation | Thinh Hung Truong, Yulia Otmakhova, Tim Baldwin, Trevor Cohn, Jey Han Lau, Karin Verspoor | - | 55 21 0 | 06 Oct 2022
Possible Stories: Evaluating Situated Commonsense Reasoning under Multiple Possible Scenarios | Mana Ashida, Saku Sugawara | - | 51 6 0 | 16 Sep 2022
Housekeep: Tidying Virtual Households using Commonsense Reasoning | Yash Kant, Arun Ramachandran, Sriram Yenamandra, Igor Gilitschenski, Dhruv Batra, Andrew Szot, Harsh Agrawal | LM&Ro LRM | 152 70 0 | 22 May 2022
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning | Jack Hessel, Jena D. Hwang, J. Park, Rowan Zellers, Chandra Bhagavatula, Anna Rohrbach, Kate Saenko, Yejin Choi | ReLM | 145 48 0 | 10 Feb 2022
Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey | Prajjwal Bhargava, Vincent Ng | ReLM LRM | 32 61 0 | 28 Jan 2022
OPEn: An Open-ended Physics Environment for Learning Without a Task | Chuang Gan, Abhishek Bhandwaldar, Antonio Torralba, J. Tenenbaum, Phillip Isola | LRM | 122 4 0 | 13 Oct 2021
NOPE: A Corpus of Naturally-Occurring Presuppositions in English | Alicia Parrish, Sebastian Schuster, Alex Warstadt, Omar Agha, Soo-hwan Lee, Zhuoye Zhao, Sam Bowman, Tal Linzen | LRM | 28 23 0 | 14 Sep 2021
Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning | Da Yin, Liunian Harold Li, Ziniu Hu, Nanyun Peng, Kai-Wei Chang | - | 83 52 0 | 14 Sep 2021
Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding | Shane Storks, Qiaozi Gao, Yichi Zhang, J. Chai | ReLM LRM | 34 22 0 | 10 Sep 2021
Baby Intuitions Benchmark (BIB): Discerning the goals, preferences, and actions of others | Kanishk Gandhi, Gala Stojnic, Brenden Lake, M. Dillon | - | 44 46 0 | 23 Feb 2021
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge | Bill Yuchen Lin, Ziyi Wu, Yichi Yang, Dong-Ho Lee, Xiang Ren | ReLM LRM | 233 62 0 | 02 Jan 2021
Temporal Reasoning on Implicit Events from Distant Supervision | Ben Zhou, Kyle Richardson, Qiang Ning, Tushar Khot, Ashish Sabharwal, Dan Roth | - | 150 73 0 | 24 Oct 2020
Deriving Commonsense Inference Tasks from Interactive Fictions | Mo Yu, Xiaoxiao Guo, Yufei Feng, Xiao-Dan Zhu, Michael A. Greenspan, Murray Campbell | ReLM LRM | 16 2 0 | 19 Oct 2020
How Can We Accelerate Progress Towards Human-like Linguistic Generalization? | Tal Linzen | - | 216 188 0 | 03 May 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 294 6,927 0 | 20 Apr 2018