Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020

2 May 2020

Xiang Ren

Papers citing "Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models"

50 / 109 papers shown

Beyond Plain Demos: A Demo-centric Anchoring Paradigm for In-Context Learning in Alzheimer's Disease Detection

133

10 Nov 2025

Retrieval-Constrained Decoding Reveals Underestimated Parametric Knowledge in Language Models

Fragkiskos D. Malliaros

KELM

188

27 Sep 2025

Intermediate Languages Matter: Formal Languages and LLMs affect Neurosymbolic Reasoning

International Conference on Semantic Systems (i-Semantics), 2025

211

04 Sep 2025

Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked?

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

257

21 Jul 2025

WinoWhat: A Parallel Corpus of Paraphrased WinoGrande Sentences with Common Sense Categorization

392

31 Mar 2025

Commonsense Reasoning in Arab Culture

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Abdelrahman Boda Sadallah

486

18 Feb 2025

Number Cookbook: Number Understanding of Language Models and How to Improve It

International Conference on Learning Representations (ICLR), 2024

608

06 Nov 2024

The Factuality of Large Language Models in the Legal Domain

International Conference on Information and Knowledge Management (CIKM), 2024

Rajaa El Hamdani

Thomas Bonald

Fragkiskos D. Malliaros

Nils Holzenberger

Fabian M. Suchanek

AILaw HILM

381

18 Sep 2024

Towards a Generative Approach for Emotion Detection and Reasoning

Ankita Bhaumik

T. Strzalkowski

ReLM LRM

259

09 Aug 2024

Development of Cognitive Intelligence in Pre-trained Language Models

Raj Sanjay Shah

Khushi Bhardwaj

Sashank Varma

481

01 Jul 2024

Paraphrase Types Elicit Prompt Engineering Capabilities

582

28 Jun 2024

RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models

Yuqing Wang

Yun Zhao

LRM AAML ELM

311

16 Jun 2024

Are LLMs classical or nonmonotonic reasoners? Lessons from generics

362

05 Jun 2024

NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models

Ancheng Xu

Minghuan Tan

Lei Wang

Min Yang

Ruifeng Xu

LRM

215

05 Jun 2024

Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships

269

30 Apr 2024

Exploring Internal Numeracy in Language Models: A Case Study on ALBERT

Ulme Wennberg

G. Henter

MILM

361

25 Apr 2024

IndoCulture: Exploring Geographically-Influenced Cultural Commonsense Reasoning Across Eleven Indonesian Provinces

Transactions of the Association for Computational Linguistics (TACL), 2024

407

02 Apr 2024

Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?

Xianpei Han

Yaojie Lu

326

22 Feb 2024

EvoGrad: A Dynamic Take on the Winograd Schema Challenge with Human Adversaries

Jing Han Sun

Ali Emami

396

20 Feb 2024

Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Spyridon Mouselinos

Henryk Michalewski

Mateusz Malinowski

LRM

251

06 Feb 2024

Temporal Blind Spots in Large Language Models

Web Search and Data Mining (WSDM), 2024

Jonas Wallat

Adam Jatowt

Avishek Anand

447

22 Jan 2024

In-context Learning with Retrieved Demonstrations for Language Models: A Survey

860

21 Jan 2024

Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models

Yuqing Wang

Yun Zhao

VLM ReLM LRM

378

29 Dec 2023

Enhancing Quantitative Reasoning Skills of Large Language Models through Dimension Perception

IEEE International Conference on Data Engineering (ICDE), 2023

Yuncheng Huang

Qi He

Jiaqing Liang

Sihang Jiang

Yanghua Xiao

Yunwen Chen

LRM

209

29 Dec 2023

CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models

250

20 Dec 2023

Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Mubashara Akhtar

Abhilash Shankarampeta

Vivek Gupta

287

03 Nov 2023

ROME: Evaluating Pre-trained Vision-Language Models on Reasoning beyond Visual Common Sense

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

254

30 Oct 2023

CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

342

23 Oct 2023

GeoLLM: Extracting Geospatial Knowledge from Large Language Models

International Conference on Learning Representations (ICLR), 2023

489

103

10 Oct 2023

Crystal: Introspective Reasoners Reinforced with Self-Feedback

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yejin Choi

297

07 Oct 2023

Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?

256

08 Sep 2023

TaskLAMA: Probing the Complex Task Understanding of Language Models

AAAI Conference on Artificial Intelligence (AAAI), 2023

258

29 Aug 2023

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

International Conference on Learning Representations (ICLR), 2023

819

285

22 May 2023

The Web Can Be Your Oyster for Improving Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

460

18 May 2023

Human Behavioral Benchmarking: Numeric Magnitude Comparison Effects in Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

414

18 May 2023

Completeness, Recall, and Negation in Open-World Knowledge Bases: A Survey

ACM Computing Surveys (ACM Comput. Surv.), 2023

270

09 May 2023

Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yejin Choi

341

05 May 2023

KitchenScale: Learning to predict ingredient quantities from recipe contexts

Expert Systems with Applications (ESWA), 2023

193

21 Apr 2023

ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models

International Conference on Language Resources and Evaluation (LREC), 2023

Xianpei Han

Yaojie Lu

372

29 Mar 2023

Language Model Behavior: A Comprehensive Survey

International Conference on Computational Logic (ICCL), 2023

Tyler A. Chang

Benjamin Bergen

VLM LRM LM&MA

536

157

20 Mar 2023

Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models

Applied Sciences (Appl. Sci.), 2023

Alberto Testolin

AIMat

278

14 Mar 2023

The Life Cycle of Knowledge in Big Language Models: A Survey

Machine Intelligence Research (MIR), 2023

Xianpei Han

306

14 Mar 2023

Class Cardinality Comparison as a Fermi Problem

The Web Conference (WWW), 2023

Tuan-Phong Nguyen

Simon Razniewski

Gerhard Weikum

164

08 Mar 2023

Complex QA and language models hybrid architectures, Survey

844

17 Feb 2023

Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

437

16 Feb 2023

Commonsense Reasoning for Conversational AI: A Survey of the State of the Art

Christopher Richardson

Larry Heck

LRM

304

15 Feb 2023

Benchmarks for Automated Commonsense Reasoning: A Survey

ACM Computing Surveys (ACM Comput. Surv.), 2023

E. Davis

ELM LRM

459

09 Feb 2023

Understanding Finetuning for Factual Knowledge Extraction from Language Models

287

26 Jan 2023

A Survey of Deep Learning for Mathematical Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Wenhao Yu

401

193

20 Dec 2022

Analogical Math Word Problems Solving with Enhanced Problem-Solution Association

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

230

01 Dec 2022