Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.15522
Cited By
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
23 April 2024
Mihir Parmar
Nisarg Patel
Neeraj Varshney
Mutsumi Nakamura
Man Luo
Santosh Mashetty
Arindam Mitra
Chitta Baral
LRM
ReLM
ELM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models"
23 / 23 papers shown
Title
TRAVELER: A Benchmark for Evaluating Temporal Reasoning across Vague, Implicit and Explicit References
Svenja Kenneweg
J. Deigmöller
Philipp Cimiano
Julian Eggert
40
0
0
02 May 2025
Reasoning Capabilities and Invariability of Large Language Models
Alessandro Raganato
Rafael Peñaloza
Marco Viviani
G. Pasi
ReLM
LRM
77
0
0
01 May 2025
Can Large Language Models Learn Formal Logic? A Data-Driven Training and Evaluation Framework
Yuan Xia
Akanksha Atrey
Fadoua Khmaissia
Kedar S. Namjoshi
LRM
ELM
37
0
0
28 Apr 2025
Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability
Daniel Hendriks
Philipp Spitzer
Niklas Kühl
G. Satzger
22
0
0
22 Apr 2025
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning
Atharva Pandey
Kshitij Dubey
Rahul Sharma
Amit Sharma
ReLM
ELM
LRM
52
0
0
09 Apr 2025
CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis
Anjiang Wei
Tarun Suresh
Jiannan Cao
Naveen Kannan
Yuheng Wu
Kai Yan
Thiago S. F. X. Teixeira
Ke Wang
Alex Aiken
ELM
LRM
34
0
0
29 Mar 2025
MastermindEval: A Simple But Scalable Reasoning Benchmark
Jonas Golde
Patrick Haller
Fabio Barth
Alan Akbik
LRM
ReLM
ELM
46
1
0
07 Mar 2025
BIG-Bench Extra Hard
Mehran Kazemi
Bahare Fatemi
Hritik Bansal
John Palowitch
Chrysovalantis Anastasiou
...
Kate Olszewska
Yi Tay
Vinh Q. Tran
Quoc V. Le
Orhan Firat
ELM
LRM
115
4
0
26 Feb 2025
InductionBench: LLMs Fail in the Simplest Complexity Class
Wenyue Hua
Tyler Wong
Sun Fei
Liangming Pan
Adam Jardine
William Yang Wang
LRM
63
2
0
20 Feb 2025
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie
Yangsibo Huang
Chiyuan Zhang
Da Yu
Xinyun Chen
Bill Yuchen Lin
Bo Li
Badih Ghazi
Ravi Kumar
LRM
41
20
0
30 Oct 2024
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Fangru Lin
Shaoguang Mao
Emanuele La Malfa
Valentin Hofmann
Adrian de Wynter
Jing Yao
Si-Qing Chen
Michael Wooldridge
Furu Wei
Furu Wei
40
2
0
14 Oct 2024
Reasoning Elicitation in Language Models via Counterfactual Feedback
Alihan Hüyük
Xinnuo Xu
Jacqueline Maasch
Aditya V. Nori
Javier González
ReLM
LRM
47
1
0
02 Oct 2024
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models
Man Luo
Shrinidhi Kumbhar
Ming shen
Mihir Parmar
Neeraj Varshney
Pratyay Banerjee
Somak Aditya
Chitta Baral
ReLM
ELM
LRM
37
23
0
02 Oct 2023
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Abulhair Saparov
He He
ELM
LRM
ReLM
116
270
0
03 Oct 2022
BioTABQA: Instruction Learning for Biomedical Table Question Answering
Man Luo
S. Saxena
Swaroop Mishra
Mihir Parmar
Chitta Baral
LMTD
135
15
0
06 Jul 2022
Is a Question Decomposition Unit All We Need?
Pruthvi H. Patel
Swaroop Mishra
Mihir Parmar
Chitta Baral
ReLM
130
50
0
25 May 2022
On the Paradox of Learning to Reason from Data
Honghua Zhang
Liunian Harold Li
Tao Meng
Kai-Wei Chang
Guy Van den Broeck
NAI
ReLM
OOD
LRM
132
102
0
23 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
RuleBert: Teaching Soft Rules to Pre-trained Language Models
Mohammed Saeed
N. Ahmadi
Preslav Nakov
Paolo Papotti
LRM
232
31
0
24 Sep 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
274
882
0
18 Apr 2021
Learning to Generate Task-Specific Adapters from Task Description
Qinyuan Ye
Xiang Ren
99
29
0
02 Jan 2021
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
248
1,382
0
21 Jan 2020
1