ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.15269
  4. Cited By
Testing the General Deductive Reasoning Capacity of Large Language
  Models Using OOD Examples

Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples

24 May 2023
Abulhair Saparov
Richard Yuanzhe Pang
Vishakh Padmakumar
Nitish Joshi
Seyed Mehran Kazemi
Najoung Kim
He He
    ELM
    LRM
ArXivPDFHTML

Papers citing "Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples"

25 / 25 papers shown
Title
The Promise and Limits of LLMs in Constructing Proofs and Hints for Logic Problems in Intelligent Tutoring Systems
The Promise and Limits of LLMs in Constructing Proofs and Hints for Logic Problems in Intelligent Tutoring Systems
Sutapa Dey Tithi
Arun Kumar Ramesh
Clara DiMarco
Xiaoyi Tian
Nazia Alam
Kimia Fazeli
Tiffany Barnes
LRM
22
0
0
07 May 2025
Can Large Language Models Learn Formal Logic? A Data-Driven Training and Evaluation Framework
Can Large Language Models Learn Formal Logic? A Data-Driven Training and Evaluation Framework
Yuan Xia
Akanksha Atrey
Fadoua Khmaissia
Kedar S. Namjoshi
LRM
ELM
45
0
0
28 Apr 2025
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
Simeng Sun
Cheng-Ping Hsieh
Faisal Ladhak
Erik Arakelyan
Santiago Akle Serano
Boris Ginsburg
ReLM
ELM
LRM
59
0
0
28 Mar 2025
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models
Zhanke Zhou
Zhaocheng Zhu
Xuan Li
Mikhail Galkin
Xiao Feng
Sanmi Koyejo
Jian Tang
Bo Han
LRM
44
0
0
28 Mar 2025
BIG-Bench Extra Hard
BIG-Bench Extra Hard
Mehran Kazemi
Bahare Fatemi
Hritik Bansal
John Palowitch
Chrysovalantis Anastasiou
...
Kate Olszewska
Yi Tay
Vinh Q. Tran
Quoc V. Le
Orhan Firat
ELM
LRM
117
4
0
26 Feb 2025
Unveiling Reasoning Thresholds in Language Models: Scaling, Fine-Tuning, and Interpretability through Attention Maps
Unveiling Reasoning Thresholds in Language Models: Scaling, Fine-Tuning, and Interpretability through Attention Maps
Yen-Che Hsiao
Abhishek Dutta
LRM
ReLM
ELM
54
0
0
24 Feb 2025
Quantifying Logical Consistency in Transformers via Query-Key Alignment
Quantifying Logical Consistency in Transformers via Query-Key Alignment
Eduard Tulchinskii
Anastasia Voznyuk
Laida Kushnareva
Andrei Andriiainen
Irina Piontkovskaya
Evgeny Burnaev
Serguei Barannikov
LRM
63
0
0
24 Feb 2025
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal
Haruki Shirakami
Bernhard Schölkopf
Abulhair Saparov
Mrinmaya Sachan
LRM
54
1
0
17 Feb 2025
LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning
LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning
Tianshi Zheng
Jiayang Cheng
Chunyang Li
Haochen Shi
Z. Wang
Jiaxin Bai
Y. Song
Ginny Y. Wong
Simon See
LRM
46
2
0
16 Feb 2025
Logical forms complement probability in understanding language model (and human) performance
Logical forms complement probability in understanding language model (and human) performance
Yixuan Wang
Freda Shi
ReLM
LRM
71
2
0
13 Feb 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
78
4
0
31 Dec 2024
Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina
Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina
Yuan Gao
Dokyun Lee
Gordon Burtch
Sina Fazelpour
LRM
42
7
0
25 Oct 2024
Evaluating the Instruction-following Abilities of Language Models using Knowledge Tasks
Evaluating the Instruction-following Abilities of Language Models using Knowledge Tasks
Rudra Murthy
Prince Kumar
Praveen Venkateswaran
Danish Contractor
KELM
ALM
ELM
26
1
0
16 Oct 2024
P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant
  Human-Written Reasoning Chains
P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains
Simeng Han
Aaron Yu
Rui Shen
Zhenting Qi
Martin Riddell
...
Yingbo Zhou
Caiming Xiong
Dragomir R. Radev
Rex Ying
Arman Cohan
LRM
33
2
0
11 Oct 2024
Teaching Transformers Causal Reasoning through Axiomatic Training
Teaching Transformers Causal Reasoning through Axiomatic Training
Aniket Vashishtha
Abhinav Kumar
Abbavaram Gowtham Reddy
Vineeth N. Balasubramanian
Amit Sharma
Vineeth N Balasubramanian
Amit Sharma
28
2
0
10 Jul 2024
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Xiaoshuai Song
Muxi Diao
Guanting Dong
Zhengyang Wang
Yujia Fu
...
Yejie Wang
Zhuoma Gongque
Jianing Yu
Qiuna Tan
Weiran Xu
ELM
47
10
0
12 Jun 2024
Examining LLMs' Uncertainty Expression Towards Questions Outside
  Parametric Knowledge
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
Genglin Liu
Xingyao Wang
Lifan Yuan
Yangyi Chen
Hao Peng
27
15
0
16 Nov 2023
GLoRE: Evaluating Logical Reasoning of Large Language Models
GLoRE: Evaluating Logical Reasoning of Large Language Models
Hanmeng Liu
Zhiyang Teng
Ruoxi Ning
Jian Liu
Qiji Zhou
Yuexin Zhang
Yue Zhang
ReLM
ELM
LRM
57
6
0
13 Oct 2023
Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation
Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation
Jiatong Li
Rui Li
Qi Liu
21
14
0
08 Sep 2023
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of
  Chain-of-Thought
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Abulhair Saparov
He He
ELM
LRM
ReLM
116
270
0
03 Oct 2022
RobustLR: Evaluating Robustness to Logical Perturbation in Deductive
  Reasoning
RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning
Soumya Sanyal
Zeyi Liao
Xiang Ren
ELM
ReLM
LRM
59
19
0
25 May 2022
Large Language Models are Zero-Shot Reasoners
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022
On the Paradox of Learning to Reason from Data
On the Paradox of Learning to Reason from Data
Honghua Zhang
Liunian Harold Li
Tao Meng
Kai-Wei Chang
Guy Van den Broeck
NAI
ReLM
OOD
LRM
132
102
0
23 May 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
1