ReasoningWeekly: A General Knowledge and Verbal Reasoning Challenge for Large Language Models

3 February 2025
C. Anderson
Joydeep Biswas
Aleksander Boruch-Gruszecki
Federico Cassano
Molly Q. Feldman
Francesca Lucchetti
Zixuan Wu
ReLM, ELM, LRM
ArXiv (abs) | PDF | HTML | HuggingFace (10 upvotes)

Papers citing "ReasoningWeekly: A General Knowledge and Verbal Reasoning Challenge for Large Language Models"

16 / 16 papers shown
NP-Engine: Empowering Optimization Reasoning in Large Language Models with Verifiable Synthetic NP Problems
Xiaozhe Li
Xinyu Fang
Shengyuan Ding
Linyang Li
Haodong Duan
Qingwen Liu
Kai Chen
OffRL, LRM
72
0
0
18 Oct 2025
Lexical Hints of Accuracy in LLM Reasoning Chains
Arne Vanhoyweghen
Brecht Verbeken
Andres Algaba
Vincent Ginis
113
1
0
19 Aug 2025
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Jiangjie Chen
Qianyu He
Siyu Yuan
Aili Chen
Zhicheng Cai
...
Qiying Yu
Xuefeng Li
Jiaze Chen
Hao Zhou
Mingxuan Wang
ReLM, LRM
305
21
0
26 May 2025
When Reasoning Beats Scale: A 1.5B Reasoning Model Outranks 13B LLMs as Discriminator
Md Fahim Anjum
LRM
294
1
0
30 Apr 2025
The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer
Marthe Ballon
Andres Algaba
Vincent Ginis
LRM, ReLM
257
32
0
24 Feb 2025
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
International Conference on Learning Representations (ICLR), 2024
Iman Mirzadeh
Keivan Alizadeh
Hooman Shahrokhi
Oncel Tuzel
Samy Bengio
Mehrdad Farajtabar
AIMat, LRM
433
386
0
07 Oct 2024
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?
Nemika Tyagi
Mihir Parmar
Mohith Kulkarni
Aswin Rrv
Nisarg Patel
Mutsumi Nakamura
Arindam Mitra
Chitta Baral
LRM
222
18
0
20 Jul 2024
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
Bowen Jiang
Yangxinyu Xie
Zhuoqun Hao
Xiaomeng Wang
Tanwi Mallick
Weijie J. Su
Camillo J Taylor
Dan Roth
LRM
261
85
0
16 Jun 2024
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems
Qihuang Zhong
Kang Wang
Ziyang Xu
Juhua Liu
Liang Ding
Bo Du
LRM, AIMat
397
5
0
23 Apr 2024
Missed Connections: Lateral Thinking Puzzles for Large Language Models
Graham Todd
Timothy Merino
Sam Earle
Julian Togelius
ReLM, LRM
268
8
0
17 Apr 2024
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems
Bin Lei
LLMAG, AI4CE
118
34
0
06 Apr 2024
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
David Rein
Betty Li Hou
Asa Cooper Stickland
Jackson Petty
Richard Yuanzhe Pang
Julien Dirani
Julian Michael
Samuel R. Bowman
AI4MH, ELM
373
1,553
0
20 Nov 2023
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yifan Jiang
Filip Ilievski
Kaixin Ma
Zhivar Sourati
LRM, ReLM
269
14
0
08 Oct 2023
Solving and Generating NPR Sunday Puzzles with Large Language Models
International Conference on Computational Creativity (ICCC), 2023
Jin Zhao
Carolyn Jane Anderson
ReLM, LRM
102
3
0
21 Jun 2023
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM, OffRL, LRM
1.0K
6,547
0
27 Oct 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
Basel Alomair
Jacob Steinhardt
ReLM, FaML
795
3,729
0
05 Mar 2021