2304.03439
Cited By
Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4
7 April 2023
Hanmeng Liu
Ruoxi Ning
Zhiyang Teng
Jian Liu
Qiji Zhou
Yue Zhang
ELM
ReLM
LRM
Papers citing
"Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4"
11 / 161 papers shown
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning
Liangming Pan
Alon Albalak
Xinyi Wang
William Yang Wang
ReLM
LRM
AI4CE
54
236
0
20 May 2023
LogiCoT: Logical Chain-of-Thought Instruction-Tuning
Hanmeng Liu
Zhiyang Teng
Leyang Cui
Chaoli Zhang
Qiji Zhou
Yue Zhang
LRM
30
24
0
20 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
49
83
0
19 May 2023
RECKONING: Reasoning through Dynamic Knowledge Encoding
Zeming Chen
Gail Weiss
E. Mitchell
Asli Celikyilmaz
Antoine Bosselut
KELM
LRM
35
11
0
10 May 2023
Humans are Still Better than ChatGPT: Case of the IEEEXtreme Competition
Anis Koubaa
B. Qureshi
Adel Ammar
Zahid Khan
W. Boulila
L. Ghouti
ELM
ALM
30
22
0
10 May 2023
Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks
Xianzhi Li
Samuel Chan
Xiaodan Zhu
Yulong Pei
Zhiqiang Ma
Xiaomo Liu
Sameena Shah
AI4MH
38
76
0
10 May 2023
Professional Certification Benchmark Dataset: The First 500 Jobs For Large Language Models
David Noever
Matt Ciolino
ELM
50
4
0
07 May 2023
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Emre Kıcıman
Robert Osazuwa Ness
Amit Sharma
Chenhao Tan
LRM
ELM
32
261
0
28 Apr 2023
ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time
Shangqing Tu
Chunyang Li
Jifan Yu
Xiaozhi Wang
Lei Hou
Juanzi Li
LLMAG
AI4MH
75
10
0
27 Apr 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,142
0
24 May 2022
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
6,996
0
20 Apr 2018