Deduction under Perturbed Evidence: Probing Student Simulation
Capabilities of Large Language Models

Deduction under Perturbed Evidence: Probing Student Simulation Capabilities of Large Language Models

23 May 2023

Shashank Sonkar

Richard G. Baraniuk

Papers citing "Deduction under Perturbed Evidence: Probing Student Simulation Capabilities of Large Language Models"

5 / 5 papers shown

Title
Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions Naiming Liu Shashank Sonkar Zichao Wang Simon Woodhead Richard G. Baraniuk LRM AI4Ed 15 14 0 03 Oct 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4 Sébastien Bubeck Varun Chandrasekaran Ronen Eldan J. Gehrke Eric Horvitz ... Scott M. Lundberg Harsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang ELM AI4MH AI4CE ALM 248 2,232 0 22 Mar 2023
Language Models are Multilingual Chain-of-Thought Reasoners Freda Shi Mirac Suzgun Markus Freitag Xuezhi Wang Suraj Srivats ... Yi Tay Sebastian Ruder Denny Zhou Dipanjan Das Jason W. Wei ReLM LRM 170 324 0 06 Oct 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 303 11,881 0 04 Mar 2022
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies Mor Geva Daniel Khashabi Elad Segal Tushar Khot Dan Roth Jonathan Berant RALM 245 671 0 06 Jan 2021