Toward Mechanistic Explanation of Deductive Reasoning in Language Models

Main: 12 Pages
10 Figures
Bibliography: 3 Pages
1 Table
Abstract
Recent large language models have demonstrated notable capabilities in solving problems that require logical reasoning; however, the internal mechanisms behind these capabilities remain largely unexplored. In this paper, we show that a small language model can solve a deductive reasoning task by learning the underlying rules rather than operating as a statistical learner. We then provide a low-level explanation of its internal representations and computational circuits. Our findings reveal that induction heads play a central role in implementing the rule completion and rule chaining steps of the logical inference that the task requires.
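To make the two steps named in the abstract concrete, the following is a toy sketch in plain Python, not the paper's model, task format, or analysis code. It mimics the pattern usually attributed to induction heads (find an earlier occurrence of the current token and copy what followed it) and applies it repeatedly to chain rules. The token layout (`premise -> conclusion ;`) and the function names `encode_rules`, `complete_rule`, and `chain_rules` are assumptions made purely for illustration.

```python
# Toy illustration only: a symbolic stand-in for induction-style lookup.
# The "->" / ";" token layout and all function names are hypothetical.

ARROW, SEP = "->", ";"


def encode_rules(rules):
    """Flatten (premise, conclusion) pairs into a token list: p -> c ; ..."""
    tokens = []
    for premise, conclusion in rules:
        tokens += [premise, ARROW, conclusion, SEP]
    return tokens


def complete_rule(context, premise):
    """Rule completion: find `premise` followed by ARROW earlier in the
    context and copy the token after the arrow. Returns None if no rule
    with that premise exists."""
    # Scan from the most recent position back to the start of the context.
    for i in range(len(context) - 3, -1, -1):
        if context[i] == premise and context[i + 1] == ARROW:
            return context[i + 2]
    return None


def chain_rules(context, start, max_steps=10):
    """Rule chaining: repeatedly feed each derived conclusion back in as
    the next premise until no rule applies or a fact repeats."""
    derived = [start]
    current = start
    for _ in range(max_steps):
        nxt = complete_rule(context, current)
        if nxt is None or nxt in derived:
            break
        derived.append(nxt)
        current = nxt
    return derived


if __name__ == "__main__":
    context = encode_rules([("A", "B"), ("C", "D"), ("B", "C")])
    print(chain_rules(context, "A"))
```

In this sketch the rules are deliberately given out of order, so each chaining step has to search the context rather than read the next rule in sequence. A real transformer would implement the lookup with attention over learned representations, which the sketch does not attempt to model.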
