Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought

Main: 9 pages; Bibliography: 6 pages; Appendix: 24 pages; 28 figures; 12 tables
Abstract

Chain-of-thought (CoT) prompting has the potential to improve the explainability of language model reasoning. However, CoT can also systematically misrepresent the factors influencing a model's behavior, for example by rationalizing answers in line with a user's opinion.
