Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
Main: 9 pages; Bibliography: 6 pages; Appendix: 24 pages; 28 figures; 12 tables
Abstract
Chain-of-thought prompting (CoT) has the potential to improve the explainability of language model reasoning. But CoT can also systematically misrepresent the factors influencing models' behavior -- for example, rationalizing answers in line with a user's opinion.
