CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning

We introduce CLEAR (Contrasting Textual Feedback with Experts and Amateurs for Reasoning), a novel approach to language model reasoning that leverages the strengths of a larger (expert) model and a smaller (amateur) model. The expert and amateur models each provide feedback on a model's initial output, and the two feedback signals are contrasted with each other to produce refined feedback. This refined feedback is then applied iteratively to improve CLEAR's responses. Our experiments demonstrate that CLEAR outperforms state-of-the-art methods on several challenging reasoning tasks, including story outline improvement (up to a 19.6% relative increase in interestingness), constrained generation (up to an 18.5% increase in coverage), mathematical reasoning (up to a 6.7% improvement in accuracy), and toxicity mitigation (a decrease of up to 22% in toxicity).
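The iterative contrast-and-refine loop described above can be sketched as follows. Note that `call_model`, `contrast_feedback`, and the refinement step are hypothetical stand-ins: the abstract does not specify the actual prompts, models, or contrast mechanism, so this is only a minimal sketch of the control flow under those assumptions.

```python
def call_model(text, role):
    # Hypothetical stand-in for an LLM API call that asks the expert or
    # amateur model to critique `text`; returns a canned reply so the
    # sketch runs without model access.
    return f"[{role} feedback on: {text[:30]}]"

def contrast_feedback(expert_fb, amateur_fb):
    # Hypothetical contrast step: weight the expert's critique while
    # discounting signals shared with the amateur, yielding refined feedback.
    return f"prefer({expert_fb}) against({amateur_fb})"

def clear_refine(initial_output, rounds=3):
    """Iteratively refine an output using contrasted expert/amateur feedback."""
    output = initial_output
    feedback_history = []
    for _ in range(rounds):
        expert_fb = call_model(output, "expert")
        amateur_fb = call_model(output, "amateur")
        refined_fb = contrast_feedback(expert_fb, amateur_fb)
        # Hypothetical refinement call: in practice the base model would
        # rewrite `output` conditioned on `refined_fb`.
        output = f"{output} [revised per {refined_fb}]"
        feedback_history.append(refined_fb)
    return output, feedback_history
```

A run of `clear_refine("draft outline", rounds=2)` would perform two contrast-and-refine passes, returning the revised output together with the refined feedback from each round.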
@article{rufail2025_2504.07116,
  title={CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning},
  author={Andrew Rufail and Daniel Kim and Sean O'Brien and Kevin Zhu},
  journal={arXiv preprint arXiv:2504.07116},
  year={2025}
}