
Distilling Feedback into Memory-as-a-Tool

9 January 2026

Víctor Gallego


Main: 6 pages; 5 figures; 5 tables; bibliography: 1 page; appendix: 8 pages

Abstract

We propose a framework that amortizes the cost of inference-time reasoning by converting transient critiques into retrievable guidelines, through a file-based memory system and agent-controlled tool calls. We evaluate this method on the Rubric Feedback Bench, a novel dataset for rubric-based learning. Experiments demonstrate that our augmented LLMs rapidly match the performance of test-time refinement pipelines while drastically reducing inference cost.
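
To make the mechanism in the abstract concrete, here is a minimal sketch of a file-based memory exposed as two tool calls: one that stores a guideline distilled from a critique, and one that retrieves stored guidelines before a later attempt. It is an illustration under stated assumptions, not the paper's implementation: the class name `FileMemory`, the `write`/`read` methods, the one-JSON-file-per-topic layout, and the example topic and guidelines are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class FileMemory:
    """Hypothetical file-backed store: one JSON list of guidelines per topic."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, topic: str) -> Path:
        # Map the topic to a filename-safe name (assumed naming scheme).
        safe = "".join(c if c.isalnum() else "_" for c in topic.lower())
        return self.root / f"{safe}.json"

    def read(self, topic: str) -> list[str]:
        """Tool call: retrieve stored guidelines for a topic (empty if none)."""
        path = self._path(topic)
        if not path.exists():
            return []
        return json.loads(path.read_text(encoding="utf-8"))

    def write(self, topic: str, guideline: str) -> None:
        """Tool call: persist a guideline distilled from a critique."""
        items = self.read(topic)
        if guideline not in items:  # skip verbatim duplicates
            items.append(guideline)
        self._path(topic).write_text(json.dumps(items, indent=2), encoding="utf-8")


def demo() -> list[str]:
    """Store two made-up guidelines (one submitted twice) and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        memory = FileMemory(Path(tmp) / "memory")
        memory.write("product reviews", "State the rubric criterion before scoring it.")
        memory.write("product reviews", "State the rubric criterion before scoring it.")
        memory.write("product reviews", "Cite evidence from the text for each claim.")
        return memory.read("product reviews")


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In an agent loop, the model would decide when to call `write` (after receiving feedback) and `read` (before a new attempt). The amortization the abstract describes would come from reusing the stored guidelines instead of repeating a critique-and-refine cycle on every request.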
