Natural Language Edge Labelling: Decoupling Intent from Execution in Structured LM Reasoning

6 October 2025

Abhinav Madahar

ArXiv (abs)PDF HTML Github

Main:11 Pages

1 Figures

Bibliography:1 Pages

3 Tables

Abstract

Controllers for structured LM reasoning (e.g., Chain-of-Thought, self-consistency, and Tree-of-Thoughts) often entangle what to try next with how to execute it, exposing only coarse global knobs and yielding brittle, compute-inefficient, and hard-to-audit behavior. We introduce Natural Language Edge Labelling (NLEL), a labeller-tuner overlay that attaches a free-form natural-language directive to each search edge and translates it into a schema-bounded control vector for decoding, search (branch quotas, exploration $\beta$ ), generation bundle size, retrieval mixtures, and verification passes. A labeller $\Lambda$ emits labels from the parent state and a compact context; a tuner $\Psi$ maps $(P, L, C)\to \Pi$ , with strict schema validation and trust-region projection around safe defaults. Downstream selection remains ToT-style with score $S=\mu+\beta\sigma$ and depth-annealed $\beta$ . We show NLEL strictly generalizes CoT/ToT, prove an anytime-monotonicity property for top- $k$ selection under label-conditioned bundles, and bound selector shortfall by control-vector distortion, providing decision-relevant justification for guards like trust regions and verification passes. We instantiate $\Psi$ as a prompt-only JSON Parameter Emitter and preregister an evaluation on GSM8K, MATH (subset), StrategyQA, and ARC-Challenge with compute-aware reporting (success@compute, tokens-per-success) and ablations over $\Lambda$ , $\Psi$ , trust-region radius, and control quantization; preregistered forecasts anticipate accuracy gains at comparable token budgets and improved success@compute under constraints. NLEL offers an interpretable, model-agnostic interface that separates intent from execution for controllable, auditable LM inference.

View on arXiv

Comments on this paper