
Clarifying Before Reasoning: A Coq Prover with Structural Context

Yanzhen Lu, Hanbin Yang, Xiaodie Wang, Ge Zhang, Biao Li, Chenxu Fu, Chao Li, Yang Yuan, Andrew Chi-Chih Yao
Main: 18 pages · Figures: 1 · Tables: 8 · Bibliography: 3 pages · Appendix: 5 pages
Abstract

In this work, we investigate whether improving task clarity can enhance the reasoning ability of large language models, focusing on theorem proving in Coq. We introduce a concept-level metric to evaluate task clarity and show that adding structured semantic context to the standard input used by modern LLMs leads to a 1.85× improvement in clarity score (44.5% → 82.3%). Using the general-purpose model DeepSeek-V3, our approach yields a 2.1× improvement in proof success (21.8% → 45.8%) and outperforms the previous state-of-the-art Graph2Tac (33.2%). We evaluate on 1,386 theorems randomly sampled from 15 standard Coq packages, following the same evaluation protocol as Graph2Tac. Furthermore, fine-tuning smaller models on our structured data achieves even higher performance (48.6%). Our method uses selective concept unfolding to enrich task descriptions and employs a Planner–Executor architecture. These findings highlight the value of structured task representations in bridging the gap between understanding and reasoning.
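To make the idea of selective concept unfolding concrete, the sketch below shows one plausible way to enrich a theorem statement with the definitions of the concepts it mentions before handing it to a model. This is a minimal illustration, not the paper's implementation: the names `DEFINITIONS`, `UNFOLD_BUDGET`, `extract_concepts`, and `enrich_goal` are all hypothetical, and a real system would harvest definitions from the ambient Coq environment rather than a hand-written table.

```python
import re

# Hypothetical table mapping Coq identifiers to their definitions,
# e.g. as could be harvested from the environment with `Print`.
DEFINITIONS = {
    "double": "Definition double (n : nat) : nat := n + n.",
    "Nat.Even": "Definition Even (n : nat) := exists m, n = 2 * m.",
}

UNFOLD_BUDGET = 4  # assumed cap on how many concepts get unfolded


def extract_concepts(goal: str) -> list[str]:
    """Collect identifiers in the goal that have a known definition."""
    tokens = re.findall(r"[A-Za-z_][\w.']*", goal)
    return [t for t in dict.fromkeys(tokens) if t in DEFINITIONS]


def enrich_goal(goal: str) -> str:
    """Prepend definitions of referenced concepts to the goal text,
    so the prover sees unfolded semantics, not just opaque names."""
    concepts = extract_concepts(goal)[:UNFOLD_BUDGET]
    context = "\n".join(DEFINITIONS[c] for c in concepts)
    return f"(* Unfolded context *)\n{context}\n\n(* Goal *)\n{goal}"


print(enrich_goal("forall n : nat, Nat.Even (double n)"))
```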

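The Planner–Executor architecture named in the abstract could be organized roughly as below: a planner model drafts a high-level strategy from the enriched goal, and an executor model emits one tactic per step, each validated against a Coq checker before being committed. The `planner`, `executor`, and `coq_check` interfaces here are placeholders assumed for illustration, not the paper's actual APIs.

```python
from typing import Callable

# Assumed interfaces: an LLM is any prompt-to-text function; a Checker
# runs a tactic against the current proof state and returns
# (success, new_state).
LLM = Callable[[str], str]
Checker = Callable[[str, str], tuple[bool, str]]


def prove(goal: str, planner: LLM, executor: LLM, coq_check: Checker,
          max_steps: int = 16) -> list[str]:
    """Planner drafts a strategy once; executor proposes tactics that
    are checked in Coq before extending the proof."""
    plan = planner(f"Outline a proof strategy for:\n{goal}")
    state, proof = goal, []
    for _ in range(max_steps):
        tactic = executor(
            f"Plan:\n{plan}\n\nCurrent goal:\n{state}\n\nNext tactic:")
        ok, new_state = coq_check(state, tactic)
        if not ok:
            continue  # in practice: resample the executor or backtrack
        proof.append(tactic)
        state = new_state
        if state == "":  # empty state = no goals left (assumed convention)
            break
    return proof
```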
@article{lu2025_2507.02541,
  title={Clarifying Before Reasoning: A Coq Prover with Structural Context},
  author={Yanzhen Lu and Hanbin Yang and Xiaodie Wang and Ge Zhang and Biao Li and Chenxu Fu and Chao Li and Yang Yuan and Andrew Chi-Chih Yao},
  journal={arXiv preprint arXiv:2507.02541},
  year={2025}
}