Collaborative LLM Numerical Reasoning with Local Data Protection

Numerical reasoning over documents, which demands both contextual understanding and logical inference, is challenging for low-capacity local models deployed on computation-constrained devices. Although such complex reasoning queries could be routed to powerful remote models like GPT-4, exposing local data raises significant leakage concerns. Existing mitigation methods generate problem descriptions or examples for remote assistance. However, the inherent complexity of numerical reasoning prevents the local model from generating logically equivalent queries and accurately inferring answers from remote guidance. In this paper, we present a model collaboration framework with two key innovations: (1) a context-aware synthesis strategy that shifts the query domain while preserving logical consistency; and (2) a tool-based answer reconstruction approach that reuses the remote-generated problem-solving pattern with code snippets. Experimental results demonstrate that our method achieves better reasoning accuracy than solely using local models while providing stronger data protection than fully relying on remote models. Furthermore, our method improves accuracy by 16.2%–43.6% while reducing data leakage by 2.3%–44.6% compared to existing data protection approaches.
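To make the two ideas concrete, the following is a minimal sketch of how tool-based answer reconstruction could work, under assumptions not spelled out in the abstract: the surrogate query, the `solve` function name, and the `reconstruct_answer` helper are all hypothetical illustrations, not the paper's actual implementation. The remote model sees only a domain-shifted surrogate problem and returns a reusable code snippet; the private values are substituted in on-device.

```python
# Surrogate query sent to the remote model (entities and numbers synthesized):
#   "A bakery sold 120 cakes in 2020 and 150 in 2021.
#    What is the percentage increase?"
# Private query kept locally (never transmitted):
#   "Revenue was $4.2M in 2022 and $5.1M in 2023.
#    What is the percentage increase?"

# A code snippet the remote model might return for the surrogate problem.
remote_snippet = """
def solve(old_value, new_value):
    # Percentage increase from old_value to new_value.
    return (new_value - old_value) / old_value * 100
"""

def reconstruct_answer(snippet: str, *private_args):
    """Execute the remote-generated solver locally on the private values."""
    namespace: dict = {}
    exec(snippet, namespace)            # load the problem-solving pattern
    return namespace["solve"](*private_args)

# Only the synthesized surrogate left the device; the private numbers
# (4.2 and 5.1) are plugged into the returned pattern locally.
answer = reconstruct_answer(remote_snippet, 4.2, 5.1)
print(round(answer, 1))  # prints 21.4
```

Because the logical structure of the surrogate matches the private query, the same code pattern yields the correct local answer without exposing the underlying document.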
@article{zhang2025_2504.00299,
  title={Collaborative LLM Numerical Reasoning with Local Data Protection},
  author={Min Zhang and Yuzhe Lu and Yun Zhou and Panpan Xu and Lin Lee Cheong and Chang-Tien Lu and Haozhu Wang},
  journal={arXiv preprint arXiv:2504.00299},
  year={2025}
}