Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs

Accurate uncertainty quantification for large language models (LLMs) provides a credibility measure for their outputs. However, fine-tuned LLMs often exhibit overconfidence in uncertain predictions, owing to their limited ability to generalize from scarce data. Existing parameter-efficient fine-tuning (PEFT) uncertainty quantification methods for LLMs focus on the post-fine-tuning stage and fall short of calibrating epistemic uncertainty. To address these limitations, we propose Functional-Level Uncertainty Quantification for Calibrated Fine-Tuning (UQ4CT), which captures and calibrates epistemic uncertainty over the space of functions that map input prompts to outputs. We implement UQ4CT during the fine-tuning stage via a mixture-of-experts framework that hierarchically decomposes the functional space. We demonstrate that UQ4CT reduces Expected Calibration Error (ECE) by more than 25% while maintaining high accuracy across benchmarks. Even under distribution shift, UQ4CT maintains superior ECE performance with high accuracy, showcasing improved generalizability.
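The headline metric above, Expected Calibration Error (ECE), measures the gap between a model's stated confidence and its empirical accuracy. For readers unfamiliar with it, below is a minimal sketch of the standard binned ECE computation; the function name, binning scheme, and toy inputs are illustrative assumptions, not code from the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |mean confidence - accuracy|
    over equal-width confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        # assign each prediction to the bin (lo, hi] by its confidence
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        avg_conf = confidences[in_bin].mean()  # mean stated confidence in bin
        avg_acc = correct[in_bin].mean()       # empirical accuracy in bin
        ece += (in_bin.sum() / n) * abs(avg_conf - avg_acc)
    return ece

# Toy example: an overconfident model reports ~0.9 confidence
# but is right only 3 times out of 5, so ECE is large.
conf = [0.95, 0.90, 0.92, 0.88, 0.97]
hits = [1, 0, 1, 0, 1]
print(expected_calibration_error(conf, hits))
```

A well-calibrated model drives this quantity toward zero: among predictions made with, say, 80% confidence, roughly 80% should be correct.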
@article{niu2025_2410.06431,
  title   = {Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs},
  author  = {Ruijia Niu and Dongxia Wu and Rose Yu and Yi-An Ma},
  journal = {arXiv preprint arXiv:2410.06431},
  year    = {2025}
}