VolE: A Point-cloud Framework for Food 3D Reconstruction and Volume Estimation

Accurate food volume estimation is crucial for medical nutrition management and health monitoring applications, but current methods are often limited by their reliance on monocular data, single-purpose hardware such as 3D scanners, sensor-oriented information such as depth, or camera calibration with a reference object. In this paper, we present VolE, a novel framework that leverages mobile device-driven 3D reconstruction to estimate food volume. Using AR-capable mobile devices, VolE captures images and camera locations in free motion to generate precise 3D models. VolE is a reference- and depth-free framework: it achieves real-world measurement without a reference object or depth sensor, and it leverages food video segmentation to generate food masks. We also introduce a new food dataset encompassing challenging scenarios absent from previous benchmarks. Our experiments demonstrate that VolE outperforms existing volume estimation techniques across multiple datasets, achieving a mean absolute percentage error (MAPE) of 2.22%.
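For readers unfamiliar with the reported metric, the following is a minimal sketch of how MAPE is conventionally computed over a set of volume estimates. It is not taken from the VolE code; the function name `mape` and the sample volumes are hypothetical and chosen only for illustration.

```python
def mape(true_volumes, predicted_volumes):
    """Mean absolute percentage error, in percent.

    Standard definition: 100 * mean(|true - predicted| / |true|).
    Illustrative helper only; not part of the VolE framework.
    """
    if len(true_volumes) != len(predicted_volumes):
        raise ValueError("inputs must have the same length")
    if len(true_volumes) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for true_v, pred_v in zip(true_volumes, predicted_volumes):
        if true_v == 0:
            # The relative error is undefined for a zero ground-truth volume.
            raise ValueError("ground-truth volume must be non-zero")
        total += abs(true_v - pred_v) / abs(true_v)
    return 100.0 * total / len(true_volumes)


# Hypothetical ground-truth and estimated volumes in millilitres.
ground_truth = [100.0, 200.0]
estimated = [98.0, 205.0]
print(f"MAPE: {mape(ground_truth, estimated):.2f}%")
```

Because each error is normalized by the ground-truth volume, small and large food items contribute equally to the average, which is one reason the metric is common in volume estimation benchmarks.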
@article{haroon2025_2505.10205,
  title={VolE: A Point-cloud Framework for Food 3D Reconstruction and Volume Estimation},
  author={Umair Haroon and Ahmad AlMughrabi and Thanasis Zoumpekas and Ricardo Marques and Petia Radeva},
  journal={arXiv preprint arXiv:2505.10205},
  year={2025}
}