"What cannot be measured cannot be improved," while likely never uttered by Lord Kelvin, effectively summarizes the purpose of this work. This paper presents a detailed evaluation of automated metrics for structured 3D reconstructions. Pitfalls of each metric are discussed, and a thorough analysis through the lens of expert 3D modelers' preferences is presented. A set of systematic "unit tests" is proposed to empirically verify desirable properties, and context-aware recommendations are provided as to which metric to use depending on the application. Finally, a learned metric distilled from human expert judgments is proposed and analyzed.
@article{langerman2025_2503.08208,
  title={Explaining Human Preferences via Metrics for Structured 3D Reconstruction},
  author={Jack Langerman and Denys Rozumnyi and Yuzhong Huang and Dmytro Mishkin},
  journal={arXiv preprint arXiv:2503.08208},
  year={2025}
}