Non-parametric machine learning models, such as random forests and gradient boosted trees, are frequently used to estimate house prices due to their predictive accuracy, but a major drawback of such methods is their limited ability to quantify prediction uncertainty. Conformal prediction (CP) is a model-agnostic framework for constructing confidence sets around the predictions of machine learning models under minimal assumptions. However, due to the spatial dependencies observed in house prices, direct application of CP leads to confidence sets that are not calibrated everywhere, i.e., the confidence sets will be too large in certain geographical regions and too small in others. We survey various approaches to adjusting the CP confidence sets to account for this spatial dependence and demonstrate their performance on a data set from the housing market in Oslo, Norway. Our findings indicate that calibrating the confidence sets on a spatially weighted version of the non-conformity scores yields more consistent coverage across geographical regions. We also perform a simulation study on synthetically generated sale prices to empirically explore the performance of CP on housing market data under idealized conditions with known data-generating mechanisms.
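To make the idea of spatially weighted calibration concrete, the following is a minimal sketch of a split-conformal interval whose quantile is computed from distance-weighted non-conformity scores. It is an illustration under stated assumptions, not necessarily the procedure evaluated in the paper: the Gaussian kernel, the `bandwidth` parameter, the use of absolute residuals as scores, and all function and variable names are hypothetical choices made here.

```python
import numpy as np


def weighted_quantile(scores, weights, level):
    """Smallest score whose cumulative normalized weight reaches `level`."""
    order = np.argsort(scores)
    sorted_scores = scores[order]
    cum_weights = np.cumsum(weights[order]) / np.sum(weights)
    idx = np.searchsorted(cum_weights, level, side="left")
    return sorted_scores[min(idx, len(sorted_scores) - 1)]


def spatial_conformal_interval(pred, test_xy, cal_xy, cal_scores,
                               alpha=0.1, bandwidth=1.0):
    """Interval pred +/- q, where q is a spatially weighted score quantile.

    Assumption: calibration scores are absolute residuals |y - y_hat|, and
    calibration points are weighted by a Gaussian kernel of their distance
    to the test location.
    """
    sq_dist = np.sum((cal_xy - test_xy) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # The test point itself gets kernel weight 1 and a score of +inf,
    # the usual conservative correction in weighted conformal prediction.
    scores = np.append(cal_scores, np.inf)
    weights = np.append(weights, 1.0)
    q = weighted_quantile(scores, weights, 1.0 - alpha)
    return pred - q, pred + q


# Toy calibration set (synthetic, for illustration only): one region where
# the model's residuals are small and one region where they are large.
cal_xy = np.vstack([np.zeros((50, 2)), np.full((50, 2), 10.0)])
cal_scores = np.concatenate([np.linspace(0.1, 1.0, 50),
                             np.linspace(2.0, 5.0, 50)])

lo_a, hi_a = spatial_conformal_interval(100.0, np.array([0.0, 0.0]),
                                        cal_xy, cal_scores)
lo_b, hi_b = spatial_conformal_interval(100.0, np.array([10.0, 10.0]),
                                        cal_xy, cal_scores)
```

In this toy setup the interval at the low-error location is much narrower than the one at the high-error location, whereas an unweighted quantile over all calibration scores would give both locations the same width. The bandwidth controls how local the calibration is: a very large value approaches standard split CP, while a very small value leaves few effective calibration points and can produce unbounded intervals.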