Deep learning models can tease out information from complex inputs. The richer the inputs, the better these models usually perform. However, models that leverage rich inputs (e.g., multi-sensor, multi-modality, multi-view) can be difficult to deploy widely because some inputs may be missing at deployment time. Popular current solutions to this problem include marginalization, imputation, and training multiple models. Marginalization can yield calibrated predictions, but it is computationally costly and therefore feasible only for low-dimensional inputs. Imputation may yield miscalibrated predictions because it approximates predictions using point estimates, and it does not work for high-dimensional inputs (e.g., images). Training multiple models, where each model takes a different subset of the inputs, can work well but requires knowing the missing-input patterns in advance. Furthermore, training multiple models is costly when the models are built on top of foundation models. We propose an efficient way to learn both the conditional distribution given full inputs and the marginal distributions given partial inputs simultaneously, using a single model and input mask-out. Input mask-out ensures that learning the marginal distributions does not interfere with learning the conditional distribution. Our approach is general and can be applied to both low- and high-dimensional inputs. We evaluate mask-out in several simulations to show that it helps a single model efficiently learn both the conditional and marginal distributions. Experimental results on multiple real-world datasets in both classification and segmentation demonstrate the utility of mask-out.
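To make the idea of training one model on both full and partial inputs concrete, the following is a minimal sketch of generic random input masking in NumPy. It is not the paper's mask-out procedure, which the abstract does not specify: the function name `mask_out`, the `missing_prob` and `fill_value` parameters, and the choice to append the mask indicator to the input are all assumptions made for illustration.

```python
import numpy as np


def mask_out(x, missing_prob, rng, fill_value=0.0):
    """Randomly hide input features (hypothetical helper, not the paper's API).

    Returns (masked_x, mask), where mask[i, j] == 1 means feature j of
    example i is observed and 0 means it was hidden.
    """
    # Each feature is kept independently with probability 1 - missing_prob.
    mask = (rng.random(x.shape) >= missing_prob).astype(x.dtype)
    # Hidden entries are overwritten with a constant placeholder (assumption).
    masked_x = np.where(mask == 1, x, fill_value)
    return masked_x, mask


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))  # toy batch: 4 examples, 3 input features

# Full inputs: nothing is hidden, corresponding to the conditional distribution.
full_x, full_mask = mask_out(x, missing_prob=0.0, rng=rng)

# Partial inputs: some features hidden, corresponding to marginal distributions.
partial_x, partial_mask = mask_out(x, missing_prob=0.5, rng=rng)

# Assumed design choice: feed the mask alongside the masked values so that a
# single model can tell a hidden feature apart from a genuine zero.
model_input = np.concatenate([partial_x, partial_mask], axis=1)
```

In a training loop under these assumptions, each batch would be passed through the same model once with full inputs and once with masked inputs, so that one set of weights serves every missing-input pattern without enumerating the patterns in advance.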
