
It Is Likely That Your Loss Should be a Likelihood

Mark Hamilton
Evan Shelhamer
William T. Freeman
Abstract

Many common loss functions such as mean-squared-error, cross-entropy, and reconstruction loss are unnecessarily rigid. Under a probabilistic interpretation, these common losses correspond to distributions with fixed shapes and scales. We instead argue for optimizing full likelihoods that include parameters like the normal variance and softmax temperature. Joint optimization of these "likelihood parameters" with model parameters can adaptively tune the scales and shapes of losses in addition to the strength of regularization. We explore and systematically evaluate how to parameterize and apply likelihood parameters for robust modeling, outlier-detection, and re-calibration. Additionally, we propose adaptively tuning L2 and L1 weights by fitting the scale parameters of normal and Laplace priors and introduce more flexible element-wise regularizers.
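To make the core idea concrete: mean-squared-error is the negative log-likelihood of a Gaussian with a fixed unit variance, so treating the variance as a free "likelihood parameter" and fitting it jointly lets the loss adapt its own scale. The sketch below (an illustration of this standard correspondence, not the authors' implementation) fits the Gaussian scale to residuals in closed form and checks that the fitted scale lowers the negative log-likelihood relative to the fixed unit scale implied by plain MSE.

```python
import math

def gaussian_nll(residuals, sigma):
    """Negative log-likelihood of residuals under N(0, sigma^2).

    With sigma fixed at 1, minimizing this over model parameters is
    equivalent to minimizing mean-squared-error (up to constants).
    """
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return (n * math.log(sigma)
            + sse / (2.0 * sigma ** 2)
            + 0.5 * n * math.log(2.0 * math.pi))

def fit_sigma(residuals):
    """Maximum-likelihood scale: sqrt of the mean squared residual."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Hypothetical residuals from some regression model.
residuals = [0.5, -1.2, 0.3, 2.0, -0.4]
sigma_hat = fit_sigma(residuals)

# The fitted scale minimizes the NLL, so the fixed unit scale of
# plain MSE can only do worse (or tie).
assert gaussian_nll(residuals, sigma_hat) <= gaussian_nll(residuals, 1.0)
```

In a training loop, `sigma` would instead be a learnable parameter updated by gradient descent alongside the model weights; the same pattern applies to a softmax temperature or to the scale of a normal or Laplace prior, which is how the paper's adaptive L2 and L1 weighting arises.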
