How to select an objective function using information theory

Abstract
In machine learning or scientific computing, model performance is measured with an objective function. But why choose one objective over another? Information theory gives one answer: To maximize the information in the model, select the most likely objective function or whichever represents the error in the fewest bits. To evaluate different objectives, transform them into likelihood functions. As likelihoods, their relative magnitudes represent how much we should prefer one objective versus another, and the log of their magnitude represents the expected uncertainty of the model.
View on arXivComments on this paper