Optimal and computationally tractable lower bounds for logistic log-likelihoods
The logit transform is arguably the most widely-employed link function beyond linear settings. This transformation routinely appears in regression models for binary data and provides a central building-block in popular methods for both classification and regression. Its widespread use, combined with the lack of analytical solutions for the optimization of objective functions involving the logit transform, still motivates active research in computational statistics. Among the directions explored, a central one has focused on the design of tangent lower bounds for logistic log-likelihoods that can be tractably optimized, while providing a tight approximation of these log-likelihoods. This has led to the development of effective minorize-maximize (MM) algorithms for point estimation, and variational schemes for approximate Bayesian inference under several logit models. However, the overarching focus has been on tangent quadratic minorizers. In fact, it is still unclear whether tangent lower bounds sharper than quadratic ones can be derived without undermining the tractability of the resulting minorizer. This article addresses such a question through the design and study of a novel piece-wise quadratic lower bound that uniformly improves any tangent quadratic minorizer, including the sharpest ones, while admitting a direct interpretation in terms of the classical generalized lasso problem. As illustrated in realistic empirical studies, such a sharper bound not only improves the speed of convergence of common MM schemes for penalized maximum likelihood estimation, but also yields tractable variational Bayes (VB) approximations with higher accuracy relative to those obtained under popular quadratic bounds employed in VB.
View on arXiv