Multivariate Convex Regression at Scale
We present new large-scale algorithms for fitting a subgradient regularized multivariate convex regression function to $n$ samples in $d$ dimensions -- a key problem in shape constrained nonparametric regression with widespread applications in statistics, engineering, and the applied sciences. The infinite-dimensional learning task can be expressed via a convex quadratic program (QP) with $O(nd)$ decision variables and $O(n^2)$ constraints. While instances with $n$ in the lower thousands can be addressed with current algorithms within reasonable runtimes, solving larger problems (e.g., $n \approx 10^4$ or $10^5$) is computationally challenging. To this end, we present an active set type algorithm on the dual QP. For computational scalability, we perform approximate optimization of the reduced sub-problems and propose randomized augmentation rules for expanding the active set. Although the dual is not strongly convex, we establish a novel linear convergence rate for our algorithm on the dual. We demonstrate that our framework can approximately solve instances of the convex regression problem with $n = 10^5$ and $d = 10$ within minutes, and offers significant computational gains compared to earlier approaches.
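To make the problem setting concrete, the sketch below sets up the least-squares convex regression QP on a tiny instance and solves it by constraint generation: start from an empty set of convexity constraints, solve the reduced problem, and augment the set with a random subset of violated constraints. This is a simplified, primal-side analogue of an active-set scheme with randomized augmentation, not the paper's dual algorithm; all names and tolerances are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Toy sketch only (not the paper's dual algorithm): least-squares convex
# regression, f(X_i) ~ y_i, solved by constraint generation.  Variables are
# z = (theta, xi), the function values theta_i and subgradients xi_i at the
# sample points; convexity requires theta_j >= theta_i + xi_i . (X_j - X_i).
rng = np.random.default_rng(0)
n, d = 8, 1
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = (X ** 2).sum(axis=1) + 0.05 * rng.standard_normal(n)

def solve_reduced(active):
    """Minimize 0.5*||y - theta||^2 over z = (theta, xi), subject only to
    the convexity constraints indexed by the (i, j) pairs in `active`."""
    def obj(z):
        return 0.5 * np.sum((y - z[:n]) ** 2)
    cons = [{"type": "ineq",
             "fun": lambda z, i=i, j=j:
                 z[j] - z[i] - z[n + i * d: n + (i + 1) * d] @ (X[j] - X[i])}
            for (i, j) in active]
    z0 = np.concatenate([y, np.zeros(n * d)])
    return minimize(obj, z0, constraints=cons, method="SLSQP").x

def violated_pairs(z, tol=1e-5):
    """Pairs (i, j) whose convexity constraint is violated by more than tol."""
    theta, xi = z[:n], z[n:].reshape(n, d)
    return [(i, j) for i in range(n) for j in range(n) if i != j
            and theta[j] - theta[i] - xi[i] @ (X[j] - X[i]) < -tol]

active = set()
for _ in range(60):
    z = solve_reduced(sorted(active))
    new = [p for p in violated_pairs(z) if p not in active]
    if not new:
        break
    # randomized augmentation: grow the active set by a few sampled pairs
    picks = rng.choice(len(new), size=min(4, len(new)), replace=False)
    active.update(new[t] for t in picks)

theta_hat = z[:n]  # fitted convex-function values at the sample points
```

Even this toy loop shows why the full QP is hard at scale -- the constraint set grows as $O(n^2)$ -- and why working with small reduced sub-problems and expanding the active set incrementally pays off.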