
Eigen-Stratified Models

Optimization and Engineering (Optim. Eng.), 2020
Abstract

Stratified models depend in an arbitrary way on a selected categorical feature that takes $K$ values, and depend linearly on the other $n$ features. Laplacian regularization with respect to a graph on the feature values can greatly improve the performance of a stratified model, especially in the low-data regime. A significant issue with Laplacian-regularized stratified models is that the model is $K$ times the size of the base model, which can be quite large. We address this issue by formulating eigen-stratified models, which are stratified models with an additional constraint that the model parameters are linear combinations of some modest number $m$ of bottom eigenvectors of the graph Laplacian, i.e., those associated with the $m$ smallest eigenvalues. With eigen-stratified models, we only need to store the $m$ bottom eigenvectors and the corresponding coefficients as the stratified model parameters. This leads to a reduction, sometimes large, of model size when $m \leq n$ and $m \ll K$. In some cases, the additional regularization implicit in eigen-stratified models can improve out-of-sample performance over standard Laplacian-regularized stratified models.
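
The storage argument can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the path graph, the helper names `path_graph_laplacian` and `bottom_eigenvectors`, the sizes chosen for $K$, $n$, $m$, and the random coefficient matrix `Z` are all assumptions made for illustration. The sketch builds a graph Laplacian, takes its $m$ bottom eigenvectors, and forms the $K \times n$ parameter matrix as a linear combination of them, so that only the eigenvectors and coefficients need to be stored.

```python
import numpy as np

def path_graph_laplacian(K):
    """Laplacian of a path graph on K nodes (an assumed example graph)."""
    A = np.zeros((K, K))
    idx = np.arange(K - 1)
    A[idx, idx + 1] = 1.0  # edge between node i and node i+1
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def bottom_eigenvectors(L, m):
    """Eigenvectors for the m smallest eigenvalues of a symmetric matrix L."""
    # np.linalg.eigh returns eigenvalues in ascending order.
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :m]

# Illustrative sizes (assumptions): K categories, n base parameters, m eigenvectors.
K, n, m = 200, 10, 5

L = path_graph_laplacian(K)
Q = bottom_eigenvectors(L, m)        # K x m, orthonormal columns

# Hypothetical coefficients; in practice these would be fitted to data.
rng = np.random.default_rng(0)
Z = rng.standard_normal((m, n))      # m x n

theta = Q @ Z                        # K x n: one parameter vector per category

full_size = K * n                    # numbers stored by a plain stratified model
eigen_size = K * m + m * n           # eigenvectors plus coefficients
print(full_size, eigen_size)
```

With these sizes the eigen-stratified representation stores $Km + mn$ numbers instead of $Kn$, which is the reduction the abstract describes for $m \leq n$ and $m \ll K$.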
