
Minimax Subsampling for Estimation and Prediction in Low-Dimensional Linear Regression

Abstract

Subsampling strategies are derived to sample a small portion of design (data) points in a low-dimensional linear regression model $y = X\beta + \varepsilon$ with near-optimal statistical rates. Our results apply both to estimating the underlying linear model $\beta$ and to predicting the real-valued response $y$ of a new data point $x$. The derived subsampling strategies are minimax optimal under the fixed design setting, up to a small $(1+\epsilon)$ relative factor. We also give interpretable subsampling probabilities for the random design setting and demonstrate explicit gaps in statistical rates between optimal and baseline (e.g., uniform) subsampling methods.
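To make the setting concrete, the following is a minimal sketch of the uniform baseline mentioned above: draw a small subset of design points, observe only their responses, and fit ordinary least squares on that subset. It is not the paper's minimax strategy. The function name `subsampled_ols`, the `probs` argument, and the synthetic data (dimensions, noise level, seeds) are all assumptions made for illustration.

```python
import numpy as np


def subsampled_ols(X, y, k, probs=None, rng=None):
    """Fit OLS on k rows drawn without replacement.

    probs=None gives uniform subsampling (the baseline); any other
    probability vector over the n rows is a hypothetical alternative.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    idx = rng.choice(n, size=k, replace=False, p=probs)
    # Least-squares fit using only the selected design points.
    beta_hat, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return beta_hat, idx


# Synthetic low-dimensional problem (illustrative values only).
data_rng = np.random.default_rng(0)
n, p, k = 2000, 5, 100
X = data_rng.normal(size=(n, p))
beta = np.arange(1, p + 1, dtype=float)
y = X @ beta + 0.1 * data_rng.normal(size=n)

beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)  # uses all n points
beta_sub, idx = subsampled_ols(X, y, k, rng=1)     # uses only k points

err_full = np.linalg.norm(beta_full - beta)
err_sub = np.linalg.norm(beta_sub - beta)
print(f"estimation error, full data:      {err_full:.4f}")
print(f"estimation error, uniform k={k}: {err_sub:.4f}")
```

Passing a non-uniform `probs` vector is the hook where a design-dependent sampling distribution would go; how to choose it is the subject of the paper and is not reproduced here.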
