(Nearly) Optimal Private Linear Regression via Adaptive Clipping

We study the problem of differentially private linear regression where each data point is sampled from a fixed sub-Gaussian style distribution. We propose and analyze a one-pass mini-batch stochastic gradient descent method (DP-AMBSSGD) where points in each iteration are sampled without replacement. Noise is added for DP, but the noise standard deviation is estimated online. Compared to existing $(\epsilon, \delta)$-DP techniques, which have sub-optimal error bounds, DP-AMBSSGD provides nearly optimal error bounds in terms of key parameters such as the dimensionality $d$, the number of points $N$, and the standard deviation $\sigma$ of the noise in observations. For example, when the $d$-dimensional covariates are sampled i.i.d. from the normal distribution, the excess error of DP-AMBSSGD due to privacy is $\frac{\sigma^2 d}{N}\left(1+\frac{d}{\epsilon^2 N}\right)$, i.e., the error is meaningful when the number of samples is $N = \Omega(d \log d)$, which is the standard operative regime for linear regression. In contrast, the error bounds for existing efficient methods in this setting are $\mathcal{O}\left(\frac{d^3}{\epsilon^2 N^2}\right)$, even for $\sigma = 0$. That is, for constant $\epsilon$, the existing techniques require $N = \Omega(d\sqrt{d})$ to provide a non-trivial result.
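To make the ingredients named in the abstract concrete, the following is a minimal sketch of one-pass mini-batch SGD for least squares with per-sample gradient clipping and Gaussian noise. It is not the DP-AMBSSGD algorithm itself: the clipping threshold is a fixed input here rather than being estimated online, and no $(\epsilon, \delta)$ accounting is performed. The names `dp_minibatch_sgd`, `clip_norm`, and `noise_multiplier` are hypothetical and chosen only for illustration.

```python
import numpy as np


def dp_minibatch_sgd(X, y, batch_size, lr, clip_norm, noise_multiplier, seed=0):
    """One pass of noisy, clipped mini-batch SGD for least squares.

    Assumptions (not from the paper): clip_norm is fixed rather than
    adapted online, and the privacy level implied by noise_multiplier
    is not computed.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # Disjoint consecutive batches: every point is used at most once (one pass).
    for start in range(0, n - batch_size + 1, batch_size):
        Xb = X[start:start + batch_size]
        yb = y[start:start + batch_size]
        residual = Xb @ w - yb                      # shape (batch_size,)
        grads = residual[:, None] * Xb              # per-sample gradients of 0.5 * (x.w - y)^2
        norms = np.linalg.norm(grads, axis=1)
        # Rescale each per-sample gradient so its norm is at most clip_norm.
        scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        clipped = grads * scale[:, None]
        # Gaussian noise proportional to the clipping threshold.
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=d)
        w = w - lr * (clipped.sum(axis=0) + noise) / batch_size
    return w


# Usage on synthetic Gaussian covariates (all values are illustrative).
rng = np.random.default_rng(1)
d, N, sigma = 5, 20000, 0.1
w_true = np.ones(d)
X = rng.normal(size=(N, d))
y = X @ w_true + sigma * rng.normal(size=N)
w_hat = dp_minibatch_sgd(X, y, batch_size=100, lr=0.1,
                         clip_norm=5.0, noise_multiplier=1.0)
print("estimation error:", np.linalg.norm(w_hat - w_true))
```

In this sketch the noise scale is tied to `clip_norm`, so a threshold set too high adds unnecessary noise while one set too low biases the gradients; this trade-off is the motivation for estimating the threshold online rather than fixing it in advance.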