PRIMO: Private Regression in Multiple Outcomes

We introduce a new private regression setting we call Private Regression in Multiple Outcomes (PRIMO), inspired by the common situation where a data analyst wants to perform a set of l regressions while preserving privacy, where the features are shared across all l regressions and each regression has a different vector of outcomes. Naively applying existing private linear regression techniques l times leads to a multiplicative increase in error over the standard linear regression setting. We apply a variety of techniques, including sufficient statistics perturbation (SSP) and geometric projection-based methods, to develop scalable algorithms that outperform this baseline across a range of parameter regimes. In particular, we obtain no dependence on l in the asymptotic error when l is sufficiently large. Empirically, on the task of genomic risk prediction with multiple phenotypes, we find that even for values of l far smaller than the theory would predict, our projection-based method improves accuracy relative to the variant that does not use the projection.
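As a rough illustration of the sufficient statistics perturbation idea mentioned above, the sketch below adds Gaussian noise once to the shared Gram matrix of the features and once to the feature-outcome cross products, then solves all l regressions from those noisy statistics. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `ssp_multi_outcome` and the parameters `sigma_xx`, `sigma_xy`, and `ridge` are hypothetical, and the noise scales are left as inputs rather than calibrated to any privacy budget (a real calibration would need norm bounds on the rows of the data and a privacy accounting step).

```python
import numpy as np

def ssp_multi_outcome(X, Y, sigma_xx, sigma_xy, ridge=1e-3, rng=None):
    """Solve l linear regressions that share features X from noisy
    sufficient statistics.

    X: (n, d) feature matrix; Y: (n, l) matrix with one outcome per column.
    sigma_xx, sigma_xy: noise standard deviations (assumed inputs; they are
    NOT calibrated to a privacy guarantee in this sketch).
    Returns a (d, l) matrix with one coefficient vector per outcome.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    l = Y.shape[1]

    # Perturb X^T X once with symmetric Gaussian noise; the same noisy
    # matrix is reused for every outcome.
    E = rng.normal(scale=sigma_xx, size=(d, d))
    E = np.triu(E) + np.triu(E, 1).T
    noisy_xtx = X.T @ X + E

    # Perturb the cross products X^T Y, one column per outcome.
    noisy_xty = X.T @ Y + rng.normal(scale=sigma_xy, size=(d, l))

    # A small ridge term keeps the noisy system well conditioned.
    return np.linalg.solve(noisy_xtx + ridge * np.eye(d), noisy_xty)
```

With both noise scales and the ridge term set to zero, the function reduces to ordinary least squares on each outcome column, which is a convenient sanity check.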