We show hardness of improperly learning halfspaces in the agnostic model, both in the distribution-independent and the distribution-specific setting, based on the assumption that worst-case lattice problems, such as GapSVP or SIVP, are hard. In particular, we show that under this assumption there is no efficient algorithm that outputs any binary hypothesis, not necessarily a halfspace, achieving misclassification error better than $\frac{1}{2} - \gamma$ even if the optimal misclassification error is as small as $\delta$. Here, $\gamma$ can be smaller than the inverse of any polynomial in the dimension and $\delta$ as small as $\exp(-\Omega(\log^{1-c}(d)))$, where $0 < c < 1$ is an arbitrary constant and $d$ is the dimension. For the distribution-specific setting, we show that if the marginal distribution is standard Gaussian, for any $\beta > 0$ learning halfspaces up to error $\mathrm{OPT}_{\mathsf{LTF}} + \epsilon$ takes time at least $d^{\tilde{\Omega}(1/\epsilon^{2-\beta})}$ under the same hardness assumptions. Similarly, we show that learning degree-$\ell$ polynomial threshold functions up to error $\mathrm{OPT}_{\mathsf{PTF}_\ell} + \epsilon$ takes time at least $d^{\tilde{\Omega}(\ell^{2-\beta}/\epsilon^{4-2\beta})}$. $\mathrm{OPT}_{\mathsf{LTF}}$ and $\mathrm{OPT}_{\mathsf{PTF}_\ell}$ denote the best error achievable by any halfspace or polynomial threshold function, respectively. Our lower bounds qualitatively match algorithmic guarantees and (nearly) recover known lower bounds based on non-worst-case assumptions. Previously, such hardness results [Daniely16, DKPZ21] were based on average-case complexity assumptions or restricted to the statistical query model. Our work gives the first hardness results basing these fundamental learning problems on worst-case complexity assumptions. It is inspired by a sequence of recent works showing hardness of learning well-separated Gaussian mixtures based on worst-case lattice problems.
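To make the objects in the abstract concrete, here is a minimal sketch (not from the paper) of a halfspace, i.e. a linear threshold function, and of agnostic misclassification error: labels are an adversarially/noisily corrupted halfspace, so the best achievable error $\mathrm{OPT}_{\mathsf{LTF}}$ is roughly the corruption rate. All names and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def halfspace(w, theta):
    """Linear threshold function (LTF): x -> sign(<w, x> - theta)."""
    return lambda X: np.sign(X @ w - theta)

def misclassification_error(h, X, y):
    """Fraction of labeled examples (x, y) with h(x) != y."""
    return np.mean(h(X) != y)

# Toy agnostic sample: standard Gaussian marginal (as in the
# distribution-specific setting), labels from a halfspace with
# ~5% of labels flipped, so OPT_LTF is approximately 0.05.
d, n = 5, 1000
w_star, theta_star = rng.standard_normal(d), 0.0
X = rng.standard_normal((n, d))
y = np.sign(X @ w_star - theta_star)
flip = rng.random(n) < 0.05
y[flip] *= -1

# Even the generating halfspace cannot beat the corruption rate;
# an agnostic learner aims for error close to OPT_LTF + epsilon.
h = halfspace(w_star, theta_star)
print(misclassification_error(h, X, y))
```

An improper learner, as in the first result, may output any binary hypothesis rather than a halfspace; the hardness result says even this freedom does not help beat error $\frac{1}{2} - \gamma$.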