Threshold Phenomena in Learning Halfspaces with Massart Noise

Abstract

We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals. In the Massart noise model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with probability $\eta(\mathbf{x}) \leq \eta$, for some parameter $\eta \in [0,1/2]$. The goal of the learner is to output a hypothesis with misclassification error $\mathrm{opt} + \epsilon$, where $\mathrm{opt}$ is the error of the target halfspace. Prior work studied this problem assuming that the target halfspace is homogeneous and that the parameter $\eta$ is strictly smaller than $1/2$. We explore how the complexity of the problem changes when either of these assumptions is removed, establishing the following threshold phenomena: For $\eta = 1/2$, we prove a lower bound of $d^{\Omega(\log(1/\epsilon))}$ on the complexity of any Statistical Query (SQ) algorithm for the problem, which holds even for homogeneous halfspaces. On the positive side, we give a new learning algorithm for arbitrary halfspaces in this regime with sample complexity and running time $O_\epsilon(1)\, d^{O(\log(1/\epsilon))}$. For $\eta < 1/2$, we establish a lower bound of $d^{\Omega(\log(1/\gamma))}$ on the SQ complexity of the problem, where $\gamma = \max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\}\}$ and $f$ is the target halfspace. In particular, this implies an SQ lower bound of $d^{\Omega(\log(1/\epsilon))}$ for learning arbitrary Massart halfspaces (even for small constant $\eta$). We complement this lower bound with a new learning algorithm for this regime with sample complexity and runtime $d^{O_\eta(\log(1/\gamma))} \mathrm{poly}(1/\epsilon)$. Taken together, our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
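To make the learning setup concrete, the following is a minimal sketch (not taken from the paper) of how labeled examples are generated in this model: points drawn from a standard Gaussian, labeled by a halfspace $f(\mathbf{x}) = \mathrm{sign}(\langle \mathbf{w}, \mathbf{x}\rangle)$, with each label flipped independently with a point-dependent probability $\eta(\mathbf{x}) \leq \eta$. The particular choice of $\eta(\mathbf{x})$ below is hypothetical, used only to illustrate that the adversary may vary the flip rate per point within the bound $\eta$.

```python
import numpy as np

def sample_massart(n, d, w, eta, rng):
    """Draw n examples x ~ N(0, I_d) labeled by the halfspace
    f(x) = sign(<w, x>), with each label flipped independently
    with probability eta(x) <= eta (Massart noise)."""
    x = rng.standard_normal((n, d))
    clean = np.sign(x @ w)
    # The adversary may pick any eta(x) in [0, eta]; this particular
    # point-dependent choice is an illustrative assumption.
    eta_x = eta * np.abs(np.tanh(x @ w))   # always bounded by eta
    flips = rng.random(n) < eta_x
    labels = np.where(flips, -clean, clean)
    return x, labels

rng = np.random.default_rng(0)
w = np.zeros(16)
w[0] = 1.0                                 # a homogeneous target halfspace
x, y = sample_massart(10_000, 16, w, eta=0.3, rng=rng)
err_f = np.mean(y != np.sign(x @ w))       # empirical error of the target f
print(err_f)                               # bounded above by eta = 0.3
```

Since every flip probability is at most $\eta$, the target halfspace itself has error at most $\eta$ under this distribution, which is the quantity $\mathrm{opt}$ that the learner's error $\mathrm{opt} + \epsilon$ is measured against.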
