
Learning General Halfspaces with General Massart Noise under the Gaussian Distribution

Abstract

We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with unknown probability $\eta(\mathbf{x}) \leq \eta$, for some parameter $\eta \in [0,1/2]$. The goal is to find a hypothesis with misclassification error of $\mathrm{OPT} + \epsilon$, where $\mathrm{OPT}$ is the error of the target halfspace. This problem had been previously studied under two assumptions: (i) the target halfspace is homogeneous (i.e., the separating hyperplane goes through the origin), and (ii) the parameter $\eta$ is strictly smaller than $1/2$. Prior to this work, no nontrivial bounds were known when either of these assumptions is removed. We study the general problem and establish the following: For $\eta < 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $d^{O_{\eta}(\log(1/\gamma))}\mathrm{poly}(1/\epsilon)$, where $\gamma = \max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\}\}$ is the bias of the target halfspace $f$. Prior efficient algorithms could only handle the special case of $\gamma = 1/2$. Interestingly, we establish a qualitatively matching lower bound of $d^{\Omega(\log(1/\gamma))}$ on the complexity of any Statistical Query (SQ) algorithm. For $\eta = 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $O_\epsilon(1)\, d^{O(\log(1/\epsilon))}$. This result is new even for the subclass of homogeneous halfspaces; prior algorithms for homogeneous Massart halfspaces provide vacuous guarantees for $\eta = 1/2$. We complement our upper bound with a nearly-matching SQ lower bound of $d^{\Omega(\log(1/\epsilon))}$, which holds even for the special case of homogeneous halfspaces.
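To make the setup concrete, the following is a minimal simulation sketch of the data model described in the abstract; it is not the paper's learning algorithm. It assumes the target halfspace has the form $f(\mathbf{x}) = \mathrm{sign}(\langle \mathbf{w}, \mathbf{x} \rangle - t)$, with $t = 0$ giving the homogeneous case. The names `sample_massart`, `empirical_bias`, `empirical_opt`, and `flip_prob` are hypothetical and chosen only for illustration.

```python
import random


def sample_massart(n, w, t, eta, flip_prob, rng):
    """Draw n points x ~ N(0, I) labeled by an assumed halfspace
    f(x) = sign(<w, x> - t), then corrupt each label with Massart noise.

    flip_prob(x) plays the role of the adversary's eta(x); it is clipped
    to [0, eta] so that the Massart constraint eta(x) <= eta holds.
    Returns a list of (x, clean_label, noisy_label) triples.
    """
    samples = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w]
        clean = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= t else -1
        eta_x = min(max(flip_prob(x), 0.0), eta)
        noisy = -clean if rng.random() < eta_x else clean
        samples.append((x, clean, noisy))
    return samples


def empirical_bias(samples, eps):
    """Estimate gamma = max(eps, min(Pr[f(x) = 1], Pr[f(x) = -1]))."""
    p_pos = sum(1 for _, clean, _ in samples if clean == 1) / len(samples)
    return max(eps, min(p_pos, 1.0 - p_pos))


def empirical_opt(samples):
    """Estimate OPT, the misclassification error of the target halfspace,
    as the fraction of samples whose noisy label disagrees with f(x)."""
    return sum(1 for _, clean, noisy in samples if clean != noisy) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    # A non-homogeneous halfspace (t != 0) with constant flip probability.
    data = sample_massart(2000, [1.0, 0.0, 0.0], 1.0, 0.3, lambda x: 0.3, rng)
    print("estimated gamma:", empirical_bias(data, eps=0.01))
    print("estimated OPT:", empirical_opt(data))
```

In this sketch, moving the threshold `t` away from 0 shrinks the estimated bias $\gamma$, which is the quantity that appears in the exponent of the $\eta < 1/2$ bounds. Passing a non-constant `flip_prob` models an adversary that chooses a different flip probability for each point.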
