Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise

We give tight statistical query (SQ) lower bounds for learning halfspaces in the presence of Massart noise. In particular, suppose that all labels are corrupted with probability at most $\eta$. We show that for arbitrary $\eta \in [0, 1/2]$ every SQ algorithm achieving misclassification error better than $\eta$ requires queries of superpolynomial accuracy or at least a superpolynomial number of queries. Further, this continues to hold even if the information-theoretically optimal error $\mathrm{OPT}$ is as small as $\exp(-\log^c(d))$, where $d$ is the dimension and $0 < c < 1$ is an arbitrary absolute constant, and an overwhelming fraction of examples are noiseless. Our lower bound matches known polynomial-time algorithms, which are also implementable in the SQ framework. Previously, such lower bounds only ruled out algorithms achieving error $\mathrm{OPT} + \epsilon$, or error better than $\Omega(\eta)$, or, if $\eta$ is close to $1/2$, error $\eta - o_\eta(1)$, where the term $o_\eta(1)$ is constant in $d$ but goes to 0 as $\eta$ approaches $1/2$. As a consequence, we also show that achieving misclassification error better than $1/2$ in the $(A, \alpha)$-Tsybakov model is SQ-hard for $A$ constant and $\alpha$ bounded away from 1.
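For background, the two models referenced in the abstract are typically defined as follows; this is a sketch of the standard definitions (the symbols $w^\ast$, $\eta(x)$, $q$, $\tau$, $v_q$, and $D$ are the usual notation, not taken from the paper, whose exact formalization may differ in details). In the Massart (bounded) noise model, labels follow a halfspace but each label is flipped independently with an instance-dependent probability bounded by $\eta$:
\[
  y \;=\;
  \begin{cases}
    \operatorname{sign}(\langle w^\ast, x\rangle) & \text{with probability } 1-\eta(x),\\[2pt]
    -\operatorname{sign}(\langle w^\ast, x\rangle) & \text{with probability } \eta(x),
  \end{cases}
  \qquad \eta(x) \le \eta \le \tfrac{1}{2}.
\]
In the statistical query model, the learner does not see individual samples; instead it asks queries $q \colon \mathbb{R}^d \times \{\pm 1\} \to [-1,1]$ and receives an answer $v_q$ that is accurate up to an additive tolerance $\tau$:
\[
  \bigl|\, v_q - \mathbb{E}_{(x,y)\sim D}[\,q(x,y)\,] \,\bigr| \;\le\; \tau .
\]
The lower bound statement above says that any SQ algorithm beating error $\eta$ must either use superpolynomially small tolerance $\tau$ or superpolynomially many such queries.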