Efficient active learning of sparse halfspaces with arbitrary bounded noise

We study active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ under label noise. Even in the presence of mild label noise this is a challenging problem, and only recently have label complexity bounds of the form $\tilde{O}\left(s \cdot \mathrm{polylog}(d, \frac{1}{\epsilon})\right)$ been established in \cite{zhang2018efficient} for computationally efficient algorithms under the broad class of isotropic log-concave distributions. In contrast, under high levels of label noise, the label complexity bounds achieved by computationally efficient algorithms are much worse. When the label noise satisfies the {\em Massart} condition \cite{massart2006risk}, i.e., each label is flipped with probability at most $\eta$ for a parameter $\eta \in [0, \frac{1}{2})$, the state-of-the-art result \cite{awasthi2016learning} provides a computationally efficient active learning algorithm under isotropic log-concave distributions, but it is label-efficient only when the noise rate $\eta$ is a constant. In this work, we substantially improve on it by designing a polynomial time algorithm for active learning of $s$-sparse halfspaces under bounded noise and isotropic log-concave distributions, with a label complexity of $\tilde{O}\left(\frac{s}{(1-2\eta)^4} \mathrm{polylog}(d, \frac{1}{\epsilon})\right)$. This is the first efficient algorithm with label complexity polynomial in $\frac{1}{1-2\eta}$ in this setting, which is label-efficient even for $\eta$ arbitrarily close to $\frac{1}{2}$. Our guarantees also immediately translate to new state-of-the-art label complexity results for full-dimensional active and passive halfspace learning under arbitrary bounded noise and isotropic log-concave distributions.