We study the use of linear regression for multiclass classification in the over-parametrized regime where some of the training data is mislabeled. In such scenarios it is necessary to add an explicit regularization term, $\lambda f(w)$, for some convex function $f(\cdot)$, to avoid overfitting the mislabeled data. In our analysis, we assume that the data is sampled from a Gaussian Mixture Model with equal class sizes, and that a proportion of the training labels is corrupted for each class. Under these assumptions, we prove that the best classification performance is achieved when $f(\cdot) = \|\cdot\|_2^2$ and $\lambda \to \infty$. We then proceed to analyze the classification errors for $f(\cdot) = \|\cdot\|_1$ and $f(\cdot) = \|\cdot\|_\infty$ in the large $\lambda$ regime and notice that it is often possible to find sparse and one-bit solutions, respectively, that perform almost as well as the one corresponding to $f(\cdot) = \|\cdot\|_2^2$.
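The setting can be illustrated with a minimal sketch (not the paper's exact setup or analysis): $\ell_2$-regularized linear regression on one-hot targets for a Gaussian mixture in the over-parametrized regime ($d > n$), with a fraction of training labels flipped. The class means, noise level, dimensions, corruption rate, and $\lambda$ values below are all illustrative assumptions.

```python
# Sketch: ridge-regularized linear regression for multiclass classification
# on a Gaussian Mixture Model with partially mislabeled training data.
# All constants here are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
k, d, n_per = 3, 200, 40          # classes, dimension, train samples per class (d > k*n_per)
means = rng.normal(0.0, 1.0, (k, d))  # random class means

def sample(n_per, corrupt=0.0):
    """Draw n_per points per class; flip a `corrupt` fraction of labels."""
    X = np.vstack([means[c] + 0.5 * rng.normal(size=(n_per, d)) for c in range(k)])
    y = np.repeat(np.arange(k), n_per)
    flip = rng.random(y.size) < corrupt
    y[flip] = (y[flip] + rng.integers(1, k, flip.sum())) % k  # move to a different class
    return X, y

Xtr, ytr = sample(n_per, corrupt=0.2)
Xte, yte = sample(200, corrupt=0.0)
Y = np.eye(k)[ytr]                # one-hot regression targets

def fit_ridge(lam):
    # W = (X^T X + lam I)^{-1} X^T Y : l2-regularized least squares
    return np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(d), Xtr.T @ Y)

for lam in (1e-6, 1e2):
    W = fit_ridge(lam)
    acc = ((Xte @ W).argmax(axis=1) == yte).mean()
    print(f"lambda={lam:g}: test accuracy {acc:.2f}")
```

Classification is by the largest of the $k$ regression outputs; sweeping `lam` upward mimics the large-$\lambda$ regime discussed in the abstract.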