
Universal Robust Regression via Maximum Mean Discrepancy

Biometrika, 2020
Abstract

Many datasets are collected automatically and are thus easily contaminated by outliers. To overcome this issue, there has recently been renewed interest in robust estimation. However, most robust estimation methods are designed for specific models. In regression, methods have notably been developed for estimating the regression coefficients in generalized linear models, while other approaches have been proposed, e.g., for robust inference in beta regression or in sample selection models. In this paper, we propose Maximum Mean Discrepancy optimization as a universal framework for robust regression. We prove non-asymptotic error bounds, showing that our estimators are robust to Huber-type contamination. We also provide a (stochastic) gradient algorithm for computing these estimators, whose implementation requires only the ability to sample from the model and to compute the gradient of its log-likelihood function. Finally, we illustrate the proposed approach with a set of simulations.
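To make the last point concrete, here is a minimal sketch of how a stochastic gradient step can be built from only model samples and the gradient of the log-likelihood. It is an illustration under stated assumptions, not necessarily the algorithm of the paper: the model is assumed to be Gaussian linear regression with unit noise variance, the kernel is a Gaussian kernel on the responses only, and the criterion is a simplified one that compares, for each observed covariate, the model's conditional distribution with the observed response. The names `gaussian_kernel` and `mmd_sgd_linear`, the step-size schedule, and the simulated data are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel on scalar responses (an assumed choice of kernel)."""
    return np.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_sgd_linear(X, y, n_steps=3000, batch=64, lr=0.5, bandwidth=1.0, seed=0):
    """Stochastic gradient descent on a simplified MMD-type criterion.

    Assumed model: y | x ~ N(x @ theta, 1). For each observation the criterion is
    E[k(Y, Y')] - 2 E[k(Y, y_i)] with Y, Y' drawn independently from the model.
    Its gradient is 2 E[(k(Y, Y') - k(Y, y_i)) * score(Y)], where score is the
    gradient of the log-likelihood, so only sampling and the score are needed.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for t in range(n_steps):
        idx = rng.integers(0, n, size=batch)       # random mini-batch of observations
        xb, yb = X[idx], y[idx]
        mean = xb @ theta
        y1 = mean + rng.standard_normal(batch)     # first sample from the model
        y2 = mean + rng.standard_normal(batch)     # independent second sample
        weight = gaussian_kernel(y1, y2, bandwidth) - gaussian_kernel(y1, yb, bandwidth)
        score = (y1 - mean)[:, None] * xb          # grad of log N(y1; x @ theta, 1) in theta
        grad = 2.0 * np.mean(weight[:, None] * score, axis=0)
        theta -= lr / np.sqrt(t + 1) * grad        # decreasing step size (assumed schedule)
    return theta


# Simulated data (hypothetical): intercept plus one covariate, 10% of responses corrupted.
rng = np.random.default_rng(1)
n = 1000
theta_true = np.array([1.0, -2.0])
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = X @ theta_true + rng.standard_normal(n)
y[:100] = 20.0                                      # gross outliers in the response

theta_mmd = mmd_sgd_linear(X, y)
theta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # least squares, for comparison
err_mmd = np.linalg.norm(theta_mmd - theta_true)
err_ols = np.linalg.norm(theta_ols - theta_true)
```

In this setup the bounded kernel gives the corrupted responses almost no weight, whereas least squares is pulled toward them, so comparing `err_mmd` with `err_ols` gives a quick feel for the robustness claim. For another model, only the sampling lines and the `score` line would need to change.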
