A Pseudo Knockoff Filter for Correlated Features

In 2015, Barber and Candes introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and prove that this method achieves exact FDR control. Inspired by the work of Barber and Candes (2015), we propose and analyze a pseudo-knockoff filter that inherits some advantages of the original knockoff filter and has more flexibility in constructing its knockoff matrix. Although we have not been able to obtain exact FDR control of the pseudo knockoff filter, we show that it satisfies an expectation inequality that offers some insight into FDR control. Moreover, we provide some partial analysis of the pseudo knockoff filter for the half Lasso and the least squares statistics. Our analysis indicates that the inverse of the covariance matrix of the feature matrix plays an important role in designing and analyzing the pseudo knockoff filter. Our preliminary numerical experiments show that the pseudo knockoff filter with the half Lasso statistic has FDR control. Moreover, our numerical experiments show that the pseudo-knockoff filter could offer more power than the original knockoff filter with the OMP or Lasso Path statistic when the features are correlated and non-sparse.
View on arXiv