Fast DPP Sampling for Nyström with Application to Kernel Methods

The Nyström method has long been popular for scaling up kernel methods. However, its success depends crucially on the choice of landmarks. We consider landmark selection using a Determinantal Point Process (DPP), which tractably selects a diverse subset of the columns of an input kernel matrix. We prove that landmarks selected by DPP sampling enjoy guaranteed error bounds, and we illustrate the impact of DPP-sampled landmarks on kernel ridge regression. Moreover, we show how to sample from a DPP efficiently, in linear time, using a Markov chain that mixes fast under certain constraints, which makes the overall procedure practical. Empirical results support our theoretical analysis: DPP-based landmark selection outperforms existing approaches.
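To make the pipeline in the abstract concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes an RBF kernel, a fixed landmark count `k`, and a plain Metropolis swap chain whose stationary distribution is proportional to det(K[S, S]); the chain analyzed in the paper may differ. The names `rbf_kernel`, `sample_kdpp_mcmc`, and `nystrom_approx` are hypothetical. Each step recomputes a k-by-k determinant for clarity, so the sketch does not reproduce the linear-time cost claimed above.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Dense RBF kernel matrix (an assumed kernel choice for illustration)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def _logdet(K, S):
    """log det of K[S, S], or -inf if the determinant is not positive."""
    sign, logdet = np.linalg.slogdet(K[np.ix_(S, S)])
    return logdet if sign > 0 else -np.inf


def sample_kdpp_mcmc(K, k, n_steps, rng):
    """Metropolis swap chain over size-k subsets, targeting det(K[S, S])."""
    n = K.shape[0]
    S = [int(i) for i in rng.choice(n, size=k, replace=False)]
    cur = _logdet(K, S)
    for _ in range(n_steps):
        pos = int(rng.integers(k))   # which landmark to swap out
        new = int(rng.integers(n))   # candidate to swap in
        if new in S:
            continue                 # proposal rejected, state unchanged
        T = list(S)
        T[pos] = new
        prop = _logdet(K, T)
        # Symmetric proposal, so accept with min(1, det(T) / det(S)).
        if prop > -np.inf and np.log(rng.random()) < prop - cur:
            S, cur = T, prop
    return np.array(sorted(S))


def nystrom_approx(K, S):
    """Nystrom approximation C W^+ C^T built from landmark columns S."""
    C = K[:, S]
    W = K[np.ix_(S, S)]
    return C @ np.linalg.pinv(W) @ C.T


# Usage on synthetic data (all sizes and parameters are illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
K = rbf_kernel(X, gamma=0.5)
landmarks = sample_kdpp_mcmc(K, k=8, n_steps=500, rng=rng)
K_hat = nystrom_approx(K, landmarks)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print("landmarks:", landmarks, "relative Frobenius error:", rel_err)
```

The swap proposal is symmetric, which is why the acceptance ratio reduces to a ratio of determinants. A practical implementation would update the determinant incrementally instead of recomputing it at every step.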