Complete Dictionary Recovery over the Sphere I: Overview and the Geometric Picture

Abstract

We consider the problem of recovering a complete (i.e., square and invertible) matrix $\mathbf A_0$ from $\mathbf Y \in \mathbb{R}^{n \times p}$ with $\mathbf Y = \mathbf A_0 \mathbf X_0$, provided $\mathbf X_0$ is sufficiently sparse. This recovery problem is central to the theoretical understanding of dictionary learning, which seeks a sparse representation for a collection of input signals and finds numerous applications in modern signal processing and machine learning. We give the first efficient algorithm that provably recovers $\mathbf A_0$ when $\mathbf X_0$ has $O(n)$ nonzeros per column, under a suitable probability model for $\mathbf X_0$. In contrast, prior results based on efficient algorithms either only guarantee recovery when $\mathbf X_0$ has $O(\sqrt{n})$ nonzeros per column, or require multiple rounds of SDP relaxation to work when $\mathbf X_0$ has $O(n^{1-\delta})$ nonzeros per column (for any constant $\delta \in (0, 1)$). Our algorithmic pipeline centers around solving a certain nonconvex optimization problem with a spherical constraint. In this paper, we provide a geometric characterization of the objective landscape. In particular, we show that the problem is highly structured: with high probability, (1) there are no "spurious" local minimizers; and (2) around all saddle points the objective has a negative directional curvature. This distinctive structure makes the problem amenable to efficient optimization algorithms. In a companion paper (arXiv:1511.04777), we design a second-order trust-region algorithm over the sphere that provably converges to a local minimizer from arbitrary initializations, despite the presence of saddle points.
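The generative model $\mathbf Y = \mathbf A_0 \mathbf X_0$ can be sketched in a few lines of NumPy. This is only an illustration, not the paper's algorithm: the parameter names (`n`, `p`, `theta`), the choice of a random orthogonal dictionary, and the i.i.d. Bernoulli–Gaussian sparsity model for $\mathbf X_0$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, theta = 10, 1000, 0.3  # dimension, number of samples, sparsity level (assumed)

# A0: a complete (square, invertible) dictionary; a random orthogonal
# matrix is one convenient way to generate such a matrix.
A0, _ = np.linalg.qr(rng.standard_normal((n, n)))

# X0: entries are nonzero with probability theta and standard Gaussian
# when nonzero, so each column carries roughly theta * n = O(n) nonzeros.
X0 = rng.standard_normal((n, p)) * (rng.random((n, p)) < theta)

# Observations: only Y is given to the recovery algorithm; the goal is
# to recover A0 (and hence X0 = A0^{-1} Y) up to sign and permutation.
Y = A0 @ X0
print(Y.shape)  # (10, 1000)
```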
