A faster and simpler algorithm for learning shallow networks

Abstract
We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this problem to run in $\mathrm{poly}(d, 1/\varepsilon)$ time when $k = O(1)$, where $\varepsilon$ is the target error. More precisely, their algorithm runs in time $(d/\varepsilon)^{\mathrm{quasipoly}(k)}$ and learns over multiple stages. Here we show that a much simpler one-stage version of their algorithm suffices, and moreover its runtime is only $(d/\varepsilon)^{O(k^2)}$.
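As a concrete illustration of the learning problem in the abstract, the following sketch generates labeled examples whose labels are a linear combination of ReLU activations over standard Gaussian inputs. All specific names and values here (the coefficients `a`, weight matrix `W`, and the choices of dimension and sample count) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 3, 1000  # ambient dimension, number of ReLUs, sample size (illustrative)

a = rng.normal(size=k)            # combination coefficients a_i (assumed form)
W = rng.normal(size=(k, d))       # weight vectors w_i, normalized to unit norm
W /= np.linalg.norm(W, axis=1, keepdims=True)

X = rng.normal(size=(n, d))       # examples x drawn from N(0, I_d)
# label y = sum_i a_i * ReLU(<w_i, x>)
y = (np.maximum(X @ W.T, 0.0) * a).sum(axis=1)
```

A learner in this setting observes the pairs `(X, y)` and must recover a network achieving small error under the same Gaussian measure.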