A Study of Neural Training with Iterative Non-Gradient Methods

8 May 2020

Ramchandran Muthukumar

Abstract

In this work we demonstrate provable guarantees on the training of depth-2 neural networks in regimes not previously explored. (1) First, we give a simple stochastic algorithm that can train a ReLU gate in the realizable setting in linear time while using significantly milder conditions on the data distribution than previous results. Leveraging some additional distributional assumptions, we also show approximate recovery of the true label-generating parameters when training a ReLU gate while a probabilistic adversary is allowed to corrupt the true labels of the training data. Our guarantee on recovering the true weight degrades gracefully with increasing probability of attack, and it is nearly optimal in the worst case. Additionally, our analysis allows for mini-batching and computes how the convergence time scales with the mini-batch size. (2) Secondly, we exhibit a non-gradient iterative algorithm "Neuro-Tron" which gives a first-of-its-kind poly-time approximate solution of a neural regression problem (here in the infinity-norm) at finite net widths and for non-realizable data.
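To make the realizable ReLU-gate setting in (1) concrete, the following is a minimal sketch of one stochastic, mini-batched, non-gradient update. It is an illustration under stated assumptions, not the algorithm analysed in the paper: the abstract does not specify the update rule, and the function name `train_relu_gate`, the Gaussian data distribution, the step size `eta`, and the choice to gate the update on the observed label being positive are all assumptions made here. Labels are generated as y = max(0, w_star · x), with no adversarial corruption.

```python
import numpy as np

def train_relu_gate(w_star, steps=5000, batch_size=4, eta=0.05, seed=0):
    """Hypothetical sketch: recover w_star from labels y = max(0, w_star . x).

    The update is gated by the indicator 1[y > 0] on the observed label,
    not by 1[w . x > 0], so it is not the gradient of the squared loss.
    """
    rng = np.random.default_rng(seed)
    d = w_star.shape[0]
    w = np.zeros(d)  # start from the origin
    for _ in range(steps):
        # Assumed data distribution: standard Gaussian inputs.
        x = rng.standard_normal((batch_size, d))
        y = np.maximum(0.0, x @ w_star)      # realizable labels
        active = (y > 0).astype(float)       # gate on the observed label
        residual = y - x @ w                 # linear residual on each sample
        # Average the gated, residual-weighted inputs over the mini-batch.
        w += eta * (active * residual) @ x / batch_size
    return w

w_star = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
w_hat = train_relu_gate(w_star)
print(np.linalg.norm(w_hat - w_star))
```

On samples with y > 0 the label equals w_star · x, so each step moves w along x by an amount proportional to (w_star - w) · x. This makes the iteration a stochastic linear contraction towards w_star for the step size assumed here.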
