Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks

Papers citing "Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks"

Title