Infinite wide (finite depth) Neural Networks benefit from multi-task learning unlike shallow Gaussian Processes -- an exact quantitative macroscopic characterization

Abstract
We prove that wide ReLU neural networks (NNs) with at least one hidden layer, optimized with l2-regularization on the parameters, enforce multi-task learning through representation learning, even in the infinite-width limit. This is in contrast to several other idealized settings discussed in the literature, where wide (ReLU) NNs lose their ability to benefit from multi-task learning in the infinite-width limit. We deduce the multi-task learning ability by proving an exact quantitative macroscopic characterization of the learned NN in function space.