Parametric Variational Linear Units (PVLUs) in Deep Convolutional Networks

Abstract

The Rectified Linear Unit (ReLU) is currently a state-of-the-art activation function in deep convolutional neural networks. To combat ReLU's dying neuron problem, we propose the Parametric Variational Linear Unit (PVLU), which adds a sinusoidal function with trainable coefficients to ReLU. Along with introducing nonlinearity and non-zero gradients across the entire real domain, PVLU allows for increased model generalization and robustness when implemented in the context of transfer learning. On a simple, non-transfer sequential CNN, PVLU yielded relative error decreases of 16.3% and 11.3% (without and with data augmentation, respectively) on CIFAR-10. PVLU is also tested on transfer learning problems. The VGG-16 and VGG-19 models experience relative error reductions of 9.5% and 10.7% on CIFAR-10, respectively, after the substitution of ReLU with PVLU. When training on Gaussian-filtered CIFAR-10 images, similar improvements are noted for the VGG models. Most notably, PVLU fine-tuning allows for relative error reductions at or above 10% on near state-of-the-art ResNet models for both CIFAR-10 and CIFAR-100.
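To illustrate the idea described in the abstract, the following is a minimal sketch of a PVLU-style activation, assuming the form f(x) = ReLU(x) + a·sin(b·x) with trainable scalars a and b. The exact parameterization (per-channel versus scalar coefficients, initial values, and the precise sinusoidal term) is an assumption for illustration, not the paper's definitive formulation.

```python
import torch
import torch.nn as nn


class PVLU(nn.Module):
    """Sketch of a Parametric Variational Linear Unit.

    Assumed form: f(x) = ReLU(x) + a * sin(b * x), where a and b are
    trainable scalar parameters (an assumption; the paper may use a
    different parameterization, e.g. per-channel coefficients).
    """

    def __init__(self, a_init: float = 0.1, b_init: float = 1.0):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(a_init))
        self.b = nn.Parameter(torch.tensor(b_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # ReLU term plus a trainable sinusoidal term; the sinusoid keeps
        # gradients non-zero for negative inputs, addressing dying neurons.
        return torch.relu(x) + self.a * torch.sin(self.b * x)


if __name__ == "__main__":
    act = PVLU()
    x = torch.randn(4, 3, 8, 8, requires_grad=True)
    y = act(x)
    y.sum().backward()
    # Gradients flow to the trainable coefficients as well as the input.
    print(y.shape, act.a.grad is not None, act.b.grad is not None)
```

In a transfer-learning setting, such a module could simply replace the ReLU activations of a pretrained network (e.g. VGG or ResNet) before fine-tuning, which is the substitution the abstract describes.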
