Layer Flexible Adaptive Computational Time for Recurrent Neural Networks

Abstract

Deep recurrent neural networks perform well on sequence data and are the model of choice for many sequence tasks. However, choosing the number of layers is difficult, especially because tasks within a sequence can vary in computational difficulty. We propose a layer-flexible recurrent neural network with adaptive computational time, and extend it to a sequence-to-sequence model. In contrast to the adaptive computational time model, our model maintains a dynamic number of transmission states, which varies by step and by sequence. We evaluate the model on a financial data set and on Wikipedia language modeling. Experimental results show performance improvements of 2% to 3% and demonstrate the model's ability to dynamically change its number of layers.
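To make the core idea concrete, here is a minimal, hypothetical sketch of an adaptive-computation-time step in the style of Graves' ACT, where a recurrent cell applies a variable number of internal layers per input step until the accumulated halting probability reaches 1 − ε. All names, dimensions, and weights below are illustrative toy assumptions, not the paper's actual architecture or trained parameters.

```python
import math
import random

random.seed(0)
hidden = 4

# Hypothetical toy parameters (illustrative only, not the paper's weights).
W_h = [[random.gauss(0, 0.3) for _ in range(hidden)] for _ in range(hidden)]
W_x = [[random.gauss(0, 0.3) for _ in range(hidden)] for _ in range(hidden)]
w_halt = [random.gauss(0, 0.3) for _ in range(hidden)]
b_halt = -1.0  # negative bias nudges the halting unit toward extra layers

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def act_step(x, s, eps=0.01, max_layers=10):
    """One input step with a variable number of internal layers.

    Layers are applied until the accumulated halting probability
    reaches 1 - eps (ACT-style); the returned state is the
    halting-weighted mean of the intermediate states.
    """
    states, weights, cum = [], [], 0.0
    for n in range(max_layers):
        s = [math.tanh(h + i) for h, i in zip(matvec(W_h, s), matvec(W_x, x))]
        p = sigmoid(sum(w * v for w, v in zip(w_halt, s)) + b_halt)
        if cum + p >= 1.0 - eps or n == max_layers - 1:
            weights.append(1.0 - cum)  # remainder, so weights sum to 1
            states.append(s)
            break
        weights.append(p)
        states.append(s)
        cum += p
    out = [sum(w * st[i] for w, st in zip(weights, states))
           for i in range(hidden)]
    return out, len(states)

s = [0.0] * hidden
counts = []
for t in range(3):
    x = [random.gauss(0, 1.0) for _ in range(hidden)]
    s, n = act_step(x, s)
    counts.append(n)
print("layers used per step:", counts)
```

Because the halting probability depends on the current state and input, the number of layers differs from step to step, which is the behavior the abstract highlights: harder steps can receive more computation than easier ones.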
