
Meta-Consolidation for Continual Learning

Neural Information Processing Systems (NeurIPS), 2020
Abstract

The ability to continuously learn and adapt to new tasks, without losing grasp of already acquired knowledge, is a hallmark of biological learning systems, one that current deep learning systems fall short of. In this work, we present a novel methodology for continual learning called MERLIN: Meta-Consolidation for Continual Learning. We assume that the weights $\boldsymbol{\psi}$ of a neural network for solving task $\boldsymbol{t}$ come from a meta-distribution $p(\boldsymbol{\psi} \mid \boldsymbol{t})$. This meta-distribution is learned and consolidated incrementally. We operate in the challenging online continual learning setting, where a data point is seen by the model only once. Our experiments on the continual learning benchmarks MNIST, CIFAR-10, CIFAR-100 and Mini-ImageNet show consistent improvement over five baselines, including a recent state of the art, corroborating the promise of MERLIN.
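For intuition, here is a minimal sketch of the core idea the abstract describes: treating network weights as samples from a task-conditioned meta-distribution $p(\boldsymbol{\psi} \mid \boldsymbol{t})$. The particular parameterization below (a per-task Gaussian in a latent space, decoded into a flat weight vector, in the spirit of a VAE over parameters) and all class and variable names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a task-conditioned meta-distribution p(psi | t)
# over classifier weights: a Gaussian in latent space whose samples are
# decoded into a flat weight vector. Not the paper's actual code.

class WeightMetaDistribution(nn.Module):
    def __init__(self, num_tasks: int, latent_dim: int, weight_dim: int):
        super().__init__()
        # Per-task mean and log-variance of the latent Gaussian.
        self.mu = nn.Embedding(num_tasks, latent_dim)
        self.log_var = nn.Embedding(num_tasks, latent_dim)
        # Decoder maps a latent sample to a flat vector of network weights.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, weight_dim),
        )

    def sample_weights(self, task: torch.Tensor) -> torch.Tensor:
        mu, log_var = self.mu(task), self.log_var(task)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterize
        return self.decoder(z)  # one plausible weight vector psi ~ p(psi | t)

meta = WeightMetaDistribution(num_tasks=5, latent_dim=32, weight_dim=10 * 64)
psi = meta.sample_weights(torch.tensor([2]))   # sample weights for task t = 2
head_weights = psi.view(1, 10, 64)             # reshape for a 64-d, 10-class linear head
```

Under this reading, inference could draw several weight vectors for a task and ensemble their predictions, consistent with the abstract's framing of the meta-distribution as the object that is learned and consolidated across tasks.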
