Data-dependent and Oracle Bounds on Forgetting in Continual Learning

Abstract
In continual learning, knowledge must be preserved and reused between tasks, maintaining good transfer to future tasks and minimizing forgetting of previously learned ones. While several practical algorithms have been devised for this setting, few theoretical works have aimed to quantify and bound the degree of forgetting in general settings. We provide both data-dependent and oracle upper bounds that apply regardless of model and algorithm choice, as well as bounds for Gibbs posteriors. We derive an algorithm based on our bounds and demonstrate empirically that our approach yields tight bounds on forgetting for several continual learning problems.