Guaranteeing Reproducibility in Deep Learning Competitions
Brandon Houghton
Stephanie Milani
Nicholay Topin
William H. Guss
Katja Hofmann
Diego Perez-Liebana
Manuela Veloso
Ruslan Salakhutdinov

Abstract
To encourage the development of methods with reproducible and robust training behavior, we propose a challenge paradigm in which competitors are evaluated directly on the performance of their learning procedures rather than on pre-trained agents. Because competition organizers re-train submitted methods in a controlled setting, they can guarantee reproducibility and, by re-training submissions on a held-out set of test environments, help ensure generalization beyond the environments on which the methods were developed.
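The sketch below illustrates one way such an organizer-side evaluation loop could look; it is not from the paper, and all names (`score_submission`, `train_procedure`, `evaluate_agent`, `held_out_env_ids`) are hypothetical placeholders chosen to mirror the re-training protocol described in the abstract.

```python
"""Minimal sketch, under assumed names, of a re-training evaluation loop:
organizers re-train each submitted learning procedure from scratch on
held-out environments and score the resulting agents, instead of scoring
an agent the competitor trained themselves."""
from typing import Callable, List


def score_submission(
    train_procedure: Callable[[str, int, int], object],  # competitor's training code
    evaluate_agent: Callable[[object, str], float],      # organizer-controlled rollout/scoring
    held_out_env_ids: List[str],                          # environments unseen during development
    seeds: List[int],                                     # fixed seeds for controlled re-training
    step_budget: int,                                     # per-run interaction/compute budget
) -> float:
    """Re-train the submission under controlled conditions and return the
    mean score over held-out environments and seeds."""
    scores = []
    for env_id in held_out_env_ids:
        for seed in seeds:
            # Re-training is performed by the organizers, so the reported
            # result is reproducible by construction.
            agent = train_procedure(env_id, seed, step_budget)
            scores.append(evaluate_agent(agent, env_id))
    return sum(scores) / len(scores)
```

Scoring the output of a fresh training run, rather than a submitted checkpoint, is what ties the leaderboard result to the learning procedure itself.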