Deconstructing Denoising Diffusion Models for Self-Supervised Learning

International Conference on Learning Representations (ICLR), 2024

25 January 2024


Abstract

In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM, gradually transforming it into a classical Denoising Autoencoder (DAE). This deconstructive procedure allows us to explore how various components of modern DDMs influence self-supervised representation learning. We observe that only a very few modern components are critical for learning good representations, while many others are nonessential. Our study ultimately arrives at an approach that is highly simplified and to a large extent resembles a classical DAE. We hope our study will rekindle interest in a family of classical methods within the realm of modern self-supervised learning.
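For readers unfamiliar with the classical DAE that the abstract refers to, the following is a minimal sketch of the general idea only: corrupt an input with additive Gaussian noise, then train a model to reconstruct the clean input. It is not the paper's architecture or training recipe. The function name `train_dae`, the tied-weight linear model with a single hidden unit, the toy 2-D data, and all hyperparameters are assumptions chosen for illustration.

```python
import random


def train_dae(data, noise_std=0.3, lr=0.01, epochs=30, seed=0):
    """Train a toy tied-weight linear denoising autoencoder with one hidden unit.

    Hypothetical illustration: encode z = w . noisy, decode recon = z * w,
    and minimise the squared error between recon and the CLEAN input.
    Returns the learned weights and the mean loss of each epoch.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    w = [rng.gauss(0.0, 0.1) for _ in range(dim)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            # Corrupt the input with additive Gaussian noise.
            noisy = [xi + noise_std * rng.gauss(0.0, 1.0) for xi in x]
            # Encode to a scalar code, then decode with the same weights.
            z = sum(wi * ni for wi, ni in zip(w, noisy))
            recon = [z * wi for wi in w]
            # The target is the clean input, not the noisy one.
            err = [ri - xi for ri, xi in zip(recon, x)]
            total += sum(e * e for e in err) / dim
            # Gradient of the mean squared error with respect to w.
            err_dot_w = sum(e * wi for e, wi in zip(err, w))
            grad = [2.0 / dim * (e * z + err_dot_w * ni)
                    for e, ni in zip(err, noisy)]
            w = [wi - lr * gi for wi, gi in zip(w, grad)]
        history.append(total / len(data))
    return w, history


# Toy data lying along one direction in 2-D (an assumption for the demo).
demo_rng = random.Random(1)
demo_data = [[t * 0.7071, t * 0.7071]
             for t in (demo_rng.gauss(0.0, 1.0) for _ in range(100))]
demo_w, demo_history = train_dae(demo_data)
print("first epoch loss:", demo_history[0])
print("last epoch loss:", demo_history[-1])
```

The noisy input goes into the model while the clean input serves as the target. This is the property that distinguishes a denoising autoencoder from a plain autoencoder.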
