Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

Main: 13 Pages
17 Figures
Bibliography: 8 Pages
10 Tables
Appendix: 21 Pages

Abstract

This work addresses the question of why and when diffusion models, despite being designed for generative tasks, can excel at learning high-quality representations in a self-supervised manner. To this end, we develop a mathematical framework based on a low-dimensional data model and posterior estimation, revealing a fundamental trade-off between generation and representation quality near the final stage of image generation. Our analysis explains the unimodal representation dynamics across noise scales, which are mainly driven by the interplay between data denoising and class specification. Building on these insights, we propose an ensemble method that aggregates features across noise levels, significantly improving both clean performance and robustness under label noise. Extensive experiments on both synthetic and real-world datasets validate our findings.
