Addressing degeneracies in latent interpolation for diffusion models

There is increasing interest in using image-generating diffusion models for deep data augmentation and image morphing. In this context, it is useful to interpolate between latents produced by inverting a set of input images, in order to generate new images representing some mixture of the inputs. We observe that such interpolation can easily lead to degenerate results when the number of inputs is large. We analyze the cause of this effect theoretically and experimentally, and suggest a suitable remedy. The suggested approach is a relatively simple normalization scheme that is easy to use whenever interpolation between latents is needed. We measure image quality using FID and CLIP embedding distance and show experimentally that baseline interpolation methods lead to a drop in quality metrics long before the degeneration issue is clearly visible. In contrast, our method significantly reduces the degeneration effect and also improves quality metrics in non-degenerate situations.
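As a rough illustration of why mixing many latents can go wrong, the sketch below averages N independent standard Gaussian vectors and compares the norm of the result with the norm expected of a single latent. The abstract does not spell out the cause it identifies or the exact normalization it proposes, so everything here is an assumption: the i.i.d. Gaussian stand-ins for inverted latents, the uniform weights, and the helper names `mix_latents`, `rescale_to_norm`, and `demo` are hypothetical, and the rescaling step is a generic norm correction rather than the authors' scheme.

```python
import math
import random


def mix_latents(latents, weights):
    """Weighted linear combination of latent vectors (lists of floats)."""
    dim = len(latents[0])
    return [sum(w * z[i] for w, z in zip(weights, latents)) for i in range(dim)]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def rescale_to_norm(v, target):
    """Rescale v to a target norm (a generic fix, not the paper's method)."""
    n = norm(v)
    return [x * target / n for x in v]


def demo(num_inputs, dim=1024, seed=0):
    """Return (norm of plain average, norm after rescaling)."""
    rng = random.Random(seed)
    # Assumption: inverted latents behave like i.i.d. standard Gaussian vectors.
    latents = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
               for _ in range(num_inputs)]
    weights = [1.0 / num_inputs] * num_inputs
    mixed = mix_latents(latents, weights)
    # A standard Gaussian vector in `dim` dimensions has norm close to sqrt(dim).
    fixed = rescale_to_norm(mixed, math.sqrt(dim))
    return norm(mixed), norm(fixed)


if __name__ == "__main__":
    for n in (1, 2, 4, 16, 64):
        raw, fixed = demo(n)
        print(f"N={n:3d}  plain norm={raw:7.2f}  rescaled norm={fixed:7.2f}")
```

Under these assumptions the plain average has variance 1/N per coordinate, so its norm shrinks roughly like sqrt(dim / N) as more inputs are mixed, while the rescaled vector stays at sqrt(dim). Real inverted latents are not exactly independent Gaussians, so this only hints at the kind of effect a normalization scheme would need to correct.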
@article{landolsi2025_2505.07481,
  title={Addressing degeneracies in latent interpolation for diffusion models},
  author={Erik Landolsi and Fredrik Kahl},
  journal={arXiv preprint arXiv:2505.07481},
  year={2025}
}