
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets

Abstract

While large diffusion models are commonly trained on datasets collected for target downstream tasks, it is often desirable to align and finetune pretrained diffusion models with reward functions that are either designed by experts or learned from small-scale datasets. Existing post-training methods for reward finetuning of diffusion models typically suffer from a lack of diversity in generated samples, a lack of prior preservation, and/or slow convergence. In response to this challenge, we take inspiration from recent successes in generative flow networks (GFlowNets) and propose a reinforcement learning method for diffusion model finetuning, dubbed Nabla-GFlowNet (abbreviated as ∇-GFlowNet), that leverages the rich signal in reward gradients for probabilistic diffusion finetuning. We show that the proposed method achieves fast yet diversity- and prior-preserving finetuning of Stable Diffusion, a large-scale text-conditioned image diffusion model, on several realistic reward functions.
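For intuition about what "leveraging reward gradients" for diffusion finetuning can look like, below is a minimal PyTorch sketch of one reward-gradient-informed update on a toy two-dimensional diffusion policy. Everything in it (ToyDenoiser, the quadratic reward, the step count, the noise scale) is a hypothetical stand-in, and the loss is a generic REINFORCE-plus-pathwise objective, not the paper's actual ∇-GFlowNet objective, which additionally enforces GFlowNet balance conditions to preserve diversity and the pretrained prior.

# Illustrative sketch only; all names and hyperparameters are assumptions.
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Tiny stand-in for a diffusion denoising network."""
    def __init__(self, dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, dim)
        )

    def forward(self, x, t):
        # Condition on the scalar timestep by concatenation.
        t_feat = t.expand(x.shape[0], 1)
        return self.net(torch.cat([x, t_feat], dim=-1))

def reward(x):
    # Differentiable toy reward: prefer samples near the point (1, 1).
    return -((x - 1.0) ** 2).sum(dim=-1)

model = ToyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
num_steps, sigma = 10, 0.1

# Roll out a short stochastic denoising trajectory, tracking the policy
# log-probability of each sampled transition.
x = torch.randn(32, 2)
logp = torch.zeros(32)
for i in reversed(range(num_steps)):
    t = torch.full((1,), i / num_steps)
    mean = x + model(x, t)                        # predicted next-step mean
    x_next = mean + sigma * torch.randn_like(mean)
    # Gaussian log-density of the sampled transition, up to a constant.
    # Detaching x_next keeps this a score-function (REINFORCE) term, while
    # x_next itself stays differentiable for the pathwise term below.
    logp = logp - ((x_next.detach() - mean) ** 2).sum(-1) / (2 * sigma ** 2)
    x = x_next

r = reward(x)
# Two complementary gradient signals: a REINFORCE term weighted by the
# reward, and a pathwise term that backpropagates dR/dx through the
# sampler -- the "rich signal in reward gradients" the abstract refers to.
loss = -(logp * r.detach()).mean() - r.mean()
opt.zero_grad()
loss.backward()
opt.step()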

@article{liu2025_2412.07775,
  title={Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets},
  author={Zhen Liu and Tim Z. Xiao and Weiyang Liu and Yoshua Bengio and Dinghuai Zhang},
  journal={arXiv preprint arXiv:2412.07775},
  year={2025}
}