MedDiff-FT: Data-Efficient Diffusion Model Fine-tuning with Structural Guidance for Controllable Medical Image Synthesis

Recent advances in deep learning for medical image segmentation are often limited by the scarcity of high-quality training data. While diffusion models provide a potential solution by generating synthetic images, their effectiveness in medical imaging remains constrained by their reliance on large-scale medical datasets and the need for high image quality. To address these challenges, we present MedDiff-FT, a controllable medical image generation method that fine-tunes a diffusion foundation model to produce medical images with structural dependency and domain specificity in a data-efficient manner. During inference, a dynamic adaptive guiding mask enforces spatial constraints to ensure anatomically coherent synthesis, while a lightweight stochastic mask generator enhances diversity through hierarchical randomness injection. Additionally, an automated quality assessment protocol filters suboptimal outputs using feature-space metrics, followed by mask corrosion to refine fidelity. Evaluated on five medical segmentation datasets, MedDiff-FT's synthetic image-mask pairs improve the segmentation performance of SOTA methods by an average of 1% in Dice score. The framework effectively balances generation quality, diversity, and computational efficiency, offering a practical solution for medical data augmentation. The code is available at this https URL.
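The abstract does not spell out how the stochastic mask generator works beyond "hierarchical randomness injection". A minimal sketch of one plausible reading, assuming noise grids sampled at several resolutions are blended and thresholded into a binary lesion-like mask (the function name, scales, weights, and threshold below are all illustrative, not the paper's design), might look like:

```python
import numpy as np
from scipy import ndimage

def random_mask(size=256, scales=(8, 32, 128), weights=(1.0, 0.5, 0.25),
                threshold=0.6, rng=None):
    """Sample a binary mask by summing noise grids injected at several
    resolutions: coarse grids set the overall shape, fine grids add detail.
    All hyperparameters here are illustrative assumptions."""
    rng = np.random.default_rng(rng)
    field = np.zeros((size, size))
    for scale, w in zip(scales, weights):
        noise = rng.random((scale, scale))
        # Upsample the coarse noise grid to full resolution (bilinear).
        field += w * ndimage.zoom(noise, size / scale, order=1)
    # Normalize to [0, 1] and threshold into a binary region.
    field = (field - field.min()) / (np.ptp(field) + 1e-8)
    mask = field > threshold
    # Keep only the largest connected component for one coherent region.
    labels, n = ndimage.label(mask)
    if n > 1:
        sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
        mask = labels == (1 + int(np.argmax(sizes)))
    return mask.astype(np.uint8)
```

Under this reading, the coarser grids dominate the overall region shape while the finer grids perturb its boundary, which is one way hierarchical randomness could translate into mask diversity.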
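Similarly, the abstract names feature-space filtering followed by mask corrosion (i.e., morphological erosion) without specifying the metric or structuring element. A hedged sketch, assuming a cosine distance in some feature space and a caller-supplied encoder (`extract_features`, `reference_feature`, `dist_threshold`, and `erosion_iters` are all assumptions, not the paper's protocol), could be:

```python
import numpy as np
from scipy import ndimage

def feature_distance(feat_a, feat_b):
    """Cosine distance between two feature vectors (the metric itself is
    an assumption; the paper only says 'feature-space metrics')."""
    a = feat_a / (np.linalg.norm(feat_a) + 1e-8)
    b = feat_b / (np.linalg.norm(feat_b) + 1e-8)
    return 1.0 - float(a @ b)

def filter_and_refine(samples, extract_features, reference_feature,
                      dist_threshold=0.3, erosion_iters=2):
    """Discard (image, mask) pairs far from the reference in feature space,
    then erode each surviving mask to trim uncertain boundary pixels."""
    kept = []
    for image, mask in samples:
        if feature_distance(extract_features(image),
                            reference_feature) > dist_threshold:
            continue  # suboptimal output: drop the whole pair
        eroded = ndimage.binary_erosion(mask > 0, iterations=erosion_iters)
        kept.append((image, eroded.astype(mask.dtype)))
    return kept
```

In practice, `extract_features` would wrap whatever image encoder the pipeline already uses, and the distance threshold would be tuned against a held-out set of accepted and rejected samples.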
@article{xie2025_2507.00377,
  title={MedDiff-FT: Data-Efficient Diffusion Model Fine-tuning with Structural Guidance for Controllable Medical Image Synthesis},
  author={Jianhao Xie and Ziang Zhang and Zhenyu Weng and Yuesheng Zhu and Guibo Luo},
  journal={arXiv preprint arXiv:2507.00377},
  year={2025}
}