
FaR: Enhancing Multi-Concept Text-to-Image Diffusion via Concept Fusion and Localized Refinement

4 April 2025
Gia-Nghia Tran
Quang-Huy Che
Trong-Tai Dam Vu
Bich-Nga Pham
Vinh-Tiep Nguyen
Trung-Truc Huynh-Le
Minh-Triet Tran
Abstract

Generating multiple new concepts remains a challenging problem in text-to-image generation. Current methods often overfit when trained on a small number of samples and suffer from attribute leakage, particularly for class-similar subjects (e.g., two specific dogs). In this paper, we introduce Fuse-and-Refine (FaR), a novel approach that tackles these challenges through two key contributions: a Concept Fusion technique and a Localized Refinement loss function. Concept Fusion systematically augments the training data by separating reference subjects from their backgrounds and recombining them into composite images, increasing diversity. This augmentation mitigates overfitting caused by the narrow distribution of the limited training samples. In addition, the Localized Refinement loss preserves each subject's representative attributes by aligning each concept's attention map to its correct region. This effectively prevents attribute leakage by ensuring that the diffusion model distinguishes similar subjects without mixing their attention maps during denoising. By jointly fine-tuning specific modules, FaR balances the learning of new concepts with the retention of previously learned knowledge. Empirical results show that FaR not only prevents overfitting and attribute leakage while maintaining photorealism, but also outperforms other state-of-the-art methods.
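To make the Localized Refinement idea concrete, the sketch below shows one plausible way such a loss could be computed: each concept's cross-attention map is normalized to a distribution, and attention mass falling outside that concept's ground-truth region mask is penalized. This is a minimal illustration under our own assumptions, not the paper's actual implementation — the function name, the use of binary masks, and the exact penalty (1 minus the in-region attention mass) are all hypothetical.

```python
import numpy as np

def localized_refinement_loss(attn_maps, masks, eps=1e-8):
    """Hypothetical sketch of a localized-refinement-style loss.

    attn_maps: list of (H, W) non-negative cross-attention maps, one per concept.
    masks:     list of (H, W) binary masks marking each concept's correct region.
    Returns the mean, over concepts, of the attention mass that leaks
    outside the concept's region (0 = perfectly localized, 1 = fully leaked).
    """
    losses = []
    for attn, mask in zip(attn_maps, masks):
        attn = attn / (attn.sum() + eps)   # normalize map to a probability distribution
        inside = (attn * mask).sum()       # attention mass on the correct region
        losses.append(1.0 - inside)        # penalize mass that leaked elsewhere
    return float(np.mean(losses))
```

Minimizing such a term during fine-tuning would push each concept's attention toward its own region, which is one way to discourage the attention-map mixing between similar subjects that the abstract describes.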

@article{tran2025_2504.03292,
  title={FaR: Enhancing Multi-Concept Text-to-Image Diffusion via Concept Fusion and Localized Refinement},
  author={Gia-Nghia Tran and Quang-Huy Che and Trong-Tai Dam Vu and Bich-Nga Pham and Vinh-Tiep Nguyen and Trung-Nghia Le and Minh-Triet Tran},
  journal={arXiv preprint arXiv:2504.03292},
  year={2025}
}