CreativeSynth: Cross-Art-Attention for Artistic Image Synthesis with Multimodal Diffusion

25 January 2024
Nisha Huang
Weiming Dong
Yuxin Zhang
Fan Tang
Ronghui Li
Chongyang Ma
Xiu Li
Tong-Yee Lee
Changsheng Xu
Abstract

Although remarkable progress has been made in image style transfer, style is just one component of artistic paintings. Directly transferring extracted style features to natural images often results in outputs with obvious synthetic traces, because key painting attributes including layout, perspective, shape, and semantics often cannot be conveyed through style transfer alone. Large-scale pretrained text-to-image generation models have demonstrated their capability to synthesize a vast amount of high-quality images. However, even with extensive textual descriptions, it is challenging to fully express the unique visual properties and details of paintings. Moreover, generic models often disrupt the overall artistic effect when modifying specific areas, making it more complicated to achieve a unified aesthetic in artworks. Our main novel idea is to integrate multimodal semantic information as a synthesis guide into artworks, rather than transferring style to the real world. We also aim to reduce the disruption to the harmony of artworks while simplifying the guidance conditions. Specifically, we propose an innovative multi-task unified framework called CreativeSynth, based on a diffusion model with the ability to coordinate multimodal inputs. CreativeSynth combines multimodal features with customized attention mechanisms to seamlessly integrate real-world semantic content into the art domain through Cross-Art-Attention for aesthetic maintenance and semantic fusion. We demonstrate the results of our method across a wide range of different art categories, proving that CreativeSynth bridges the gap between generative models and artistic expression. Code and results are available at this https URL.
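To make the idea of "customized attention" concrete, the following is a minimal, illustrative sketch of a generic cross-attention block of the kind the abstract describes: queries come from the artwork branch, keys and values come from a multimodal semantic branch (e.g. text or image embeddings), and a residual blend preserves the artwork's aesthetics. All names, shapes, and the blending weight `lambda_semantic` are assumptions for illustration only, not the authors' released CreativeSynth implementation.

```python
# Hypothetical sketch of a cross-attention fusion block (not the paper's code).
import torch
import torch.nn as nn


class CrossArtAttentionSketch(nn.Module):
    def __init__(self, dim: int, context_dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        # Queries from the artwork (aesthetic) branch; keys/values from the
        # semantic (text or real-image) branch.
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(context_dim, dim, bias=False)
        self.to_v = nn.Linear(context_dim, dim, bias=False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, art_tokens, semantic_tokens, lambda_semantic: float = 0.5):
        b, n, _ = art_tokens.shape
        h = self.num_heads
        q = self.to_q(art_tokens).view(b, n, h, -1).transpose(1, 2)
        k = self.to_k(semantic_tokens).view(b, -1, h, q.shape[-1]).transpose(1, 2)
        v = self.to_v(semantic_tokens).view(b, -1, h, q.shape[-1]).transpose(1, 2)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        fused = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        # Residual blend: keep the artwork's tokens, add semantic guidance.
        return art_tokens + lambda_semantic * self.to_out(fused)


if __name__ == "__main__":
    block = CrossArtAttentionSketch(dim=320, context_dim=768)
    art = torch.randn(1, 64, 320)   # artwork latent tokens (assumed shape)
    sem = torch.randn(1, 77, 768)   # e.g. CLIP-style text/image embeddings
    print(block(art, sem).shape)    # torch.Size([1, 64, 320])
```

In such a design, a smaller `lambda_semantic` would favor aesthetic maintenance while a larger value would favor semantic fusion; the paper's actual conditioning and weighting scheme may differ.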

@article{huang2025_2401.14066,
  title={CreativeSynth: Cross-Art-Attention for Artistic Image Synthesis with Multimodal Diffusion},
  author={Nisha Huang and Weiming Dong and Yuxin Zhang and Fan Tang and Ronghui Li and Chongyang Ma and Xiu Li and Tong-Yee Lee and Changsheng Xu},
  journal={arXiv preprint arXiv:2401.14066},
  year={2025}
}