AnyArtisticGlyph: Multilingual Controllable Artistic Glyph Generation

Artistic Glyph Image Generation (AGIG) differs from current creativity-focused generation models by offering finely controllable, deterministic generation: it transfers the style of a reference image to a source image while preserving the source's content. Although current methods are promising, close inspection of their synthesized images often reveals blurred or incorrect textures, which remains a significant challenge. To address this, we introduce AnyArtisticGlyph, a diffusion-based, multilingual, controllable artistic glyph generation model. It includes a font fusion and embedding module, which generates latent features for detailed structure creation, and a vision-text fusion and embedding module, which uses the CLIP model to encode references and blends them with transformation caption embeddings for seamless global image generation. We also incorporate a coarse-grained feature-level loss to enhance generation accuracy. Experiments show that the model produces natural, detailed artistic glyph images with state-of-the-art performance. Our project will be open-sourced at this https URL to advance text generation technology.
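The abstract does not specify how the coarse-grained feature-level loss is computed. As a purely illustrative sketch, one common way to compare features at a coarse level is to average-pool two feature maps and take the mean squared error of the pooled values, so that fine, pixel-level differences are ignored. The function names `avg_pool` and `coarse_feature_loss`, the pooling size `k`, and the choice of mean squared error are all assumptions made for this example and are not taken from the paper.

```python
from typing import List

# Hypothetical representation: one 2D feature map (H x W) as nested lists.
Feature = List[List[float]]


def avg_pool(fmap: Feature, k: int) -> Feature:
    """Non-overlapping k x k average pooling; H and W must be divisible by k."""
    h, w = len(fmap), len(fmap[0])
    if h % k or w % k:
        raise ValueError("feature map size must be divisible by k")
    pooled = []
    for i in range(0, h, k):
        row = []
        for j in range(0, w, k):
            # Collect the k x k block whose top-left corner is (i, j).
            block = [fmap[a][b] for a in range(i, i + k) for b in range(j, j + k)]
            row.append(sum(block) / (k * k))
        pooled.append(row)
    return pooled


def coarse_feature_loss(pred: Feature, target: Feature, k: int = 2) -> float:
    """Mean squared error between the pooled (coarse) versions of two maps.

    Assumption for illustration only: the paper's actual loss may differ.
    """
    p, t = avg_pool(pred, k), avg_pool(target, k)
    diffs = [(x - y) ** 2 for pr, tr in zip(p, t) for x, y in zip(pr, tr)]
    return sum(diffs) / len(diffs)
```

Under this sketch, two maps that differ only inside a pooling window but share the same window average incur zero loss, which is the sense in which the comparison is coarse-grained.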
@article{lu2025_2504.04743,
  title={AnyArtisticGlyph: Multilingual Controllable Artistic Glyph Generation},
  author={Xiongbo Lu and Yaxiong Chen and Shengwu Xiong},
  journal={arXiv preprint arXiv:2504.04743},
  year={2025}
}