Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models

22 March 2025
Ketan Suhaas Saichandran
Xavier Thomas
Prakhar Kaushik
Deepti Ghadiyaram
Abstract

Text-to-image generative models often struggle with long prompts that detail complex scenes and diverse objects with distinct visual characteristics and spatial relationships. In this work, we propose SCoPE (Scheduled interpolation of Coarse-to-fine Prompt Embeddings), a training-free method that improves text-to-image alignment by progressively refining the input prompt in a coarse-to-fine manner. Given a detailed input prompt, we first decompose it into multiple sub-prompts that evolve from describing the broad scene layout to highly intricate details. During inference, we interpolate between these sub-prompts and thus progressively introduce finer-grained details into the generated image. Our training-free, plug-and-play approach significantly enhances prompt alignment and achieves an average improvement of up to +4% in Visual Question Answering (VQA) scores over the Stable Diffusion baselines on 85% of the prompts from the GenAI-Bench dataset.
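
The coarse-to-fine interpolation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual schedule, the text encoder, or how the sub-prompts are produced, so everything below is an assumption made for illustration: the function name `interpolated_embedding` is hypothetical, the schedule is a plain linear one that moves evenly from the coarsest to the finest sub-prompt across the denoising steps, and the embeddings are toy vectors rather than real text-encoder outputs.

```python
from typing import List, Sequence

Vector = List[float]


def interpolated_embedding(
    sub_prompt_embeddings: Sequence[Sequence[float]],
    step: int,
    num_steps: int,
) -> Vector:
    """Blend coarse-to-fine sub-prompt embeddings for one denoising step.

    Hypothetical helper, not the paper's implementation.
    Assumption: embeddings are ordered coarsest first, finest last, and the
    schedule is linear over the denoising steps.
    """
    if not sub_prompt_embeddings:
        raise ValueError("at least one sub-prompt embedding is required")
    if not 0 <= step < num_steps:
        raise ValueError("step must satisfy 0 <= step < num_steps")

    k = len(sub_prompt_embeddings)
    # With a single sub-prompt or a single step there is nothing to blend,
    # so fall back to the most detailed embedding.
    if k == 1 or num_steps == 1:
        return list(sub_prompt_embeddings[-1])

    # Map the step index onto the range [0, k - 1] of sub-prompt positions.
    position = step / (num_steps - 1) * (k - 1)
    lower = min(int(position), k - 2)  # index of the coarser neighbour
    weight = position - lower          # share given to the finer neighbour

    coarse = sub_prompt_embeddings[lower]
    fine = sub_prompt_embeddings[lower + 1]
    return [(1.0 - weight) * c + weight * f for c, f in zip(coarse, fine)]


# Toy example: three sub-prompts, from broad layout to fine detail.
toy_embeddings = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
schedule = [interpolated_embedding(toy_embeddings, t, 5) for t in range(5)]
```

In this sketch, early denoising steps are conditioned mostly on the coarse sub-prompt and later steps mostly on the detailed one. In a real pipeline the blended vector would replace the fixed prompt embedding passed to the denoiser at each step; how SCoPE actually schedules that hand-off is specified in the paper, not here.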

@article{saichandran2025_2503.17794,
  title={Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models},
  author={Ketan Suhaas Saichandran and Xavier Thomas and Prakhar Kaushik and Deepti Ghadiyaram},
  journal={arXiv preprint arXiv:2503.17794},
  year={2025}
}