
An Exploration of Default Images in Text-to-Image Generation

14 May 2025


Main: 22 pages; Bibliography: 6 pages; Appendix: 3 pages; 13 figures; 2 tables

Abstract

In the creative practice of text-to-image (TTI) generation, images are synthesized from textual prompts. By design, TTI models always yield an output, even if the prompt contains unknown terms. In such cases, the model may generate default images: images that closely resemble each other across many unrelated prompts. Studying default images is valuable for designing better solutions for prompt engineering and TTI generation. We present the first investigation into default images on Midjourney. We describe an initial study in which we manually created input prompts that trigger default images, followed by several ablation studies. Building on these, we conduct a computational analysis of over 750,000 images, revealing consistent default images across unrelated prompts. We also conduct an online user study investigating how default images may affect user satisfaction. Our work lays the foundation for understanding default images in TTI generation, highlighting their practical relevance as well as challenges and future research directions.
