Is What You Ask For What You Get? Investigating Concept Associations in Text-to-Image Models

Text-to-image (T2I) models are increasingly used in impactful real-life applications. As such, there is a growing need to audit these models to ensure that they generate desirable, task-appropriate images. However, systematically inspecting the associations between prompts and generated content in a human-understandable way remains challenging. To address this, we propose Concept2Concept, a framework in which we characterize conditional distributions of vision-language models using interpretable concepts and metrics that can be defined in terms of these concepts. This characterization allows us to use our framework to audit both models and prompt datasets. To demonstrate, we investigate several case studies of conditional distributions of prompts, such as user-defined distributions and empirical, real-world distributions. Lastly, we implement Concept2Concept as an open-source interactive visualization tool to facilitate use by non-technical end-users. A demo is available at this https URL.
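To make the idea of characterizing a conditional distribution with interpretable concepts concrete, here is a minimal sketch. This is not the paper's actual implementation: the `generate` and `detect_concepts` functions are hypothetical stand-ins for a T2I model and a concept tagger, and the sketch simply estimates the frequency of each detected concept across images generated from a prompt distribution.

```python
from collections import Counter

def concept_distribution(prompts, generate, detect_concepts):
    """Estimate per-image concept frequencies induced by a prompt distribution.

    Hypothetical sketch: generate one image per prompt, tag each image with
    human-interpretable concepts, and normalize the concept counts.
    """
    counts = Counter()
    total = 0
    for prompt in prompts:
        image = generate(prompt)  # stand-in for a text-to-image model call
        for concept in detect_concepts(image):  # stand-in for a concept tagger
            counts[concept] += 1
        total += 1
    return {c: n / total for c, n in counts.items()}

# Toy stand-ins so the sketch runs end to end: the "image" is just the prompt
# text, and "detection" is keyword matching against a tiny concept vocabulary.
def generate(prompt):
    return prompt

def detect_concepts(image):
    return [w for w in image.split() if w in {"doctor", "stethoscope", "man"}]

dist = concept_distribution(
    ["a doctor with a stethoscope", "a man who is a doctor"],
    generate,
    detect_concepts,
)
print(dist)  # {'doctor': 1.0, 'stethoscope': 0.5, 'man': 0.5}
```

A real audit would replace the stand-ins with an actual T2I model and an image-level concept detector, but the summary object is the same: an interpretable distribution over concepts that can be compared against what the prompts asked for.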
@article{magid2025_2410.04634,
  title={Is What You Ask For What You Get? Investigating Concept Associations in Text-to-Image Models},
  author={Salma S. Abdel Magid and Weiwei Pan and Simon Warchol and Grace Guo and Junsik Kim and Mahia Rahman and Hanspeter Pfister},
  journal={arXiv preprint arXiv:2410.04634},
  year={2025}
}