DALLE-2 is Seeing Double: Flaws in Word-to-Concept Mapping in Text2Image Models

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022

19 October 2022

Papers citing "DALLE-2 is Seeing Double: Flaws in Word-to-Concept Mapping in Text2Image Models"

40 / 40 papers shown

Un-Doubling Diffusion: LLM-guided Disambiguation of Homonym Duplication Evgeny Kaskov Elizaveta Petrova Petr Surovtsev Anna Kostikova Ilya Mistiurin A. Kapitanov Alexander Nagaev DiffM 225 0 0 25 Sep 2025
Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models Guy Kaplan Michael Toker Yuval Reif Yonatan Belinkov Roy Schwartz DiffM 319 2 0 01 Apr 2025
Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation H. Seo Junseo Bang Haechang Lee Joohoon Lee Byung Hyun Lee Se Young Chun 264 0 0 29 Mar 2025
Scaling Down Semantic Leakage: Investigating Associative Bias in Smaller Language Models Veronika Smilga 157 1 0 11 Jan 2025
GRADE: Quantifying Sample Diversity in Text-to-Image Models Royi Rassin Aviv Slobodkin Shauli Ravfogel Yanai Elazar Yoav Goldberg 793 5 0 29 Oct 2024
Swing-by Dynamics in Concept Learning and Compositional Generalization International Conference on Learning Representations (ICLR), 2024 Yongyi Yang Core Francisco Park Ekdeep Singh Lubana Maya Okawa Wei Hu Hidenori Tanaka CoGe DiffM 234 0 0 10 Oct 2024
TextureMeDefect: LLM-based Defect Texture Generation for Railway Components on Mobile Devices Rahatara Ferdousi M. Anwar Hossain Abdulmotaleb El Saddik 74 1 0 07 Oct 2024
Classification-Denoising Networks Louis Thiry Florentin Guth 240 1 0 04 Oct 2024
Inverse Painting: Reconstructing The Painting Process ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2024 B. Chen Yifan Wang Brian L. Curless Ira Kemelmacher-Shlizerman Steven M. Seitz DiffM 212 5 0 30 Sep 2024
DefectTwin: When LLM Meets Digital Twin for Railway Defect Inspection Rahatara Ferdousi M. Anwar Hossain Chunsheng Yang Abdulmotaleb El Saddik 75 6 0 26 Aug 2024
Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Hila Gonen Terra Blevins Alisa Liu Luke Zettlemoyer Noah A. Smith 412 10 0 12 Aug 2024
Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space Core Francisco Park Maya Okawa Andrew Lee Ekdeep Singh Lubana Hidenori Tanaka 299 25 0 27 Jun 2024
Make It Count: Text-to-Image Generation with an Accurate Number of Objects Computer Vision and Pattern Recognition (CVPR), 2024 Lital Binyamin Yoad Tewel Hilit Segev Eran Hirsch Royi Rassin Gal Chechik 204 30 0 14 Jun 2024
DiffusionPID: Interpreting Diffusion via Partial Information Decomposition Neural Information Processing Systems (NeurIPS), 2024 Shaurya Dewan Rushikesh Zawar Prakanshul Saxena Yingshan Chang Andrew F. Luo Yonatan Bisk DiffM 313 8 0 07 Jun 2024
Text Guided Image Editing with Automatic Concept Locating and Forgetting Jia Li Lijie Hu Zhixian He Jingfeng Zhang Tianhang Zheng Haiyan Zhao DiffM 192 13 0 30 May 2024
Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images IEEE Transactions on Human-Machine Systems (IEEE Trans. Human-Machine Syst.), 2024 Memoona Aziz Umair Rehman Muhammad Umair Danish Katarina Grolinger EGVM 157 15 0 15 May 2024
Language in Vivo vs. in Silico: Size Matters but Larger Language Models Still Do Not Comprehend Language on a Par with Humans Due to Impenetrable Semantic Reference Vittoria Dentella Fritz Guenther Evelina Leivada ELM 302 5 0 23 Apr 2024
Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control Maria Mihaela Truşcǎ Wolf Nuyts Jonathan Thomm Robert Honig Thomas Hofmann Tinne Tuytelaars Marie-Francine Moens 83 7 0 21 Apr 2024
Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models European Conference on Computer Vision (ECCV), 2024 Yasi Zhang Peiyu Yu Yingnian Wu DiffM 181 18 0 10 Apr 2024
Human-Centric Aware UAV Trajectory Planning in Search and Rescue Missions Employing Multi-Objective Reinforcement Learning with AHP and Similarity-Based Experience Replay Mahya Ramezani J. L. Sánchez-López 142 10 0 28 Feb 2024
Explicitly Representing Syntax Improves Sentence-to-layout Prediction of Unexpected Situations Transactions of the Association for Computational Linguistics (TACL), 2024 Wolf Nuyts Ruben Cartuyvels Marie-Francine Moens 273 2 0 25 Jan 2024
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment European Conference on Computer Vision (ECCV), 2023 Brian Gordon Yonatan Bitton Yonatan Shafir Roopal Garg Xi Chen Dani Lischinski Daniel Cohen-Or Idan Szpektor 186 17 0 05 Dec 2023
The Quo Vadis of the Relationship between Language and Large Language Models Evelina Leivada Vittoria Dentella Elliot Murphy 185 6 0 17 Oct 2023
Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task Neural Information Processing Systems (NeurIPS), 2023 Maya Okawa Ekdeep Singh Lubana Robert P. Dick Hidenori Tanaka CoGe DiffM 425 81 0 13 Oct 2023
Exploring Human's Gender Perception and Bias toward Non-Humanoid Robots Mahya Ramezani Jose Luis Sanchez-Lopez 108 0 0 21 Sep 2023
Total Selfie: Generating Full-Body Selfies Computer Vision and Pattern Recognition (CVPR), 2023 B. Chen Brian L. Curless Ira Kemelmacher-Shlizerman S. M. Seitz DiffM 143 7 0 28 Aug 2023
Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment Neural Information Processing Systems (NeurIPS), 2023 Royi Rassin Eran Hirsch Daniel Glickman Shauli Ravfogel Yoav Goldberg Gal Chechik DiffM 446 146 0 15 Jun 2023
The Hidden Language of Diffusion Models International Conference on Learning Representations (ICLR), 2023 Hila Chefer Oran Lang Mor Geva Volodymyr Polosukhin Assaf Shocher Michal Irani Inbar Mosseri Lior Wolf DiffM 279 32 0 01 Jun 2023
Transferring Visual Attributes from Natural Language to Verified Image Generation Rodrigo Valerio João Bordalo Michal Yarom Yonattan Bitton Idan Szpektor João Magalhães 143 5 0 24 May 2023
What You See is What You Read? Improving Text-Image Alignment Evaluation Neural Information Processing Systems (NeurIPS), 2023 Michal Yarom Yonatan Bitton Soravit Changpinyo Roee Aharoni Jonathan Herzig Oran Lang E. Ofek Idan Szpektor EGVM 447 115 0 17 May 2023
Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Sunjae Kwon Rishabh Garodia Linghe Wang Zhichao Yang Hong-ye Yu CoGe 265 5 0 02 May 2023
DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion IEEE International Conference on Computer Vision (ICCV), 2023 J. Karras Aleksander Holynski Ting-Chun Wang Ira Kemelmacher-Shlizerman DiffM VGen 265 195 0 12 Apr 2023
Testing AI on language comprehension tasks reveals insensitivity to underlying meaning Scientific Reports (Sci Rep), 2023 Vittoria Dentella Fritz Guenther Elliot Murphy G. Marcus Evelina Leivada ELM 302 50 0 23 Feb 2023
Teaching CLIP to Count to Ten IEEE International Conference on Computer Vision (ICCV), 2023 Roni Paiss Ariel Ephrat Omer Tov Shiran Zada Inbar Mosseri Michal Irani Tali Dekel VLM CLIP 343 151 0 23 Feb 2023
Affect-Conditioned Image Generation IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2023 F. Ibarrola R. Lulham Kazjon Grace DiffM 135 5 0 20 Feb 2023
Auditing Gender Presentation Differences in Text-to-Image Models Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO), 2023 Yanzhe Zhang Lu Jiang Greg Turk Diyi Yang EGVM 261 27 0 07 Feb 2023
When are Lemons Purple? The Concept Association Bias of Vision-Language Models Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Yutaro Yamada Yingtian Tang Yoyo Zhang Ilker Yildirim CoGe 233 21 0 22 Dec 2022
Schrödinger's Bat: Diffusion Models Sometimes Generate Polysemous Words in Superposition Jennifer C. White Robert Bamler DiffM 162 7 0 23 Nov 2022
Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Daniel Fried Nicholas Tomlin Jennifer Hu Roma Patel Aida Nematzadeh 183 9 0 15 Nov 2022
DALL-E 2 Fails to Reliably Capture Common Syntactic Processes Evelina Leivada Elliot Murphy G. Marcus 269 44 0 23 Oct 2022