Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2212.10015
Cited By
Benchmarking Spatial Relationships in Text-to-Image Generation
20 December 2022
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Benchmarking Spatial Relationships in Text-to-Image Generation"
50 / 57 papers shown
Title
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
X. Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
57
0
0
05 May 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
N. Sebe
41
0
0
18 Apr 2025
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han
Yeonkyung Lee
Chanyoung Kim
Kwanghyun Park
Seong Jae Hwang
DiffM
58
0
0
28 Mar 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohsen Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
37
0
0
09 Mar 2025
Can Large Language Models Unveil the Mysteries? An Exploration of Their Ability to Unlock Information in Complex Scenarios
Chao Wang
Luning Zhang
Z. Wang
Yang Zhou
ELM
VLM
LRM
53
1
0
27 Feb 2025
Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation
Trevine Oorloff
Yaser Yacoob
Abhinav Shrivastava
46
0
0
24 Feb 2025
Evaluating the Generation of Spatial Relations in Text and Image Generative Models
Shang Hong Sim
Clarence Lee
A. Tan
Cheston Tan
EGVM
25
2
0
12 Nov 2024
Diffusion Beats Autoregressive: An Evaluation of Compositional Generation in Text-to-Image Models
Arash Marioriyad
Parham Rezaei
M. Baghshah
M. Rohban
CoGe
56
0
0
30 Oct 2024
Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models!
Arash Marioriyad
Mohammadali Banayeeanzade
Reza Abbasi
M. Rohban
M. Baghshah
DiffM
63
3
0
28 Oct 2024
KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities
Hsin-Ping Huang
X. Wang
Yonatan Bitton
Hagai Taitelbaum
Gaurav Singh Tomar
...
Xuhui Jia
Kelvin Chan
Hexiang Hu
Yu-Chuan Su
Ming Yang
EGVM
64
4
0
15 Oct 2024
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
Xiangru Zhu
Penglei Sun
Yaoxian Song
Yanghua Xiao
Zhixu Li
Chengyu Wang
Jun Huang
Bei Yang
Xiaoxiao Xu
EGVM
83
1
0
14 Oct 2024
NL-Eye: Abductive NLI for Images
Mor Ventura
Michael Toker
Nitay Calderon
Zorik Gekhman
Yonatan Bitton
Roi Reichart
28
1
0
03 Oct 2024
Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
David D. Yao
Wenpin Tang
30
1
0
12 Sep 2024
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models
Agneet Chatterjee
Yiran Luo
Tejas Gokhale
Yezhou Yang
Chitta Baral
LRM
22
5
0
05 Aug 2024
Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
Core Francisco Park
Maya Okawa
Andrew Lee
Ekdeep Singh Lubana
Hidenori Tanaka
50
6
0
27 Jun 2024
CUPID: Contextual Understanding of Prompt-conditioned Image Distributions
Yayan Zhao
Mingwei Li
Matthew Berger
DiffM
27
2
0
11 Jun 2024
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings
Olivia Wiles
Chuhan Zhang
Isabela Albuquerque
Ivana Kajić
Su Wang
...
Jordi Pont-Tuset
Aida Nematzadeh
Anant Nawalgaria
Jordi Pont-Tuset
Aida Nematzadeh
EGVM
115
13
0
25 Apr 2024
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
Agneet Chatterjee
Tejas Gokhale
Chitta Baral
Yezhou Yang
VLM
25
2
0
12 Apr 2024
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
Kanchana Ranasinghe
Satya Narayan Shukla
Omid Poursaeed
Michael S. Ryoo
Tsung-Yu Lin
LRM
38
21
0
11 Apr 2024
Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models
Kyuyoung Kim
Jongheon Jeong
Minyong An
Mohammad Ghavamzadeh
Krishnamurthy Dvijotham
Jinwoo Shin
Kimin Lee
EGVM
27
6
0
02 Apr 2024
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Agneet Chatterjee
Gabriela Ben-Melech Stan
Estelle Aflalo
Sayak Paul
Dhruba Ghosh
...
Ludwig Schmidt
Hanna Hajishirzi
Vasudev Lal
Chitta Baral
Yezhou Yang
EGVM
VLM
57
14
0
01 Apr 2024
Relation Rectification in Diffusion Model
Yinwei Wu
Xingyi Yang
Xinchao Wang
28
6
0
29 Mar 2024
Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation
Yingshan Chang
Yasi Zhang
Zhiyuan Fang
Yingnian Wu
Yonatan Bisk
Feng Gao
EGVM
26
4
0
25 Mar 2024
Discriminative Probing and Tuning for Text-to-Image Generation
Leigang Qu
Wenjie Wang
Yongqi Li
Hanwang Zhang
Liqiang Nie
Tat-Seng Chua
31
7
0
07 Mar 2024
When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability
Wenjie Xuan
Yufei Xu
Shanshan Zhao
Chaoyue Wang
Juhua Liu
Bo Du
Dacheng Tao
24
2
0
01 Mar 2024
Spatial-Aware Latent Initialization for Controllable Image Generation
Wenqiang Sun
Tengtao Li
Zehong Lin
Jun Zhang
19
10
0
29 Jan 2024
Explicitly Representing Syntax Improves Sentence-to-layout Prediction of Unexpected Situations
Wolf Nuyts
Ruben Cartuyvels
Marie-Francine Moens
23
1
0
25 Jan 2024
Large-scale Reinforcement Learning for Diffusion Models
Yinan Zhang
Eric Tzeng
Yilun Du
Dmitry Kislyuk
VLM
19
29
0
20 Jan 2024
GlitchBench: Can large multimodal models detect video game glitches?
Mohammad Reza Taesiri
Tianjun Feng
Anh Nguyen
C. Bezemer
MLLM
VLM
LRM
22
9
0
08 Dec 2023
TokenCompose: Text-to-Image Diffusion with Token-level Supervision
Zirui Wang
Zhizhou Sha
Zheng Ding
Yilin Wang
Zhuowen Tu
DiffM
16
21
0
06 Dec 2023
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon
Yonatan Bitton
Yonatan Shafir
Roopal Garg
Xi Chen
Dani Lischinski
Daniel Cohen-Or
Idan Szpektor
35
11
0
05 Dec 2023
A Contrastive Compositional Benchmark for Text-to-Image Synthesis: A Study with Unified Text-to-Image Fidelity Metrics
Xiangru Zhu
Penglei Sun
Chengyu Wang
Jingping Liu
Zhixu Li
Yanghua Xiao
Jun Huang
CoGe
100
5
0
04 Dec 2023
LucidDreaming: Controllable Object-Centric 3D Generation
Zhaoning Wang
Ming Li
C. L. P. Chen
26
10
0
30 Nov 2023
Unlocking Spatial Comprehension in Text-to-Image Diffusion Models
Mohammad Mahdi Derakhshani
Menglin Xia
Harkirat Singh Behl
Cees G. M. Snoek
Victor Rühle
11
2
0
28 Nov 2023
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
Biao Gong
Siteng Huang
Yutong Feng
Shiwei Zhang
Yuyuan Li
Yu Liu
DiffM
10
11
0
27 Nov 2023
What's "up" with vision-language models? Investigating their struggle with spatial reasoning
Amita Kamath
Jack Hessel
Kai-Wei Chang
LRM
CoGe
13
94
0
30 Oct 2023
Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task
Maya Okawa
Ekdeep Singh Lubana
Robert P. Dick
Hidenori Tanaka
CoGe
DiffM
35
43
0
13 Oct 2023
R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation
Jiayu Xiao
Henglei Lv
Liang Li
Shuhui Wang
Qingming Huang
DiffM
21
14
0
13 Oct 2023
EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods
Samyadeep Basu
Mehrdad Saberi
S. Bhardwaj
Atoosa Malemir Chegini
Daniela Massiceti
Maziar Sanjabi
S. Hu
S. Feizi
47
16
0
03 Oct 2023
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
Jonas Belouadi
Anne Lauscher
Steffen Eger
16
26
0
30 Sep 2023
From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition
Maan Qraitem
Kate Saenko
Bryan A. Plummer
29
4
0
08 Aug 2023
Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models
Zheng Ma
Mianzhi Pan
Wenhan Wu
Ka Leong Cheng
Jianbing Zhang
Shujian Huang
Jiajun Chen
VLM
CoGe
16
3
0
06 Aug 2023
TIAM -- A Metric for Evaluating Alignment in Text-to-Image Generation
P. Grimal
Hervé Le Borgne
Olivier Ferret
Julien Tourille
EGVM
29
10
0
11 Jul 2023
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models
Nan Liu
Yilun Du
Shuang Li
J. Tenenbaum
Antonio Torralba
DiffM
CoGe
6
24
0
08 Jun 2023
ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
Maitreya Patel
Tejas Gokhale
Chitta Baral
Yezhou Yang
CoGe
6
14
0
07 Jun 2023
Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search
Qihao Liu
Adam Kortylewski
Yutong Bai
Song Bai
Alan Yuille
DiffM
21
12
0
01 Jun 2023
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang
Yi Zhang
Vibhav Vineet
Neel Joshi
Xin Eric Wang
DiffM
14
41
0
29 May 2023
DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models
Ying Fan
Olivia Watkins
Yuqing Du
Hao Liu
Moonkyung Ryu
Craig Boutilier
Pieter Abbeel
Mohammad Ghavamzadeh
Kangwook Lee
Kimin Lee
28
133
0
25 May 2023
Visual Programming for Text-to-Image Generation and Evaluation
Jaemin Cho
Abhaysinh Zala
Mohit Bansal
MLLM
10
50
0
24 May 2023
Transferring Visual Attributes from Natural Language to Verified Image Generation
Rodrigo Valerio
João Bordalo
Michal Yarom
Yonattan Bitton
Idan Szpektor
João Magalhães
21
5
0
24 May 2023
1
2
Next