Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.17946
Cited By
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback
29 November 2023
Jiao Sun
Deqing Fu
Yushi Hu
Su Wang
Royi Rassin
Da-Cheng Juan
Dana Alon
Charles Herrmann
Sjoerd van Steenkiste
Ranjay Krishna
Cyrus Rashtchian
EGVM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback"
32 / 32 papers shown
Title
Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map
Alessandro Simoni
Francesco Pelosin
38
0
0
06 May 2025
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
Mingcheng Li
Xiaolu Hou
Ziyang Liu
Dingkang Yang
Ziyun Qian
Jiawei Chen
Jinjie Wei
Y. Jiang
Qingyao Xu
L. Zhang
DiffM
44
0
0
05 May 2025
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
Chenhan Jiang
Yihan Zeng
Hang Xu
Dit-Yan Yeung
44
0
0
28 Apr 2025
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Shivam Duggal
Yushi Hu
Oscar Michel
Aniruddha Kembhavi
William T. Freeman
Noah A. Smith
Ranjay Krishna
Antonio Torralba
Ali Farhadi
Wei-Chiu Ma
EGVM
ELM
67
0
0
25 Apr 2025
Science-T2I: Addressing Scientific Illusions in Image Synthesis
Jialuo Li
Wenhao Chai
Xingyu Fu
Haiyang Xu
Saining Xie
MedIm
38
0
0
17 Apr 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Y. Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
54
0
0
21 Mar 2025
LEGION: Learning to Ground and Explain for Synthetic Image Detection
Hengrui Kang
Siwei Wen
Zichen Wen
Junyan Ye
Weijia Li
...
Baichuan Zhou
Bin Wang
D. Lin
Linfeng Zhang
Conghui He
42
0
0
19 Mar 2025
Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models!
Arash Marioriyad
Mohammadali Banayeeanzade
Reza Abbasi
M. Rohban
M. Baghshah
DiffM
61
3
0
28 Oct 2024
Scalable Ranked Preference Optimization for Text-to-Image Generation
Shyamgopal Karthik
Huseyin Coskun
Zeynep Akata
Sergey Tulyakov
J. Ren
Anil Kag
EGVM
52
4
0
23 Oct 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Rui Zhao
Hangjie Yuan
Yujie Wei
Shiwei Zhang
Yuchao Gu
...
Xiang Wang
Zhangjie Wu
Junhao Zhang
Yingya Zhang
Mike Zheng Shou
DiffM
VLM
50
2
0
09 Oct 2024
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Xinchen Zhang
Ling Yang
G. Li
Yaqi Cai
Jiake Xie
Yong Tang
Yujiu Yang
Mengdi Wang
Bin Cui
EGVM
CoGe
28
5
0
09 Oct 2024
TLDR: Token-Level Detective Reward Model for Large Vision Language Models
Deqing Fu
Tong Xiao
Rui Wang
Wang Zhu
Pengchuan Zhang
Guan Pang
Robin Jia
Lawrence Chen
55
5
0
07 Oct 2024
GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design
Phillip Mueller
Sebastian Mueller
Lars Mikelsons
23
1
0
25 Sep 2024
ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images
Abhinaw Jagtap
Nachiket Tapas
R. G. Brajesh
EGVM
18
0
0
18 Sep 2024
A Survey on LoRA of Large Language Models
Yuren Mao
Yuhang Ge
Yijiang Fan
Wenyi Xu
Yu Mi
Zhonghao Hu
Yunjun Gao
ALM
52
22
0
08 Jul 2024
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation
Baiqi Li
Zhiqiu Lin
Deepak Pathak
Jiayao Li
Yixin Fei
...
Tiffany Ling
Xide Xia
Pengchuan Zhang
Graham Neubig
Deva Ramanan
EGVM
42
24
0
19 Jun 2024
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Dongzhi Jiang
Guanglu Song
Xiaoshi Wu
Renrui Zhang
Dazhong Shen
Zhuofan Zong
Yu Liu
Hongsheng Li
VLM
28
20
0
04 Apr 2024
Evaluating Text-to-Visual Generation with Image-to-Text Generation
Zhiqiu Lin
Deepak Pathak
Baiqi Li
Jiayao Li
Xide Xia
Graham Neubig
Pengchuan Zhang
Deva Ramanan
EGVM
31
125
0
01 Apr 2024
Survey of Bias In Text-to-Image Generation: Definition, Evaluation, and Mitigation
Yixin Wan
Arjun Subramonian
Anaelia Ovalle
Zongyu Lin
Ashima Suvarna
Christina Chance
Hritik Bansal
Rebecca Pattichis
Kai-Wei Chang
EGVM
36
26
0
01 Apr 2024
VersaT2I: Improving Text-to-Image Models with Versatile Reward
Jianshu Guo
Wenhao Chai
Jie Deng
Hsiang-Wei Huang
Tianbo Ye
Yichen Xu
Jiawei Zhang
Jenq-Neng Hwang
Gaoang Wang
VLM
28
15
0
27 Mar 2024
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Oscar Manas
Pietro Astolfi
Melissa Hall
Candace Ross
Jack Urbanek
Adina Williams
Aishwarya Agrawal
Adriana Romero Soriano
M. Drozdzal
29
26
0
26 Mar 2024
AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
Jingkun An
Yinghao Zhu
Zongjian Li
Haoran Feng
Bohua Chen
Yemin Shi
Chengwei Pan
16
2
0
20 Mar 2024
Reward Guided Latent Consistency Distillation
Jiachen Li
Weixi Feng
Wenhu Chen
William Yang Wang
EGVM
21
11
0
16 Mar 2024
SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data
Jialu Li
Jaemin Cho
Yi-Lin Sung
Jaehong Yoon
Mohit Bansal
MoMe
DiffM
34
8
0
11 Mar 2024
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Xiwei Hu
Rui Wang
Yixiao Fang
Bin-Bin Fu
Pei Cheng
Gang Yu
VLM
54
39
0
08 Mar 2024
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Xinchen Zhang
Ling Yang
Yaqi Cai
Zhaochen Yu
Kai-Ni Wang
...
Ye Tian
Minkai Xu
Yong Tang
Yujiu Yang
Bin Cui
DiffM
22
5
0
20 Feb 2024
On Catastrophic Inheritance of Large Foundation Models
Hao Chen
Bhiksha Raj
Xing Xie
Jindong Wang
AI4CE
48
12
0
02 Feb 2024
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Ling Yang
Zhaochen Yu
Chenlin Meng
Minkai Xu
Stefano Ermon
Bin Cui
CoGe
DiffM
22
113
0
22 Jan 2024
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation
Yuval Kirstain
Adam Polyak
Uriel Singer
Shahbuland Matiana
Joe Penna
Omer Levy
EGVM
160
345
0
02 May 2023
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
Nitzan Bitton-Guetta
Yonatan Bitton
Jack Hessel
Ludwig Schmidt
Yuval Elovici
Gabriel Stanovsky
Roy Schwartz
VLM
121
65
0
13 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
1