2502.20172
Cited By
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think
27 February 2025
L. Chen
S. Bai
Wenhao Chai
Weichu Xie
Haozhe Zhao
Leon Vinci
Junyang Lin
Baobao Chang
DiffM
HuggingFace (28 upvotes)
Papers citing
"Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think"
12 / 12 papers shown
LaTo: Landmark-tokenized Diffusion Transformer for Fine-grained Human Face Editing
Zhenghao Zhang
Ziying Zhang
Junchao Liao
Xiangyu Meng
Qiang Hu
Siyu Zhu
Xiaoyun Zhang
Long Qin
Weizhi Wang
44
0
0
30 Sep 2025
Query-Kontext: An Unified Multimodal Model for Image Generation and Editing
Yuxin Song
Wenkai Dong
Shizun Wang
Qi Zhang
Song Xue
...
H. Yang
Haocheng Feng
Hang Zhou
Xinyan Xiao
Jingdong Wang
DiffM
MLLM
49
0
0
30 Sep 2025
Does FLUX Already Know How to Perform Physically Plausible Image Composition?
Shilin Lu
Zhuming Lian
Zihan Zhou
Shaocong Zhang
Chen Zhao
A. Kong
89
10
0
25 Sep 2025
SeedEdit 3.0: Fast and High-Quality Generative Image Editing
Peng Wang
Yichun Shi
Xiaochen Lian
Zhonghua Zhai
Xin Xia
Xuefeng Xiao
Weilin Huang
Jianchao Yang
248
14
0
05 Jun 2025
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model
Qingyu Shi
Jinbin Bai
Zhuoran Zhao
Wenhao Chai
Kaidong Yu
...
Shuangyong Song
Yunhai Tong
Xiangtai Li
X. Li
Shuicheng Yan
189
11
0
29 May 2025
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Yu Zhang
Jialei Zhou
Xinchen Li
Tao Gui
Zhongwei Wan
Tianyu Wang
Duoqian Miao
Changwei Wang
LongBing Cao
DiffM
173
4
0
25 May 2025
Step1X-Edit: A Practical Framework for General Image Editing
Shixuan Liu
Yucheng Han
Peng Xing
Fukun Yin
Rui Wang
...
Yibo Zhu
Binxing Jiao
Wei Wei
Gang Yu
Daxin Jiang
DiffM
561
104
0
24 Apr 2025
Science-T2I: Addressing Scientific Illusions in Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2025
Jialuo Li
Wenhao Chai
Xingyu Fu
Haiyang Xu
Saining Xie
MedIm
150
5
0
17 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Lichao Sun
MLLM
EGVM
312
45
0
03 Apr 2025
DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models
Ruizhe Chen
Wenhao Chai
Zhifei Yang
Xiaotian Zhang
Qiufeng Wang
Tony Q.S. Quek
Soujanya Poria
Zuozhu Liu
266
2
0
06 Mar 2025
FairT2I: Mitigating Social Bias in Text-to-Image Generation via Large Language Model-Assisted Detection and Attribute Rebalancing
Jinya Sakurai
Issei Sato
269
3
0
06 Feb 2025
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
382
535
0
16 May 2024