2503.10639
Cited By
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
13 March 2025
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
Shilin Yan
Hao Tian
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
Papers citing
"GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
3 / 3 papers shown
WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation
D. Zhang
Che Jiang
Ruoshi Xu
Biaoxiang Chen
Zijian Jin
Yutian Lu
Jianguo Zhang
Liang Yong
Jiebo Luo
Shengda Luo
VLM
24
16
0
02 May 2025
Step1X-Edit: A Practical Framework for General Image Editing
S. Liu
Yucheng Han
Peng Xing
Fukun Yin
Rui Wang
...
Yibo Zhu
Binxing Jiao
X. Zhang
Gang Yu
Daxin Jiang
DiffM
88
69
0
24 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
71
7
0
03 Apr 2025