2507.22058
Cited By
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again
29 July 2025
Zigang Geng
Y. Wang
Yeyao Ma
Chen Li
Yongming Rao
Shuyang Gu
Zhao Zhong
Qinglin Lu
Han Hu
Xiaosong Zhang
Linus
Di Wang
Jie Jiang
HuggingFace (35 upvotes)
Github (24237★)
Papers citing
"X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again"
11 / 11 papers shown
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
Yibin Wang
Zhimin Li
Yuhang Zang
Jiazi Bu
Yujie Zhou
...
Junjun He
Chunyu Wang
Qinglin Lu
Cheng Jin
J. Wang
EGVM
VLM
141
1
0
21 Oct 2025
Heptapod: Language Modeling on Visual Signals
Yongxin Zhu
J. Chen
Yuanzhe Chen
Zhuo Chen
Dongya Jia
Jian Cong
Xiaobin Zhuang
Yuping Wang
Yuping Wang
VLM
93
0
0
08 Oct 2025
Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models
Tianren Ma
Mu Zhang
Yibing Wang
Qixiang Ye
40
0
0
03 Oct 2025
Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models
Jitai Hao
Hao Liu
Xinyan Xiao
Qiang Huang
Jun Yu
56
0
0
29 Sep 2025
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
Yanghao Li
Rui Qian
Bowen Pan
Haotian Zhang
Haoshuo Huang
...
Zhengdong Zhang
Chen Chen
Yang Zhao
Ruoming Pang
Zhifeng Chen
MLLM
128
1
0
19 Sep 2025
Unified Multimodal Model as Auto-Encoder
Zhiyuan Yan
Kaiqing Lin
Zongjian Li
Junyan Ye
Hui Han
...
Xue Xu
Xinyan Xiao
Jingdong Wang
Haifeng Wang
Li Yuan
194
1
0
11 Sep 2025
Reconstruction Alignment Improves Unified Multimodal Models
Ji Xie
Trevor Darrell
Luke Zettlemoyer
Xudong Wang
94
6
0
08 Sep 2025
Reinforcement Learning in Vision: A Survey
Weijia Wu
Chen Gao
Joya Chen
Kevin Lin
Qingwei Meng
Yiming Zhang
Yuke Qiu
Hong Zhou
Mike Zheng Shou
165
2
0
11 Aug 2025
Qwen-Image Technical Report
Chenfei Wu
Jiahao Nick Li
Jingren Zhou
Junyang Lin
Kaiyuan Gao
...
Yichang Zhang
Yongqiang Zhu
Y. Wu
Yuxuan Cai
Zenan Liu
DiffM
VLM
120
92
0
04 Aug 2025
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Haokun Lin
Teng Wang
Yixiao Ge
Yuying Ge
Zhichao Lu
Ying Wei
Gang Qu
Zhenan Sun
Mingyu Ding
MLLM
VLM
326
27
0
08 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
779
17
0
05 May 2025