Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.19874
Cited By
StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation
26 May 2025
Yi Wu
Lingting Zhu
Shengju Qian
Lei Liu
Wandi Qiao
Lequan Yu
Bin Li
Re-assign community
ArXiv
PDF
HTML
Papers citing
"StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation"
18 / 18 papers shown
Title
DanceGRPO: Unleashing GRPO on Visual Generation
Zeyue Xue
Jie Wu
Yu Gao
Fangyuan Kong
Lingting Zhu
...
Zhiheng Liu
Wei Liu
Qiushan Guo
Weilin Huang
Ping Luo
EGVM
VGen
73
3
0
12 May 2025
Personalized Text-to-Image Generation with Auto-Regressive Models
Kaiyue Sun
Xian Liu
Yao Teng
Xihui Liu
67
1
0
17 Apr 2025
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Junke Wang
Zhi Tian
Xinyu Wang
Xinyu Zhang
Weilin Huang
Zuxuan Wu
Yu Jiang
VGen
118
13
0
15 Apr 2025
Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation
Yi Wu
Lingting Zhu
Lei Liu
Wandi Qiao
Ziqiang Li
Lequan Yu
Bin Li
DiffM
78
1
0
13 Mar 2025
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Xiaokang Chen
Zhiyu Wu
Xingchao Liu
Zizheng Pan
Wen Liu
Zhenda Xie
X. Yu
Chong Ruan
AI4TS
68
126
0
29 Jan 2025
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
107
54
0
05 Aug 2024
StyleShot: A Snapshot on Any Style
Junyao Gao
Yanchen Liu
Yanan Sun
Yinhao Tang
Yanhong Zeng
Kai Chen
Cairong Zhao
TTA
3DH
VLM
101
17
0
01 Jul 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
113
290
0
16 May 2024
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Haofan Wang
Matteo Spinelli
Qixun Wang
Xu Bai
Zekui Qin
Anthony Chen
DiffM
71
89
0
03 Apr 2024
Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace
Meihua Dang
Rafael Rafailov
Linqi Zhou
Aaron Lou
Senthil Purushwalkam
Stefano Ermon
Caiming Xiong
Shafiq Joty
Nikhil Naik
EGVM
79
251
0
21 Nov 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
84
360
0
12 Apr 2023
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
196
2,789
0
25 Aug 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
283
3,458
0
29 Apr 2022
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
136
1,399
0
07 Mar 2022
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
135
1,003
0
04 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
681
28,659
0
26 Feb 2021
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
161
4,928
0
02 Nov 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
453
129,831
0
12 Jun 2017
1