Title |
---|
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Zhengfeng Lai Vasileios Saveris C. L. P. Chen Hong-You Chen Haotian Zhang ...Wenze Hu Zhe Gan Peter Grasch Meng Cao Yinfei Yang |
ControlAR: Controllable Image Generation with Autoregressive Models Zongming Li Tianheng Cheng Shoufa Chen Peize Sun Haocheng Shen Longjin Ran Xiaoxin Chen Wenyu Liu Xinggang Wang |
Diffusion & Adversarial Schrödinger Bridges via Iterative Proportional Markovian Fitting Sergei Kholkin Grigoriy Ksenofontov David Li Nikita Kornilov Nikita Gushchin Alexandra Suvorikova Alexey Kroshnin Evgeny Burnaev Alexander Korotin |
Emu3: Next-Token Prediction is All You Need Xinlong Wang Xiaosong Zhang Zhengxiong Luo Quan-Sen Sun Yufeng Cui ...Xi Yang Jingjing Liu Yonghua Lin Tiejun Huang Zhongyuan Wang |
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Weifeng Lin Xinyu Wei Renrui Zhang Le Zhuo Shitian Zhao ...Junlin Xie Yu Qiao Peng Gao Hongsheng Li |