Title |
---|
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Haotian Zhang Mingfei Gao Zhe Gan Philipp Dufter Nina Wenzel ...Haoxuan You Zirui Wang Afshin Dehghan Peter Grasch Yinfei Yang |
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations Can Qin Congying Xia Krithika Ramakrishnan Michael S Ryoo Lifu Tu ...Silvio Savarese Juan Carlos Niebles Zeyuan Chen Ran Xu Caiming Xiong |
Long Context Transfer from Language to Vision Peiyuan Zhang Kaichen Zhang Bo Li Guangtao Zeng Jingkang Yang Yuanhan Zhang Ziyue Wang Haoran Tan Chunyuan Li Ziwei Liu |