Title |
---|
![]() MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Haotian Zhang Mingfei Gao Zhe Gan Philipp Dufter Nina Wenzel ...Haoxuan You Zirui Wang Afshin Dehghan Peter Grasch Yinfei Yang |
![]() PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Weifeng Lin Xinyu Wei Renrui Zhang Le Zhuo Shitian Zhao ...Junlin Xie Junlin Xie Yu Qiao Peng Gao Hongsheng Li |
![]() MMSearch: Benchmarking the Potential of Large Models as Multi-modal
Search Engines Dongzhi Jiang Renrui Zhang Ziyu Guo Yanmin Wu Jiayi Lei ...Guanglu Song Peng Gao Yu Liu Chunyuan Li Hongsheng Li |
![]() MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large
Language Model Zhen Yang Jinhao Chen Zhengxiao Du Wenmeng Yu Weihan Wang Wenyi Hong Zhihuan Jiang Bin Xu Yuxiao Dong Jie Tang |
![]() xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Le Xue Manli Shu Anas Awadalla Jun Wang An Yan ...Zeyuan Chen Silvio Savarese Juan Carlos Niebles Caiming Xiong Ran Xu |