Title |
---|
![]() MMSearch: Benchmarking the Potential of Large Models as Multi-modal
Search Engines Dongzhi Jiang Renrui Zhang Ziyu Guo Yanmin Wu Jiayi Lei ...Guanglu Song Peng Gao Yu Liu Chunyuan Li Hongsheng Li |
![]() VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths
Vision Computation Shiwei Wu Joya Chen Kevin Qinghong Lin Qimeng Wang Yan Gao Qianli Xu Tong Bill Xu Yao Hu Enhong Chen Mike Zheng Shou |
![]() CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Zhuoyi Yang Jiayan Teng Wendi Zheng Ming Ding Shiyu Huang ...Weihan Wang Yean Cheng Xiaotao Gu Yuxiao Dong Jie Tang |