Title |
---|
![]() End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting Yongqi Wang Xinxiao Wu Shuo Yang Jiebo Luo |
![]() A Survey on Image-text Multimodal Models Ruifeng Guo Jingxuan Wei Linzhuang Sun Khai Le-Duc Guiyong Chang Dawei Liu Sibo Zhang Zhengbing Yao Mingjun Xu Liping Bu |