2307.04192
Cited By
Self-Adaptive Sampling for Efficient Video Question-Answering on Image–Text Models
9 July 2023
Wei Han
Hui Chen
Min-Yen Kan
Soujanya Poria
Papers citing
"Self-Adaptive Sampling for Efficient Video Question-Answering on Image–Text Models"
3 / 3 papers shown
Title
VideoAgent: Long-form Video Understanding with Large Language Model as Agent
Xiaohan Wang
Yuhui Zhang
Orr Zohar
Serena Yeung-Levy
VLM
103
83
0
15 Mar 2024
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021