Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.06158
Cited By
OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning
12 August 2024
Mushui Liu
Bozheng Li
Yunlong Yu
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning"
14 / 14 papers shown
Title
VEU-Bench: Towards Comprehensive Understanding of Video Editing
Bozheng Li
Y. Wu
Yi Lu
Jiashuo Yu
Licheng Tang
Jiawang Cao
Wenqing Zhu
Yuyang Sun
Jay Wu
Wenbo Zhu
34
0
0
24 Apr 2025
Order Matters: On Parameter-Efficient Image-to-Video Probing for Recognizing Nearly Symmetric Actions
Thinesh Thiyakesan Ponbagavathi
Alina Roitberg
34
0
0
31 Mar 2025
On the Limitations of Vision-Language Models in Understanding Image Transforms
Ahmad Mustafa Anis
Hasnain Ali
Saquib Sarfraz
VLM
Presented at
ResearchTrend Connect | VLM
on
28 Mar 2025
139
0
0
12 Mar 2025
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
37
7
0
23 Jan 2025
First-place Solution for Streetscape Shop Sign Recognition Competition
Bin Wang
Li Jing
67
0
0
06 Jan 2025
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance
Wenhao Sun
Benlei Cui
Xue-mei Dong
Jingqun Tang
Yi Liu
DiffM
113
12
0
17 Dec 2024
Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition
Bozheng Li
Mushui Liu
Gaoang Wang
Yunlong Yu
15
5
0
22 Aug 2024
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Wenhao Wu
Xiaohan Wang
Haipeng Luo
Jingdong Wang
Yi Yang
Wanli Ouyang
94
47
0
31 Dec 2022
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu
Zhun Sun
Wanli Ouyang
VLM
87
93
0
04 Jul 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
142
360
0
24 Jan 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
245
554
0
28 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
149
360
0
17 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
VLM
129
127
0
03 Mar 2020
1