Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.13505
Cited By
A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition
23 March 2023
Andong Deng
Taojiannan Yang
C. L. P. Chen
AI4TS
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition"
10 / 10 papers shown
Title
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Yilun Zhao
Lujing Xie
Haowei Zhang
Guo Gan
Yitao Long
...
Xiangru Tang
Zhenwen Liang
Y. Liu
Chen Zhao
Arman Cohan
45
5
0
21 Jan 2025
Self-Chained Image-Language Model for Video Localization and Question Answering
Shoubin Yu
Jaemin Cho
Prateek Yadav
Mohit Bansal
31
129
0
11 May 2023
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu
Zhun Sun
Wanli Ouyang
VLM
87
93
0
04 Jul 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
218
682
0
13 Oct 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
149
360
0
17 Sep 2021
VidTr: Video Transformer Without Convolutions
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
124
193
0
23 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
MVFNet: Multi-View Fusion Network for Efficient Video Recognition
Wenhao Wu
Dongliang He
Tianwei Lin
Fu Li
Chuang Gan
Errui Ding
85
68
0
13 Dec 2020
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
198
371
0
19 Oct 2020
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
119
495
0
24 Apr 2018
1