1712.04851
Cited By
v1
v2 (latest)
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
13 December 2017
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
3DH
Papers citing
"Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"
50 / 675 papers shown
Masked Autoencoder for Unsupervised Video Summarization
Minho Shim
Taeoh Kim
Jinhyung Kim
Dongyoon Wee
176
3
0
02 Jun 2023
Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning
European Conference on Computer Vision (ECCV), 2023
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LRM
LM&Ro
311
5
0
26 May 2023
Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective
Neurocomputing (Neurocomputing), 2023
Thanh-Dat Truong
Khoa Luu
EgoV
393
15
0
25 May 2023
TG-VQA: Ternary Game of Video Question Answering
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Hao Li
Peng Jin
Ze-Long Cheng
Songyang Zhang
Kai-xiang Chen
Zhennan Wang
Chang-rui Liu
Jie Chen
241
12
0
17 May 2023
Lightweight Delivery Detection on Doorbell Cameras
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Pirazh Khorramshahi
Zhe Wu
Tianchen Wang
Luke Deluccia
Hongcheng Wang
199
0
0
13 May 2023
Visual Tuning
ACM Computing Surveys (ACM Comput. Surv.), 2023
Bruce X. B. Yu
Jianlong Chang
Haixin Wang
Lin Liu
Shijie Wang
...
Lingxi Xie
Haojie Li
Zhouchen Lin
Qi Tian
Chang Wen Chen
VLM
440
60
0
10 May 2023
Improve Video Representation with Temporal Adversarial Augmentation
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Jinhao Duan
Quanfu Fan
Hao-Ran Cheng
Xiaoshuang Shi
Kaidi Xu
AAML
AI4TS
ViT
244
3
0
28 Apr 2023
SSTM: Spatiotemporal Recurrent Transformers for Multi-frame Optical Flow Estimation
Neurocomputing (Neurocomputing), 2023
Fisseha Admasu Ferede
M. Balasubramanian
135
4
0
26 Apr 2023
MRSN: Multi-Relation Support Network for Video Action Detection
IEEE International Conference on Multimedia and Expo (ICME), 2023
Yin-Dong Zheng
Guo Chen
Minglei Yuan
Tong Lu
272
10
0
24 Apr 2023
Implicit Temporal Modeling with Learnable Alignment for Video Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
S. Tu
Jingdong Sun
Zuxuan Wu
Zhi-Qi Cheng
Hang-Rui Hu
Yu-Gang Jiang
313
59
0
20 Apr 2023
Pretrained Language Models as Visual Planners for Human Assistance
IEEE International Conference on Computer Vision (ICCV), 2023
Dhruvesh Patel
H. Eghbalzadeh
Nitin Kamra
Michael L. Iuzzolino
Unnat Jain
Ruta Desai
LM&Ro
355
35
0
17 Apr 2023
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision
International Conference on Learning Representations (ICLR), 2023
Jiani Huang
Ziyang Li
Mayur Naik
Ser-Nam Lim
678
9
0
15 Apr 2023
Zoom-VQA: Patches, Frames and Clips Integration for Video Quality Assessment
Kai Zhao
Kun Yuan
Ming-Ting Sun
Xingsen Wen
184
29
0
13 Apr 2023
AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection
Wentao Zhu
Yufang Huang
Xi Xie
Wenxian Liu
Jincan Deng
Debing Zhang
Zinan Lin
Ji Liu
320
22
0
12 Apr 2023
Scallop: A Language for Neurosymbolic Programming
Ziyang Li
Jiani Huang
Mayur Naik
ReLM
LRM
NAI
218
58
0
10 Apr 2023
Hyperspectral Image Super-Resolution via Dual-domain Network Based on Hybrid Convolution
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Tingting Liu
Yuan Liu
Chun-liang Zhang
Liyin Yuan
Xiubao Sui
Qian Chen
SupR
546
52
0
10 Apr 2023
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens
International Conference on Learning Representations (ICLR), 2023
Ziteng Gao
Zhan Tong
Limin Wang
Mike Zheng Shou
181
16
0
07 Apr 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Computer Vision and Pattern Recognition (CVPR), 2023
Syed Talal Wasim
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
M. Shah
VLM
VPVLM
232
112
0
06 Apr 2023
Sketch-based Video Object Localization
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Sangmin Woo
So-Yeong Jeon
Jinyoung Park
Minji Son
Sumin Lee
Changick Kim
438
0
0
02 Apr 2023
DOAD: Decoupled One Stage Action Detection Network
Shuning Chang
Pichao Wang
Fan Wang
Jiashi Feng
Mike Zheng Shou
194
6
0
01 Apr 2023
Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations
Computer Vision and Pattern Recognition (CVPR), 2023
Yiwu Zhong
Licheng Yu
Yang Bai
Shangwen Li
Xueting Yan
Yin Li
AI4TS
253
47
0
31 Mar 2023
Streaming Video Model
Computer Vision and Pattern Recognition (CVPR), 2023
Yucheng Zhao
Chong Luo
Chuanxin Tang
DongDong Chen
Noel Codella
Zhengjun Zha
252
20
0
30 Mar 2023
What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
Computer Vision and Pattern Recognition (CVPR), 2023
Brian Chen
Nina Shvetsova
Andrew Rouditchenko
D. Kondermann
Samuel Thomas
Shih-Fu Chang
Rogerio Feris
James R. Glass
Hilde Kuehne
363
9
0
29 Mar 2023
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
195
3
0
28 Mar 2023
Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling
Computer Vision and Pattern Recognition (CVPR), 2023
Ryo Hachiuma
Fumiaki Sato
Taiki Sekii
3DPC
220
47
0
27 Mar 2023
Learning Action Changes by Measuring Verb-Adverb Textual Relationships
Computer Vision and Pattern Recognition (CVPR), 2023
Davide Moltisanti
Frank Keller
Hakan Bilen
Laura Sevilla-Lara
309
8
0
27 Mar 2023
A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Andong Deng
Taojiannan Yang
Chong Chen
AI4TS
244
18
0
23 Mar 2023
Natural Language-Assisted Sign Language Recognition
Computer Vision and Pattern Recognition (CVPR), 2023
Ronglai Zuo
Fangyun Wei
Brian Mak
SLR
230
81
0
21 Mar 2023
Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization
IEEE International Conference on Computer Vision (ICCV), 2023
Fida Mohammad Thoker
Hazel Doughty
Cees G. M. Snoek
ViT
348
12
0
20 Mar 2023
Dual-path Adaptation from Image to Video Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Jungin Park
Jiyoung Lee
Kwanghoon Sohn
ViT
250
57
0
17 Mar 2023
Video Action Recognition with Attentive Semantic Units
IEEE International Conference on Computer Vision (ICCV), 2023
Yifei Chen
Dapeng Chen
Ruijin Liu
Hao Li
Wei Peng
223
17
0
17 Mar 2023
CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective
Computer Vision and Pattern Recognition (CVPR), 2023
Jun Xiong
Gang Wang
Peng Zhang
Wei Huang
Yufei Zha
Guangtao Zhai
164
19
0
11 Mar 2023
TQ-Net: Mixed Contrastive Representation Learning For Heterogeneous Test Questions
He Zhu
Xihua Li
Xuemin Zhao
Yunbo Cao
Shan Yu
154
0
0
09 Mar 2023
Improving Video Retrieval by Adaptive Margin
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021
Feng He
Qi Wang
Zhifan Feng
Wenbin Jiang
Yajuan Lü
Yong Zhu
Xiao Tan
295
24
0
09 Mar 2023
Text-Visual Prompting for Efficient 2D Temporal Video Grounding
Computer Vision and Pattern Recognition (CVPR), 2023
Yimeng Zhang
Xin Chen
Jinghan Jia
Sijia Liu
Ke Ding
274
31
0
09 Mar 2023
Continuity-Aware Latent Interframe Information Mining for Reliable UAV Tracking
IEEE International Conference on Robotics and Automation (ICRA), 2023
Changhong Fu
Mutian Cai
Sihang Li
Kunhan Lu
Haobo Zuo
Chongjun Liu
250
8
0
08 Mar 2023
Continuous Sign Language Recognition with Correlation Network
Computer Vision and Pattern Recognition (CVPR), 2023
Lianyu Hu
Liqing Gao
Zekang Liu
Wei Feng
SLR
363
116
0
06 Mar 2023
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition
International Conference on Learning Representations (ICLR), 2023
Junyan Wang
Zhenhong Sun
Yichen Qian
Dong Gong
Xiuyu Sun
Ming Lin
Maurice Pagnucco
Yang Song
3DPC
199
14
0
05 Mar 2023
Temporal Coherent Test-Time Optimization for Robust Video Classification
International Conference on Learning Representations (ICLR), 2023
Chenyu Yi
Siyuan Yang
Yufei Wang
Haoliang Li
Yap-Peng Tan
Alex C. Kot
TTA
218
16
0
28 Feb 2023
Contrastive Video Question Answering via Video Graph Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Junbin Xiao
Pan Zhou
Angela Yao
Yicong Li
Richang Hong
Shuicheng Yan
Tat-Seng Chua
ViT
252
52
0
27 Feb 2023
Deep Learning for Video-Text Retrieval: a Review
International Journal of Multimedia Information Retrieval (IJMIR), 2023
Cunjuan Zhu
Qi Jia
Wei Chen
Yanming Guo
Yu Liu
230
31
0
24 Feb 2023
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
AAAI Conference on Artificial Intelligence (AAAI), 2023
Weihong Zhong
Mao Zheng
Duyu Tang
Xuan Luo
Heng Gong
Xiaocheng Feng
Bing Qin
390
9
0
20 Feb 2023
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer
Scientific Reports (Sci Rep), 2023
N. H. Phong
B. Ribeiro
283
22
0
17 Feb 2023
CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection
C. Nwoye
Tong Yu
Saurav Sharma
Aditya Murali
Deepak Alapatt
...
Pietro Mascagni
B. Seeliger
Cristians Gonzalez
Didier Mutter
N. Padoy
256
36
0
13 Feb 2023
Efficient End-to-End Video Question Answering with Pyramidal Multimodal Transformer
AAAI Conference on Artificial Intelligence (AAAI), 2023
Min Peng
Chongyang Wang
Yu Shi
Xiang-Dong Zhou
ViT
246
12
0
04 Feb 2023
Learning Large-scale Neural Fields via Context Pruned Meta-Learning
Neural Information Processing Systems (NeurIPS), 2023
Jihoon Tack
Subin Kim
Sihyun Yu
Jaeho Lee
Jinwoo Shin
Jonathan Richard Schwarz
311
14
0
01 Feb 2023
Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yizhen Chen
Jie Wang
Lijian Lin
Chen Ma
Jin Ma
Ying Shan
VLM
257
34
0
30 Jan 2023
Semi-Parametric Video-Grounded Text Generation
Sungdong Kim
Jin-Hwa Kim
Jiyoung Lee
Minjoon Seo
VGen
250
17
0
27 Jan 2023
Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring
Computer Vision and Pattern Recognition (CVPR), 2023
Ruyang Liu
Jingjia Huang
Ge Li
Jiashi Feng
Xing Wu
Thomas H. Li
AI4TS
CLIP
VLM
262
75
0
26 Jan 2023
Gated-ViGAT: Efficient Bottom-Up Event Recognition and Explanation Using a New Frame Selection Policy and Gating Mechanism
IEEE International Symposium on Multimedia (ISM), 2022
Nikolaos Gkalelis
Dimitrios Daskalakis
Vasileios Mezaris
155
5
0
18 Jan 2023