arXiv: 1712.04851
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

13 December 2017
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
    3DH

Papers citing "Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"

50 / 675 papers shown
RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices
AAAI Conference on Artificial Intelligence (AAAI), 2020
Wei Niu
Mengshu Sun
Hao Sun
Jou-An Chen
Jiexiong Guan
Xipeng Shen
Yanzhi Wang
Sijia Liu
Xue Lin
Bin Ren
MQ
177
13
0
20 Jul 2020
Region-based Non-local Operation for Video Classification
International Conference on Pattern Recognition (ICPR), 2020
Guoxi Huang
A. Bors
419
12
0
17 Jul 2020
Temporal Distinct Representation Learning for Action Recognition
European Conference on Computer Vision (ECCV), 2020
Junwu Weng
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xudong Jiang
Junsong Yuan
126
28
0
15 Jul 2020
Alleviating Over-segmentation Errors by Detecting Action Boundaries
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Yuchi Ishikawa
Seito Kasai
Y. Aoki
Hirokatsu Kataoka
169
170
0
14 Jul 2020
IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos
Gyeongsik Moon
Heeseung Kwon
Kyoung Mu Lee
Minsu Cho
157
30
0
13 Jul 2020
Universal-to-Specific Framework for Complex Action Recognition
IEEE Transactions on Multimedia (TMM), 2020
Peisen Zhao
Lingxi Xie
Ya Zhang
Qi Tian
172
9
0
13 Jul 2020
Aligning Videos in Space and Time
European Conference on Computer Vision (ECCV), 2020
Senthil Purushwalkam
Tian-Chun Ye
Saurabh Gupta
Abhinav Gupta
158
24
0
09 Jul 2020
Group Ensemble: Learning an Ensemble of ConvNets in a single ConvNet
Hao Chen
Abhinav Shrivastava
205
17
0
01 Jul 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
389
400
0
29 Jun 2020
Dynamic Sampling Networks for Efficient Action Recognition in Videos
Yin-Dong Zheng
Zhaoyang Liu
Tong Lu
Limin Wang
138
83
0
28 Jun 2020
Counting Out Time: Class Agnostic Video Repetition Counting in the Wild
Debidatta Dwibedi
Y. Aytar
Jonathan Tompson
P. Sermanet
Andrew Zisserman
AI4TS
191
127
0
27 Jun 2020
Motion Representation Using Residual Frames with 3D CNN
Li Tao
Xueting Wang
T. Yamasaki
3DPC
132
2
0
21 Jun 2020
Melanoma Diagnosis with Spatio-Temporal Feature Learning on Sequential Dermoscopic Images
Zhen Yu
Jennifer Nguyen
Xiaojun Chang
J. Kelly
C. Mclean
Lei Zhang
Victoria Mar
Z. Ge
MedIm
109
4
0
19 Jun 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Computer Vision and Pattern Recognition (CVPR), 2020
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Jiaming Song
3DPC
303
171
0
14 Jun 2020
DTG-Net: Differentiated Teachers Guided Self-Supervised Video Action Recognition
Ziming Liu
Guangyu Gao
A. K. Qin
Jinyang Li
ViT
170
1
0
13 Jun 2020
Open-Narrow-Synechiae Anterior Chamber Angle Classification in AS-OCT Sequences
Huaying Hao
Huazhu Fu
Yanwu Xu
Jianlong Yang
Fei Li
Xiulan Zhang
Jiang-Dong Liu
Yitian Zhao
351
8
0
09 Jun 2020
PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
Yuecong Xu
Haozhi Cao
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
157
5
0
09 Jun 2020
ARID: A New Dataset for Recognizing Action in the Dark
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
153
82
0
06 Jun 2020
In the Eye of the Beholder: Gaze and Actions in First Person Video
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Yin Li
Miao Liu
James M. Rehg
EgoV
330
92
0
31 May 2020
Which scaling rule applies to Artificial Neural Networks
János Végh
629
12
0
15 May 2020
Adaptive Interaction Modeling via Graph Operations Search
Computer Vision and Pattern Recognition (CVPR), 2020
Haoxin Li
Weishi Zheng
Yu Tao
Haifeng Hu
Jianhuang Lai
145
6
0
05 May 2020
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Jack Hessel
Zhenhai Zhu
Bo Pang
Radu Soricut
222
4
0
29 Apr 2020
Skeleton Focused Human Activity Recognition in RGB Video
Bruce X. B. Yu
Yan Liu
Keith C. C. Chan
201
5
0
29 Apr 2020
SpeedNet: Learning the Speediness in Videos
Computer Vision and Pattern Recognition (CVPR), 2020
Sagie Benaim
Ariel Ephrat
Oran Lang
Inbar Mosseri
William T. Freeman
Michael Rubinstein
Michal Irani
Tali Dekel
236
277
0
13 Apr 2020
Spatiotemporal Fusion in 3D CNNs: A Probabilistic View
Computer Vision and Pattern Recognition (CVPR), 2020
Yizhou Zhou
Xiaoyan Sun
Chong Luo
Zhengjun Zha
Wenjun Zeng
3DPC
163
23
0
10 Apr 2020
Temporal Pyramid Network for Action Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Ceyuan Yang
Yinghao Xu
Jianping Shi
Bo Dai
Bolei Zhou
209
383
0
07 Apr 2020
Two-Stream AMTnet for Action Detection
Suman Saha
Gurkirt Singh
Fabio Cuzzolin
ViT
143
13
0
03 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Yan-Ran Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
307
534
0
03 Apr 2020
RetinaTrack: Online Single Stage Joint Detection and Tracking
Computer Vision and Pattern Recognition (CVPR), 2020
Zhichao Lu
V. Rathod
Ronny Votel
Jonathan Huang
VOT
244
212
0
30 Mar 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
153
59
0
30 Mar 2020
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks
AAAI Conference on Artificial Intelligence (AAAI), 2020
Qihang Yu
Yingwei Li
Jieru Mei
Yuyin Zhou
Alan Yuille
3DPC
204
3
0
28 Mar 2020
STH: Spatio-Temporal Hybrid Convolution for Efficient Action Recognition
Xu Li
Jingwen Wang
Lin Ma
Kaihao Zhang
Fengzong Lian
Zhanhui Kang
Jinjun Wang
129
5
0
18 Mar 2020
GraphTCN: Spatio-Temporal Interaction Modeling for Human Trajectory Prediction
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Chengxin Wang
Shaofeng Cai
Gary S. H. Tan
GNN
319
67
0
16 Mar 2020
MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps
Computer Vision and Pattern Recognition (CVPR), 2020
Pengxiang Wu
Siheng Chen
Dimitris N. Metaxas
3DPC
220
178
0
15 Mar 2020
On Compositions of Transformations in Contrastive Self-Supervised Learning
IEEE International Conference on Computer Vision (ICCV), 2020
Mandela Patrick
Yuki M. Asano
Polina Kuznetsova
Ruth C. Fong
João F. Henriques
Geoffrey Zweig
Andrea Vedaldi
228
53
0
09 Mar 2020
Omni-Scale CNNs: a simple and effective kernel size configuration for time series classification
International Conference on Learning Representations (ICLR), 2020
Wensi Tang
Guodong Long
Lu Liu
Wanrong Zhu
Michael Blumenstein
Jing Jiang
AI4TS
459
145
0
24 Feb 2020
Fine-Grained Instance-Level Sketch-Based Video Retrieval
Peng Xu
Kun Liu
Tao Xiang
Timothy M. Hospedales
Zhanyu Ma
Jun Guo
Yi-Zhe Song
238
37
0
21 Feb 2020
Knowledge Integration Networks for Action Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2020
Shiwen Zhang
Sheng Guo
Limin Wang
Weilin Huang
Matthew R. Scott
198
20
0
18 Feb 2020
V4D:4D Convolutional Neural Networks for Video-level Representation Learning
International Conference on Learning Representations (ICLR), 2020
Shiwen Zhang
Sheng Guo
Weilin Huang
Matthew R. Scott
Limin Wang
130
78
0
18 Feb 2020
Bottom-Up Temporal Action Localization with Mutual Regularization
Peisen Zhao
Lingxi Xie
Chen Ju
Ya Zhang
Yanfeng Wang
Qi Tian
161
1
0
18 Feb 2020
Dual-Attention GAN for Large-Pose Face Frontalization
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2020
Yu Yin
Songyao Jiang
Joseph P. Robinson
Y. Fu
CVBM
194
63
0
17 Feb 2020
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao Luo
Lei Ji
Ding Wang
Haoyang Huang
Nan Duan
Tianrui Li
Jason Li
Xilin Chen
Ming Zhou
VLM
365
418
0
15 Feb 2020
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Yi Yang
Shilei Wen
158
37
0
09 Feb 2020
CTM: Collaborative Temporal Modeling for Action Recognition
Li-Yu Daisy Liu
Tao Wang
Jie Liu
Yang Guan
Qi Bu
Longfei Yang
TTA
93
0
0
08 Feb 2020
Symbiotic Attention with Privileged Information for Egocentric Action Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2020
Xiaohan Wang
Yu Wu
Linchao Zhu
Yi Yang
181
65
0
08 Feb 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
591
230
0
23 Jan 2020
MixTConv: Mixed Temporal Convolutional Kernels for Efficient Action Recogntion
Kaiyu Shan
Yongtao Wang
Zhuoying Wang
Tingting Liang
Zhi Tang
Ying-Cong Chen
Yangyan Li
AI4TS
136
4
0
19 Jan 2020
Temporal Interlacing Network
AAAI Conference on Artificial Intelligence (AAAI), 2020
Hao Shao
Shengju Qian
Yu Liu
230
108
0
17 Jan 2020
Rethinking Motion Representation: Residual Frames with 3D ConvNets for Better Action Recognition
IEEE Transactions on Image Processing (TIP), 2020
Li Tao
Xueting Wang
T. Yamasaki
3DPC
136
26
0
16 Jan 2020
Spatial-Spectral Residual Network for Hyperspectral Image Super-Resolution
Qi Wang
Qiang Li
Xuelong Li
SupR
151
34
0
14 Jan 2020