A Closer Look at Spatiotemporal Convolutions for Action Recognition

30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri

Papers citing "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

50 / 1,270 papers shown
Video Understanding as Machine Translation
Bruno Korbar
Fabio Petroni
Rohit Girdhar
Lorenzo Torresani
SSL
20
29
0
12 Jun 2020
Telling Left from Right: Learning Spatial Correspondence of Sight and Sound
Karren D. Yang
Bryan C. Russell
Justin Salamon
SSL
24
75
0
11 Jun 2020
PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
Yuecong Xu
Haozhi Cao
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
18
5
0
09 Jun 2020
ARID: A New Dataset for Recognizing Action in the Dark
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
27
71
0
06 Jun 2020
Temporal Aggregate Representations for Long-Range Video Understanding
Fadime Sener
Dipika Singhania
Angela Yao
AI4TS
30
7
0
01 Jun 2020
In the Eye of the Beholder: Gaze and Actions in First Person Video
Yin Li
Miao Liu
James M. Rehg
EgoV
30
69
0
31 May 2020
DJEnsemble: On the Selection of a Disjoint Ensemble of Deep Learning Black-Box Spatio-Temporal Models
Y. M. Souto
R. S. Pereira
Rocío Zorrilla
A. Silva
Brian Tsan
Florin Rusu
Eduardo S. Ogasawara
A. Ziviani
Fábio Porto
11
1
0
22 May 2020
Deep learning with 4D spatio-temporal data representations for OCT-based force estimation
N. Gessert
M. Bengs
M. Schlüter
Alexander Schlaefer
23
23
0
20 May 2020
Preterm infants' pose estimation with spatio-temporal features
S. Moccia
Lucia Migliorelli
V. Carnielli
Emanuele Frontoni
3DH
25
44
0
08 May 2020
Exploiting Inter-Frame Regional Correlation for Efficient Action Recognition
Yuecong Xu
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
8
11
0
06 May 2020
Adaptive Interaction Modeling via Graph Operations Search
Haoxin Li
Weishi Zheng
Yu Tao
Haifeng Hu
Jianhuang Lai
26
5
0
05 May 2020
Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video
Antonino Furnari
G. Farinella
EgoV
24
139
0
04 May 2020
Towards Visually Explaining Video Understanding Networks with Perturbation
Zhenqiang Li
Weimin Wang
Zuoyue Li
Yifei Huang
Yoichi Sato
FAtt
25
3
0
01 May 2020
Audio-Visual Instance Discrimination with Cross-Modal Agreement
Pedro Morgado
Nuno Vasconcelos
Ishan Misra
SSL
33
270
0
27 Apr 2020
Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos
Mamshad Nayeem Rizve
Ugur Demir
Praveen Tirupattur
A. J. Rana
Kevin Duarte
Ishan R. Dave
Yogesh S Rawat
M. Shah
13
19
0
23 Apr 2020
MER-GCN: Micro Expression Recognition Based on Relation Modeling with Graph Convolutional Network
Ling Lo
Hongxia Xie
Hong-Han Shuai
Wen-Huang Cheng
6
71
0
19 Apr 2020
Knowledge Distillation for Action Anticipation via Label Smoothing
Guglielmo Camporese
Pasquale Coscia
Antonino Furnari
G. Farinella
Lamberto Ballan
EgoV
40
36
0
16 Apr 2020
Asynchronous Interaction Aggregation for Action Detection
Jiajun Tang
Jinchao Xia
Xinzhi Mu
Bo Pang
Cewu Lu
36
119
0
16 Apr 2020
FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding
Dian Shao
Yue Zhao
Bo Dai
Dahua Lin
9
321
0
14 Apr 2020
ASL Recognition with Metric-Learning based Lightweight Network
Evgeny Izutov
SLR
16
6
0
10 Apr 2020
Spatiotemporal Fusion in 3D CNNs: A Probabilistic View
Yizhou Zhou
Xiaoyan Sun
Chong Luo
Zhengjun Zha
Wenjun Zeng
3DPC
19
20
0
10 Apr 2020
Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?
Hirokatsu Kataoka
Tenga Wakamiya
Kensho Hara
Y. Satoh
3DPC
31
87
0
10 Apr 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
75
1,001
0
09 Apr 2020
Temporal Pyramid Network for Action Recognition
Ceyuan Yang
Yinghao Xu
Jianping Shi
Bo Dai
Bolei Zhou
20
367
0
07 Apr 2020
When, Where, and What? A New Dataset for Anomaly Detection in Driving Videos
Yu Yao
Xizi Wang
Mingze Xu
Zelin Pu
E. Atkins
David J. Crandall
29
44
0
06 Apr 2020
Two-Stream AMTnet for Action Detection
Suman Saha
Gurkirt Singh
Fabio Cuzzolin
ViT
17
13
0
03 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Yan-Ran Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
37
439
0
03 Apr 2020
BosphorusSign22k Sign Language Recognition Dataset
Ogulcan Özdemir
A. Kındıroglu
Necati Cihan Camgöz
L. Akarun
15
38
0
02 Apr 2020
Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention
Juan-Manuel Perez-Rua
Brais Martínez
Xiatian Zhu
Antoine Toisoul
Victor Escorcia
Tao Xiang
48
19
0
02 Apr 2020
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
Ziyu Liu
Hongwen Zhang
Zhenghao Chen
Zhiyong Wang
Wanli Ouyang
32
817
0
31 Mar 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
33
54
0
30 Mar 2020
Omni-sourced Webly-supervised Learning for Video Recognition
Haodong Duan
Yue Zhao
Yuanjun Xiong
Wentao Liu
Dahua Lin
VLM
23
88
0
29 Mar 2020
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks
Qihang Yu
Yingwei Li
Jieru Mei
Yuyin Zhou
Alan Yuille
3DPC
28
3
0
28 Mar 2020
Coronary Artery Segmentation in Angiographic Videos Using A 3D-2D CE-Net
Lu Wang
Dongxue Liang
Xiao-Lei Yin
Jing Qiu
Zhi-Yun Yang
Jun-Hui Xing
Jian-Zeng Dong
Zhao-Yuan Ma
MedIm
21
0
0
26 Mar 2020
Temporally Coherent Embeddings for Self-Supervised Video Representation Learning
Joshua Knights
Ben Harwood
Daniel Ward
Anthony Vanderkop
Olivia Mackenzie-Ross
Peyman Moghadam
AI4TS
20
38
0
21 Mar 2020
STH: Spatio-Temporal Hybrid Convolution for Efficient Action Recognition
Xu Li
Jingwen Wang
Lin Ma
Kaihao Zhang
Fengzong Lian
Zhanhui Kang
Jinjun Wang
20
5
0
18 Mar 2020
MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps
Pengxiang Wu
Siheng Chen
Dimitris N. Metaxas
3DPC
14
154
0
15 Mar 2020
On Compositions of Transformations in Contrastive Self-Supervised Learning
Mandela Patrick
Yuki M. Asano
Polina Kuznetsova
Ruth C. Fong
João F. Henriques
Geoffrey Zweig
Andrea Vedaldi
23
49
0
09 Mar 2020
Self-Supervised Visual Learning by Variable Playback Speeds Prediction of a Video
Hyeon Cho
Taehoon Kim
H. Chang
Wonjun Hwang
20
19
0
05 Mar 2020
TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for Real-time Video Facial Expression Recognition
J. Lee
A. Wong
CVBM
16
14
0
03 Mar 2020
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
VLM
137
127
0
03 Mar 2020
VideoSSL: Semi-Supervised Learning for Video Classification
Longlong Jing
T. Parag
Zhe Wu
Yingli Tian
Hongcheng Wang
24
50
0
29 Feb 2020
Infrared and 3D skeleton feature fusion for RGB-D action recognition
Alban Main De Boissiere
R. Noumeir
25
38
0
28 Feb 2020
Evolving Losses for Unsupervised Video Representation Learning
A. Piergiovanni
A. Angelova
Michael S. Ryoo
SSL
27
138
0
26 Feb 2020
Hierarchical Conditional Relation Networks for Video Question Answering
T. Le
Vuong Le
Svetha Venkatesh
T. Tran
14
258
0
25 Feb 2020
Bottom-Up Temporal Action Localization with Mutual Regularization
Peisen Zhao
Lingxi Xie
Chen Ju
Ya Zhang
Yanfeng Wang
Qi Tian
12
1
0
18 Feb 2020
A Survey on 3D Skeleton-Based Action Recognition Using Learning Method
Bin Ren
Mengyuan Liu
Runwei Ding
Hong Liu
27
121
0
14 Feb 2020
Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks
Roi Pony
I. Naeh
Shie Mannor
AAML
21
51
0
12 Feb 2020
Two-Stream Aural-Visual Affect Analysis in the Wild
Felix Kuhnke
Lars Rumberg
Jörn Ostermann
CVBM
53
77
0
09 Feb 2020
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Yi Yang
Shilei Wen
24
35
0
09 Feb 2020