A Closer Look at Spatiotemporal Convolutions for Action Recognition

30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri

Papers citing "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

50 / 1,270 papers shown
Coarse-Fine Networks for Temporal Activity Detection in Videos
Kumara Kahatapitiya
Michael S. Ryoo
AI4TS
55
38
0
01 Mar 2021
ACDnet: An action detection network for real-time edge computing based on flow-guided feature approximation and memory aggregation
Yu Liu
Fan Yang
D. Ginhac
32
13
0
26 Feb 2021
Phase Space Reconstruction Network for Lane Intrusion Action Recognition
Ruiwen Zhang
Zhidong Deng
Hongsen Lin
Hongchao Lu
30
0
0
22 Feb 2021
Improving Action Quality Assessment using Weighted Aggregation
Shafkat Farabi
H. Himel
Fakhruddin Gazzali
Md. Bakhtiar Hasan
M. H. Kabir
M. Farazi
16
8
0
21 Feb 2021
DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results
Liming Jiang
Z. Guo
Wayne Wu
Zhaoyang Liu
Ziwei Liu
...
Feiyue Huang
Liujuan Cao
Rongrong Ji
Changlei Lu
Ganchao Tan
CVBM
35
11
0
18 Feb 2021
Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
29
15
0
16 Feb 2021
VA-RED$^2$: Video Adaptive Redundancy Reduction
Bowen Pan
Yikang Shen
Camilo Luciano Fosco
Chung-Ching Lin
A. Andonian
Yue Meng
Kate Saenko
A. Oliva
Rogerio Feris
20
19
0
15 Feb 2021
Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
TTA
27
39
0
14 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
46
648
0
11 Feb 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
Yue Meng
Yikang Shen
Chung-Ching Lin
P. Sattigeri
Leonid Karlinsky
Kate Saenko
A. Oliva
Rogerio Feris
73
62
0
10 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
283
1,992
0
09 Feb 2021
U-LanD: Uncertainty-Driven Video Landmark Detection
Mohammad Jafari
C. Luong
Michael Y. Tsang
A. Gu
N. V. Woudenberg
R. Rohling
T. Tsang
Purang Abolmaesumi
42
12
0
02 Feb 2021
GCF-Net: Gated Clip Fusion Network for Video Action Recognition
Jenhao Hsiao
Jiawei Chen
C. Ho
15
5
0
02 Feb 2021
VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
Xudong Lin
Gedas Bertasius
Jue Wang
Shih-Fu Chang
Devi Parikh
Lorenzo Torresani
VGen
33
66
0
28 Jan 2021
Evolutionary Multi-objective Architecture Search Framework: Application to COVID-19 3D CT Classification
Xinfu He
Guohao Ying
Jiyong Zhang
Xiaowen Chu
3DPC
MedIm
18
8
0
26 Jan 2021
Generic Event Boundary Detection: A Benchmark for Event Segmentation
Mike Zheng Shou
Stan Weixian Lei
Weiyao Wang
Deepti Ghadiyaram
Matt Feiszli
VOS
93
76
0
26 Jan 2021
3D U-Net for segmentation of COVID-19 associated pulmonary infiltrates using transfer learning: State-of-the-art results on affordable hardware
Keno K. Bressem
S. Niehues
B. Hamm
Marcus R. Makowski
J. Vahldiek
Lisa Christine Adams
19
10
0
25 Jan 2021
Bridging the gap between Human Action Recognition and Online Action Detection
Alban Main De Boissiere
R. Noumeir
22
0
0
21 Jan 2021
TCLR: Temporal Contrastive Learning for Video Representation
I. Dave
Rohit Gupta
Mamshad Nayeem Rizve
Mubarak Shah
SSL
AI4TS
36
175
0
20 Jan 2021
Coarse Temporal Attention Network (CTA-Net) for Driver's Activity Recognition
Zachary Wharton
Ardhendu Behera
Yonghuai Liu
Nikolaos Bessis
39
35
0
17 Jan 2021
Exploration of Visual Features and their weighted-additive fusion for Video Captioning
V. PraveenS.
Akhilesh Bharadwaj
Harsh Raj
Janhavi Dadhania
Ganesh Samarth C.A
Nikhil Pareek
S. M. I. S. R. Mahadeva Prasanna
35
1
0
14 Jan 2021
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
Xin He
Shihao Wang
Xiaowen Chu
S. Shi
J. Tang
Xin Liu
C. Yan
Jiyong Zhang
G. Ding
3DPC
OOD
30
38
0
14 Jan 2021
Learning from Weakly-labeled Web Videos via Exploring Sub-Concepts
Kunpeng Li
Zizhao Zhang
Guanhang Wu
Xuehan Xiong
Chen-Yu Lee
Zhichao Lu
Y. Fu
Tomas Pfister
34
5
0
11 Jan 2021
Temporal Contrastive Graph Learning for Video Action Recognition and Retrieval
Yang Liu
Keze Wang
Haoyuan Lan
Liang Lin
32
16
0
04 Jan 2021
A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization
Ashraful Islam
Chengjiang Long
Richard J. Radke
30
123
0
03 Jan 2021
Refining activation downsampling with SoftPool
Alexandros Stergiou
R. Poppe
Grigorios Kalliatakis
34
159
0
02 Jan 2021
2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
Hengduo Li
Zuxuan Wu
Abhinav Shrivastava
L. Davis
27
35
0
29 Dec 2020
Tensor Representations for Action Recognition
Piotr Koniusz
Lei Wang
A. Cherian
41
69
0
28 Dec 2020
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset
Cristina Palmero
Javier Selva
Sorina Smeureanu
Julio C. S. Jacques Junior
Albert Clapés
...
Zejian Zhang
D. Gallardo-Pujol
G. Guilera
D. Leiva
Sergio Escalera
30
53
0
28 Dec 2020
Learning Inter- and Intraframe Representations for Non-Lambertian Photometric Stereo
Yanlong Cao
Binjie Ding
Zewei He
Jiangxin Yang
Jingxi Chen
Yanpeng Cao
Xin Li
35
13
0
26 Dec 2020
Spatio-temporal Multi-task Learning for Cardiac MRI Left Ventricle Quantification
Sulaiman Vesal
Mingxuan Gu
Andreas Maier
Nishant Ravikumar
25
18
0
24 Dec 2020
Human Action Recognition from Various Data Modalities: A Review
Zehua Sun
Qiuhong Ke
Hossein Rahmani
Mohammed Bennamoun
Gang Wang
Jun Liu
MU
56
504
0
22 Dec 2020
A Multi-View Dynamic Fusion Framework: How to Improve the Multimodal Brain Tumor Segmentation from Multi-Views?
Yi Ding
Wei Zheng
Guozheng Wu
Ji Geng
Mingsheng Cao
Zhiguang Qin
25
1
0
21 Dec 2020
TDN: Temporal Difference Networks for Efficient Action Recognition
Limin Wang
Zhan Tong
Bin Ji
Gangshan Wu
28
391
0
18 Dec 2020
Multi-shot Temporal Event Localization: a Benchmark
Xiaolong Liu
Yao Hu
S. Bai
Fei Ding
X. Bai
Philip Torr
51
82
0
17 Dec 2020
Smoothed Gaussian Mixture Models for Video Classification and Recommendation
Sirjan Kafle
Aman Gupta
Xue Xia
A. Sankar
Xi Chen
Di Wen
Liang Zhang
21
0
0
17 Dec 2020
Unsupervised Behaviour Analysis and Magnification (uBAM) using Deep Learning
Biagio Brattoli
Uta Büchler
Michael Dorkenwald
Philipp Reiser
Linard Filli
F. Helmchen
A. Wahl
Bjorn Ommer
22
3
0
16 Dec 2020
Temporal Graph Modeling for Skeleton-based Action Recognition
Jianan Li
Xuemei Xie
Zhifu Zhao
Yuhan Cao
Qingzhe Pan
Guangming Shi
25
11
0
16 Dec 2020
FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation
Tarun Kalluri
Deepak Pathak
Manmohan Chandraker
Du Tran
VGen
33
143
0
15 Dec 2020
GTA: Global Temporal Attention for Video Action Understanding
Bo He
Xitong Yang
Zuxuan Wu
Hao Chen
Ser-Nam Lim
Abhinav Shrivastava
ViT
33
27
0
15 Dec 2020
NUTA: Non-uniform Temporal Aggregation for Action Recognition
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Hao Chen
Joseph Tighe
ViT
16
16
0
15 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
Driving Behavior Explanation with Multi-level Fusion
H. Ben-younes
Éloi Zablocki
Patrick Pérez
Matthieu Cord
27
30
0
09 Dec 2020
CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions
Tayfun Ates
Muhammed Samil Atesoglu
Cagatay Yigit
İlker Kesen
Mert Kobaş
Erkut Erdem
Aykut Erdem
T. Goksun
Deniz Yuret
27
31
0
08 Dec 2020
Rethinking movie genre classification with fine-grained semantic clustering
Edward Fish
Jon Weinbren
Andrew Gilbert
VLM
34
7
0
04 Dec 2020
Spatial-Temporal Alignment Network for Action Recognition and Detection
Junwei Liang
Liangliang Cao
Xuehan Xiong
Ting Yu
Alexander G. Hauptmann
3DPC
18
9
0
04 Dec 2020
Top-1 CORSMAL Challenge 2020 Submission: Filling Mass Estimation Using Multi-modal Observations of Human-robot Handovers
Vladimir E. Iashin
Francesca Palermo
Gokhan Solak
Claudio Coppola
11
10
0
02 Dec 2020
Open-Ended Multi-Modal Relational Reasoning for Video Question Answering
Haozheng Luo
Ruiyang Qin
Chenwei Xu
Guo Ye
Zening Luo
56
5
0
01 Dec 2020
Diverse Temporal Aggregation and Depthwise Spatiotemporal Factorization for Efficient Video Classification
Youngwan Lee
Hyungil Kim
Kimin Yun
Jinyoung Moon
26
12
0
01 Dec 2020
Video Self-Stitching Graph Network for Temporal Action Localization
Chen Zhao
Ali K. Thabet
Guohao Li
26
138
0
30 Nov 2020