A Closer Look at Spatiotemporal Convolutions for Action Recognition

30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
arXiv: 1711.11248

Papers citing "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

50 / 1,270 papers shown
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
24
83
0
19 Apr 2021
What can human minimal videos tell us about dynamic recognition models?
Guy Ben-Yosef
Gabriel Kreiman
S. Ullman
24
2
0
19 Apr 2021
Writing in The Air: Unconstrained Text Recognition from Finger Movement Using Spatio-Temporal Convolution
Ue-Hwan Kim
Yewon Hwang
Sun-Kyung Lee
Jong-Hwan Kim
33
19
0
19 Apr 2021
Higher Order Recurrent Space-Time Transformer for Video Action Prediction
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Oswald Lanz
41
9
0
17 Apr 2021
TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval
Ioana Croitoru
Simion-Vlad Bogolin
Marius Leordeanu
Hailin Jin
Andrew Zisserman
Samuel Albanie
Yang Liu
VGen
21
124
0
16 Apr 2021
Adaptive Intermediate Representations for Video Understanding
Juhana Kangaspunta
A. Piergiovanni
Rico Jonschkowski
Michael S. Ryoo
A. Angelova
26
3
0
14 Apr 2021
Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts
Silvio Giancola
Guohao Li
33
45
0
14 Apr 2021
ADNet: Temporal Anomaly Detection in Surveillance Videos
H. Öztürk
Ahmet Burak Can
27
15
0
14 Apr 2021
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation
Weiyao Wang
Matt Feiszli
Heng Wang
Du Tran
VOS
15
123
0
10 Apr 2021
Video-aided Unsupervised Grammar Induction
Songyang Zhang
Linfeng Song
Lifeng Jin
Kun Xu
Dong Yu
Jiebo Luo
24
26
0
09 Apr 2021
Progressive Temporal Feature Alignment Network for Video Inpainting
Xueyan Zou
Linjie Yang
Ding Liu
Yong Jae Lee
19
56
0
08 Apr 2021
ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization
Sanqing Qu
Guang Chen
Zhijun Li
Lijun Zhang
Fan Lu
Alois C. Knoll
17
54
0
07 Apr 2021
Zeus: Efficiently Localizing Actions in Videos using Reinforcement Learning
Pramod Chunduri
J. Bang
Yao Lu
Joy Arulraj
24
11
0
06 Apr 2021
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo
Y. Choi
Young-Woon Lee
Byung-Gyu Kim
ViT
17
55
0
03 Apr 2021
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
Xitong Yang
Haoqi Fan
Lorenzo Torresani
L. Davis
Heng Wang
VLM
27
20
0
02 Apr 2021
Self-supervised Video Representation Learning by Context and Motion Decoupling
Lianghua Huang
Yu Liu
Bin Wang
Pan Pan
Yinghui Xu
Rong Jin
SSL
49
51
0
02 Apr 2021
Multiview Pseudo-Labeling for Semi-supervised Learning from Video
Bo Xiong
Haoqi Fan
Kristen Grauman
Christoph Feichtenhofer
SSL
24
49
0
01 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
57
1,134
0
01 Apr 2021
Motion Guided Attention Fusion to Recognize Interactions from Videos
Tae Soo Kim
Jonathan D. Jones
Gregory Hager
24
15
0
01 Apr 2021
Self-supervised Motion Learning from Static Images
Ziyuan Huang
Shiwei Zhang
Jianwen Jiang
Mingqian Tang
Rong Jin
M. Ang
SSL
26
29
0
01 Apr 2021
Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling
Sian Jin
Jesus Pulido
Pascal Grosset
Jiannan Tian
Dingwen Tao
J. Ahrens
33
22
0
01 Apr 2021
Learning by Aligning Videos in Time
S. Haresh
Sateesh Kumar
Huseyin Coskun
S. N. Syed
Andrey Konin
M. Zia
Quoc-Huy Tran
AI4TS
29
64
0
31 Mar 2021
Learning Representational Invariances for Data-Efficient Action Recognition
Yuliang Zou
Jinwoo Choi
Qitong Wang
Jia-Bin Huang
22
39
0
30 Mar 2021
Broaden Your Views for Self-Supervised Video Learning
Adrià Recasens
Pauline Luc
Jean-Baptiste Alayrac
Luyu Wang
Ross Hemsley
...
Florent Altché
M. Valko
Jean-Bastien Grill
Aaron van den Oord
Andrew Zisserman
SSL
AI4TS
33
127
0
30 Mar 2021
Recognizing Actions in Videos from Unseen Viewpoints
A. Piergiovanni
Michael S. Ryoo
27
25
0
30 Mar 2021
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation
Shuning Chang
Pichao Wang
F. Wang
Hao Li
Jiashi Feng
ViT
50
41
0
30 Mar 2021
Robust Audio-Visual Instance Discrimination
Pedro Morgado
Ishan Misra
Nuno Vasconcelos
SSL
22
110
0
29 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
30
2,098
0
29 Mar 2021
Unified Graph Structured Models for Video Understanding
Anurag Arnab
Chen Sun
Cordelia Schmid
38
44
0
29 Mar 2021
Graph-based Facial Affect Analysis: A Review
Yang Liu
Xingming Zhang
Yante Li
Jinzhao Zhou
Xin-hui Li
Guoying Zhao
CVBM
51
24
0
29 Mar 2021
Busy-Quiet Video Disentangling for Video Classification
Guoxi Huang
A. Bors
28
6
0
29 Mar 2021
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization
Mengmeng Xu
Juan-Manuel Perez-Rua
Xiatian Zhu
Guohao Li
Brais Martinez
17
27
0
28 Mar 2021
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
13
17
0
27 Mar 2021
An Image is Worth 16x16 Words, What is a Video Worth?
Gilad Sharir
Asaf Noy
Lihi Zelnik-Manor
ViT
27
121
0
25 Mar 2021
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Zhiwu Qing
Haisheng Su
Weihao Gan
Dongliang Wang
Wei Wu
Xiang Wang
Yu Qiao
Junjie Yan
Changxin Gao
Nong Sang
30
173
0
24 Mar 2021
Learning Comprehensive Motion Representation for Action Recognition
Mingyu Wu
Boyuan Jiang
Donghao Luo
Junchi Yan
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xiaokang Yang
27
12
0
23 Mar 2021
MoViNets: Mobile Video Networks for Efficient Video Recognition
Dan Kondratyuk
Liangzhe Yuan
Yandong Li
Li Zhang
Mingxing Tan
Matthew A. Brown
Boqing Gong
21
228
0
21 Mar 2021
Responsible AI: Gender bias assessment in emotion recognition
Artem Domnich
G. Anbarjafari
27
48
0
21 Mar 2021
PGT: A Progressive Method for Training Models on Long Videos
Bo Pang
Gao Peng
Yizhuo Li
Cewu Lu
VLM
27
12
0
21 Mar 2021
Efficient Spatialtemporal Context Modeling for Action Recognition
Congqi Cao
Yue Lu
Yifan Zhang
Dengyang Jiang
Yanning Zhang
29
4
0
20 Mar 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
24
128
0
19 Mar 2021
CLTA: Contents and Length-based Temporal Attention for Few-shot Action Recognition
Yang Bo
Yangdi Lu
Wenbo He
VLM
38
0
0
18 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
31
33
0
18 Mar 2021
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models
Po-Yao (Bernie) Huang
Mandela Patrick
Junjie Hu
Graham Neubig
Florian Metze
Alexander G. Hauptmann
MLLM
VLM
26
56
0
16 Mar 2021
Skeleton Aware Multi-modal Sign Language Recognition
Songyao Jiang
Bin Sun
Lichen Wang
Yue Bai
Kunpeng Li
Y. Fu
SLR
33
167
0
16 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
39
165
0
11 Mar 2021
VIPriors 1: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Jan van Gemert
VLM
25
11
0
05 Mar 2021
Slow-Fast Auditory Streams For Audio Recognition
Evangelos Kazakos
Arsha Nagrani
Andrew Zisserman
Dima Damen
26
66
0
05 Mar 2021
Unsupervised Motion Representation Enhanced Network for Action Recognition
Xiaohang Yang
Lingtong Kong
Jie Yang
21
4
0
05 Mar 2021
Simulating time to event prediction with spatiotemporal echocardiography deep learning
R. Shad
Nicolas Quach
R. Fong
P. Kasinpila
C. Bowles
...
Michelle C. Li
J. Teuteberg
John P. Cunningham
C. Langlotz
W. Hiesinger
25
0
0
03 Mar 2021