1711.11248
A Closer Look at Spatiotemporal Convolutions for Action Recognition
30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
Papers citing
"A Closer Look at Spatiotemporal Convolutions for Action Recognition"
50 / 1,270 papers shown
Title
VADER: Video Alignment Differencing and Retrieval
Alexander Black
Simon Jenni
Tu Bui
Md. Mehrab Tanjim
Stefano Petrangeli
Ritwik Sinha
Viswanathan Swaminathan
John Collomosse
31
2
0
23 Mar 2023
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
Yiting Cheng
Fangyun Wei
Jianmin Bao
Dong Chen
Wenqian Zhang
SLR
32
28
0
22 Mar 2023
VMCML: Video and Music Matching via Cross-Modality Lifting
Yi-Shan Lee
Wei-Cheng Tseng
Fu-En Wang
Min Sun
23
0
0
22 Mar 2023
Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos
Sixun Dong
Huazhang Hu
Dongze Lian
Weixin Luo
Yichen Qian
Shenghua Gao
ViT
AI4TS
23
11
0
22 Mar 2023
Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization
Fida Mohammad Thoker
Hazel Doughty
Cees G. M. Snoek
ViT
48
9
0
20 Mar 2023
Leaping Into Memories: Space-Time Deep Feature Synthesis
Alexandros Stergiou
Nikos Deligiannis
34
0
0
17 Mar 2023
Dual-path Adaptation from Image to Video Transformers
Jungin Park
Jiyoung Lee
Kwanghoon Sohn
ViT
21
37
0
17 Mar 2023
Video Action Recognition with Attentive Semantic Units
Yifei Chen
Dapeng Chen
Ruijin Liu
Hao Li
Wei Peng
21
11
0
17 Mar 2023
Enhanced detection of the presence and severity of COVID-19 from CT scans using lung segmentation
R. Turnbull
38
2
0
16 Mar 2023
TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization
Tuan N. Tang
Kwonyoung Kim
Kwanghoon Sohn
29
29
0
16 Mar 2023
Multi-site, Multi-domain Airway Tree Modeling (ATM'22): A Public Benchmark for Pulmonary Airway Segmentation
Minghui Zhang
Yang Wu
Hanxiao Zhang
Yulei Qin
Hao Zheng
...
Raúl San José Estépar
C. Espinosa
Jiayuan Sun
Guang-Zhong Yang
Yun Gu
15
12
0
10 Mar 2023
Human Pose Estimation from Ambiguous Pressure Recordings with Spatio-temporal Masked Transformers
Vandad Davoodnia
Ali Etemad
ViT
29
6
0
10 Mar 2023
A Light Weight Model for Active Speaker Detection
Junhua Liao
Haihan Duan
Kanghui Feng
Wanbing Zhao
Yanbing Yang
Liangyin Chen
35
36
0
08 Mar 2023
VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building [Technical Report]
Maureen Daum
Enhao Zhang
Dong He
Stephen Mussmann
Brandon Haynes
Ranjay Krishna
Magdalena Balazinska
32
4
0
07 Mar 2023
Continuous Sign Language Recognition with Correlation Network
Lianyu Hu
Liqing Gao
Zekang Liu
Wei Feng
SLR
40
58
0
06 Mar 2023
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition
Junyan Wang
Zhenhong Sun
Yichen Qian
Dong Gong
Xiuyu Sun
Ming Lin
Maurice Pagnucco
Yang Song
3DPC
20
11
0
05 Mar 2023
Heterogeneous Graph Learning for Acoustic Event Classification
A. Shirian
Mona Ahmadian
Krishna Somandepalli
T. Guha
30
2
0
05 Mar 2023
The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis
Haoxu Wang
Ming Cheng
Qiang Fu
Ming Li
39
8
0
04 Mar 2023
Improving Audio-Visual Video Parsing with Pseudo Visual Labels
Jinxing Zhou
Dan Guo
Yiran Zhong
Meng Wang
VLM
39
13
0
04 Mar 2023
Texture-Based Input Feature Selection for Action Recognition
Yalong Jiang
24
0
0
28 Feb 2023
Brain subtle anomaly detection based on auto-encoders latent space analysis : application to de novo parkinson patients
Nicolas Pinon
Geoffroy Oudoumanessah
Robin Trombetta
M. Dojat
Florence Forbes
Carole Lartizien
17
6
0
27 Feb 2023
Deep Learning for Video-Text Retrieval: a Review
Cunjuan Zhu
Qi Jia
Wei Chen
Yanming Guo
Yu Liu
24
14
0
24 Feb 2023
Boosting Adversarial Transferability using Dynamic Cues
Muzammal Naseer
Ahmad A Mahmood
Salman Khan
Fahad Shahbaz Khan
AAML
28
5
0
23 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring
Zhihao Chen
Chuang Niu
Qi Gao
Ge Wang
Hongming Shan
MedIm
ViT
3DV
36
20
0
21 Feb 2023
Audio-Visual Contrastive Learning with Temporal Self-Supervision
Simon Jenni
Alexander Black
John Collomosse
SSL
31
15
0
15 Feb 2023
Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks
Weihang Dai
Xuelong Li
Kwang-Ting Cheng
BDL
UQCV
27
14
0
15 Feb 2023
Adjacent-Level Feature Cross-Fusion With 3-D CNN for Remote Sensing Image Change Detection
Y. Ye
Mengmeng Wang
Liang Zhou
Guangyang Lei
Jianwei Fan
Yao Qin
3DPC
27
37
0
10 Feb 2023
Reversible Vision Transformers
K. Mangalam
Haoqi Fan
Yanghao Li
Chaoxiong Wu
Bo Xiong
Christoph Feichtenhofer
Jitendra Malik
ViT
11
45
0
09 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
Chong Chen
Mu Li
ViT
58
144
0
06 Feb 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
42
2
0
26 Jan 2023
CNN-Based Action Recognition and Pose Estimation for Classifying Animal Behavior from Videos: A Survey
Michael Perez
Corey Toler-Franklin
MedIm
36
14
0
15 Jan 2023
Deep Diversity-Enhanced Feature Representation of Hyperspectral Images
Jinhui Hou
Zhiyu Zhu
Junhui Hou
Hui Liu
Huanqiang Zeng
Deyu Meng
18
6
0
15 Jan 2023
Triple-stream Deep Metric Learning of Great Ape Behavioural Actions
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
33
14
0
06 Jan 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
Shuhan Tan
Tushar Nagarajan
Kristen Grauman
26
21
0
05 Jan 2023
Test of Time: Instilling Video-Language Models with a Sense of Time
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
86
36
0
05 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang
Mitesh Singh
Lorenzo Torresani
EgoV
72
24
0
03 Jan 2023
Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition
Hasan Hammoud
Shuming Liu
Mohammad Alkhrashi
Fahad Albalawi
Guohao Li
AAML
34
8
0
03 Jan 2023
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Wenhao Wu
Xiaohan Wang
Haipeng Luo
Jingdong Wang
Yi Yang
Wanli Ouyang
106
48
0
31 Dec 2022
Transformers in Action Recognition: A Review on Temporal Modeling
Elham Shabaninia
Hossein Nezamabadi-pour
Fatemeh Shafizadegan
ViT
29
8
0
29 Dec 2022
StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition
Xi Shen
Zhedong Zheng
Yi Yang
SLR
35
13
0
25 Dec 2022
Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning
J. Denize
Jaonary Rabarisoa
Astrid Orcesi
Romain Hérault
SSL
19
6
0
21 Dec 2022
Deep set conditioned latent representations for action recognition
Akash Singh
Tom De Schepper
Kevin Mets
P. Hellinckx
José Oramas
Steven Latré
BDL
19
2
0
21 Dec 2022
A Survey on Human Action Recognition
Zhou Shuchang
29
0
0
20 Dec 2022
Ring That Bell: A Corpus and Method for Multimodal Metaphor Detection in Videos
Khalid Alnajjar
Mika Hämäläinen
Shuo Zhang
32
7
0
15 Dec 2022
Reconstructing Humpty Dumpty: Multi-feature Graph Autoencoder for Open Set Action Recognition
Dawei Du
Ameya Shringi
A. Hoogs
Christopher Funk
21
2
0
12 Dec 2022
Cross-Modal Learning with 3D Deformable Attention for Action Recognition
Sangwon Kim
Dasom Ahn
ByoungChul Ko
ViT
3DPC
38
24
0
12 Dec 2022
Deep Architectures for Content Moderation and Movie Content Rating
Fatih Çagatay Akyön
A. Temizel
38
4
0
08 Dec 2022
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Lu Yuan
Yu-Gang Jiang
VGen
32
87
0
08 Dec 2022
RainUNet for Super-Resolution Rain Movie Prediction under Spatio-temporal Shifts
Jinyoung Park
Minseok Son
Seungju Cho
Inyoung Lee
Changick Kim
8
3
0
07 Dec 2022
Multimodal Vision Transformers with Forced Attention for Behavior Analysis
Tanay Agrawal
Michal Balazia
Philippe Muller
François Brémond
ViT
23
9
0
07 Dec 2022