1711.09577
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
27 November 2017
Kensho Hara
Hirokatsu Kataoka
Y. Satoh
3DPC
Papers citing
"Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?"
50 / 289 papers shown
Title
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
22
22
0
16 Mar 2022
Global2Local: A Joint-Hierarchical Attention for Video Captioning
Chengpeng Dai
Fuhai Chen
Xiaoshuai Sun
Rongrong Ji
QiXiang Ye
Yongjian Wu
22
1
0
13 Mar 2022
Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding
Yidan Sun
Qin Chao
Yangfeng Ji
Boyang Albert Li
VGen
37
10
0
11 Mar 2022
Generative Cooperative Learning for Unsupervised Video Anomaly Detection
M. Zaheer
Arif Mahmood
M. H. Khan
Mattia Segu
Feng Yu
Seung-Ik Lee
AI4TS
27
130
0
08 Mar 2022
The Unsurprising Effectiveness of Pre-Trained Vision Models for Control
Simone Parisi
Aravind Rajeswaran
Senthil Purushwalkam
Abhinav Gupta
LM&Ro
34
187
0
07 Mar 2022
Motion-driven Visual Tempo Learning for Video-based Action Recognition
Yuanzhong Liu
Junsong Yuan
Zhigang Tu
27
58
0
24 Feb 2022
COMPASS: Contrastive Multimodal Pretraining for Autonomous Systems
Shuang Ma
Sai H. Vemprala
Wenshan Wang
Jayesh K. Gupta
Yale Song
Daniel J. McDuff
Ashish Kapoor
SSL
37
9
0
20 Feb 2022
Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives
David T. Hoffmann
Nadine Behrmann
Juergen Gall
Thomas Brox
M. Noroozi
41
43
0
27 Jan 2022
Autoencoding Video Latents for Adversarial Video Generation
Sai Hemanth Kasaraneni
VGen
30
3
0
18 Jan 2022
Action Keypoint Network for Efficient Video Recognition
Xu Chen
Yahong Han
Xiaohan Wang
Yifang Sun
Yi Yang
3DPC
27
6
0
17 Jan 2022
Review of Face Presentation Attack Detection Competitions
Zitong Yu
Jukka Komulainen
Z. Boulkenafet
Zahid Akhtar
AAML
CVBM
35
11
0
21 Dec 2021
Adversarial Memory Networks for Action Prediction
Zhiqiang Tao
Yue Bai
Handong Zhao
Sheng Li
Yuanyuan Kong
Y. Fu
GAN
18
2
0
18 Dec 2021
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
Junbin Xiao
Angela Yao
Zhiyuan Liu
Yicong Li
Wei Ji
Tat-Seng Chua
30
111
0
12 Dec 2021
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Keli Zhang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
32
21
0
09 Dec 2021
MASTAF: A Model-Agnostic Spatio-Temporal Attention Fusion Network for Few-shot Video Classification
Rex Liu
Huan Zhang
Hamed Pirsiavash
Xin Liu
ViT
25
11
0
08 Dec 2021
Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval
Nina Shvetsova
Brian Chen
Andrew Rouditchenko
Samuel Thomas
Brian Kingsbury
Rogerio Feris
David Harwath
James R. Glass
Hilde Kuehne
ViT
34
128
0
08 Dec 2021
ViewCLR: Learning Self-supervised Video Representation for Unseen Viewpoints
Srijan Das
Michael S. Ryoo
SSL
39
17
0
07 Dec 2021
Time-Equivariant Contrastive Video Representation Learning
Simon Jenni
Hailin Jin
SSL
AI4TS
143
60
0
07 Dec 2021
Video-Text Pre-training with Learned Regions
Rui Yan
Mike Zheng Shou
Yixiao Ge
Alex Jinpeng Wang
Xudong Lin
Guanyu Cai
Jinhui Tang
33
23
0
02 Dec 2021
CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter
Bang-ju Yang
Tong Zhang
Yuexian Zou
CLIP
25
20
0
30 Nov 2021
Hierarchical Modular Network for Video Captioning
Hanhua Ye
Guorong Li
Yuankai Qi
Shuhui Wang
Qingming Huang
Ming-Hsuan Yang
27
67
0
24 Nov 2021
Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang
Xitong Yang
Hengduo Li
Li Liu
Zuxuan Wu
Yu-Gang Jiang
ViT
21
63
0
23 Nov 2021
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
Yaya Shi
Xu Yang
Haiyang Xu
Chunfen Yuan
Bing Li
Weiming Hu
Zhengjun Zha
39
33
0
17 Nov 2021
Cascaded Multilingual Audio-Visual Learning from Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Samuel Thomas
Hilde Kuehne
...
Yikang Shen
Rogerio Feris
Brian Kingsbury
M. Picheny
James R. Glass
116
8
0
08 Nov 2021
A trained humanoid robot can perform human-like crossmodal social attention and conflict resolution
Di Fu
Fares Abawi
Hugo C. C. Carneiro
Matthias Kerzel
Ziwei Chen
Erik Strahl
Xun Liu
S. Wermter
17
6
0
02 Nov 2021
ES-ImageNet: A Million Event-Stream Classification Dataset for Spiking Neural Networks
Yihan Lin
Wei Ding
Shaohua Qiang
Lei Deng
Guoqi Li
AI4TS
30
31
0
23 Oct 2021
TEAM-Net: Multi-modal Learning for Video Action Recognition with Partial Decoding
Zhengwei Wang
Qi She
A. Smolic
21
9
0
17 Oct 2021
NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels
Mohit Sharma
Rajkumar Patra
Harshali Desai
Shruti Vyas
Yogesh S Rawat
R. Shah
VGen
NoLa
24
3
0
13 Oct 2021
TAda! Temporally-Adaptive Convolutions for Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Mingqian Tang
Ziwei Liu
M. Ang
48
49
0
12 Oct 2021
Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction
Rishubh Parihar
Gaurav Ramola
Ranajit Saha
Raviprasad Kini
Aniket Rege
S. Velusamy
33
1
0
03 Oct 2021
Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation
Jay Patravali
Gaurav Mittal
Ye Yu
Fuxin Li
Mei Chen
18
19
0
30 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Jackson Wang
Song Han
40
64
0
27 Sep 2021
Long Short View Feature Decomposition via Contrastive Video Representation Learning
Nadine Behrmann
Mohsen Fayyaz
Juergen Gall
M. Noroozi
18
36
0
23 Sep 2021
Unsupervised Abstract Reasoning for Raven's Problem Matrices
Tao Zhuo
Qian Huang
Mohan S. Kankanhalli
LRM
113
22
0
21 Sep 2021
Audio-Visual Collaborative Representation Learning for Dynamic Saliency Prediction
Hailong Ning
Bin Zhao
Zhanxuan Hu
Lang He
Ercheng Pei
32
10
0
17 Sep 2021
Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization
Tiezheng Yu
Wenliang Dai
Zihan Liu
Pascale Fung
32
73
0
06 Sep 2021
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
Jianwei Yang
Yonatan Bisk
Jianfeng Gao
27
137
0
23 Aug 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
26
77
0
20 Aug 2021
Multi-Camera Trajectory Forecasting with Trajectory Tensors
Olly Styles
T. Guha
Victor Sanchez
27
7
0
10 Aug 2021
Weakly Supervised Attention Model for RV Strain Classification from volumetric CTPA Scans
Noa Cahan
E. Marom
S. Soffer
Y. Barash
Eli Konen
Eyal Klang
H. Greenspan
42
8
0
26 Jul 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
F. Brémond
27
47
0
19 Jul 2021
Self-supervised Representation Learning Framework for Remote Physiological Measurement Using Spatiotemporal Augmentation Loss
Hao Wang
Euijoon Ahn
Jinman Kim
29
46
0
16 Jul 2021
Aligning Correlation Information for Domain Adaptation in Action Recognition
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
24
38
0
11 Jul 2021
Federated Learning for Multi-Center Imaging Diagnostics: A Study in Cardiovascular Disease
Akis Linardos
Kaisar Kushibar
S. Walsh
P. Gkontra
Karim Lekadir
FedML
25
63
0
07 Jul 2021
Deep Learning for Micro-expression Recognition: A Survey
Yante Li
Jinsheng Wei
Yang Liu
Janne Kauttonen
Guoying Zhao
38
61
0
06 Jul 2021
Video Summarization through Reinforcement Learning with a 3D Spatio-Temporal U-Net
Tianrui Liu
Qingjie Meng
Jun-Jie Huang
Athanasios Vlontzos
Daniel Rueckert
Bernhard Kainz
OffRL
AI4TS
24
70
0
19 Jun 2021
Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
Martine Toering
Ioannis Gatopoulos
M. Stol
Vincent Tao Hu
SSL
40
11
0
18 Jun 2021
ShuffleBlock: Shuffle to Regularize Deep Convolutional Neural Networks
Sudhakar Kumawat
Gagan Kanojia
Shanmuganathan Raman
21
5
0
17 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
29
45
0
07 Jun 2021
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions
Junbin Xiao
Xindi Shang
Angela Yao
Tat-Seng Chua
45
444
0
18 May 2021