ResearchTrend.AI
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

27 November 2017
Kensho Hara
Hirokatsu Kataoka
Y. Satoh
    3DPC

Papers citing "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?"

50 / 289 papers shown
Title
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer
N. H. Phong
B. Ribeiro
29
15
0
17 Feb 2023
sMRI-PatchNet: A novel explainable patch-based deep learning network for Alzheimer's disease diagnosis and discriminative atrophy localisation with Structural MRI
Xin Zhang
Liangxiu Han
Lianghao Han
Haoming Chen
Darren Dancey
Daoqiang Zhang
MedIm
18
4
0
17 Feb 2023
Audio-Visual Contrastive Learning with Temporal Self-Supervision
Simon Jenni
Alexander Black
John Collomosse
SSL
31
15
0
15 Feb 2023
Toward Extremely Lightweight Distracted Driver Recognition With Distillation-Based Neural Architecture Search and Knowledge Transfer
Dichao Liu
T. Yamasaki
Yu Wang
K. Mase
Jien Kato
28
27
0
09 Feb 2023
A deep local attention network for pre-operative lymph node metastasis prediction in pancreatic cancer via multiphase CT imaging
Zhilin Zheng
Xu Fang
Jiawen Yao
Mengmeng Zhu
Le Lu
...
Hong Lu
Jian-Ping Lu
Ling Zhang
C. Shao
Yun Bian
MedIm
27
1
0
04 Jan 2023
Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos
Xingxing Wei
Songping Wang
Huanqian Yan
AAML
26
15
0
03 Jan 2023
Hierarchical Explanations for Video Action Recognition
Sadaf Gulshad
Teng Long
Nanne van Noord
FAtt
29
6
0
01 Jan 2023
Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning
J. Denize
Jaonary Rabarisoa
Astrid Orcesi
Romain Hérault
SSL
19
6
0
21 Dec 2022
A Survey on Human Action Recognition
Zhou Shuchang
29
0
0
20 Dec 2022
Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning
Xian Zhong
Zipeng Li
Shuqin Chen
Kui Jiang
Chen Chen
Mang Ye
DiffM
VGen
27
40
0
28 Nov 2022
Dynamic Appearance: A Video Representation for Action Recognition with Joint Training
Guoxi Huang
A. Bors
27
1
0
23 Nov 2022
Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues
Shivani Kumar
Ishani Mondal
Md. Shad Akhtar
Tanmoy Chakraborty
22
9
0
20 Nov 2022
Eat-Radar: Continuous Fine-Grained Intake Gesture Detection Using FMCW Radar and 3D Temporal Convolutional Network with Attention
C. Wang
T. S. Kumar
W. de Raedt
Guido Camps
Hans Hallez
Bart Vanrumste
24
12
0
08 Nov 2022
Cross-Domain Local Characteristic Enhanced Deepfake Video Detection
Zihan Liu
Hanyi Wang
Shilin Wang
ViT
29
6
0
07 Nov 2022
Unsupervised Audio-Visual Lecture Segmentation
Darshan Singh
Anchit Gupta
C. V. Jawahar
Makarand Tapaswi
VOS
24
4
0
29 Oct 2022
Adversarial Domain Adaptation for Action Recognition Around the Clock
Anwaar Ulhaq
22
3
0
25 Oct 2022
An efficient deep neural network to find small objects in large 3D images
Jungkyu Park
Jakub Chłędowski
Stanislaw Jastrzebski
Jan Witowski
Yan Xu
...
Melanie Wegener
Linda Moy
Laura Heacock
B. Reig
Krzysztof J. Geras
MedIm
23
1
0
16 Oct 2022
Controllable Radiance Fields for Dynamic Face Synthesis
Peiye Zhuang
Liqian Ma
Oluwasanmi Koyejo
A. Schwing
CVBM
3DH
18
11
0
11 Oct 2022
Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene
Sulabh Shrestha
Yimeng Li
Jana Kosecka
3DPC
SSL
SSeg
49
2
0
04 Oct 2022
A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos
Anil Batra
Shreyank N. Gowda
Frank Keller
Laura Sevilla-Lara
44
5
0
30 Sep 2022
Thinking Hallucination for Video Captioning
Nasib Ullah
Partha Pratim Mohanta
VLM
36
4
0
28 Sep 2022
Self-supervised Learning for Unintentional Action Prediction
Olga Zatsarynna
Yazan Abu Farha
Juergen Gall
SSL
44
8
0
24 Sep 2022
Leveraging Self-Supervised Training for Unintentional Action Recognition
Enea Duka
Anna Kukleva
Bernt Schiele
38
1
0
23 Sep 2022
A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition
Duc-Quang Vu
T. Phung
Jia-Ching Wang
27
9
0
03 Sep 2022
Towards cumulative race time regression in sports: I3D ConvNet transfer learning in ultra-distance running events
David Freire-Obregón
J. Lorenzo-Navarro
Oliverio J. Santana
Daniel Hernández-Sosa
Modesto Castrillón-Santana
3DH
31
7
0
23 Aug 2022
Action Recognition based on Cross-Situational Action-object Statistics
Satoshi Tsutsui
Xizi Wang
Guangyuan Weng
Yayun Zhang
David J. Crandall
Chen Yu
43
2
0
15 Aug 2022
Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos
Juncheng Billy Li
Junlin Xie
Linchao Zhu
Long Qian
Siliang Tang
...
Haochen Shi
Shengyu Zhang
Longhui Wei
Qi Tian
Yueting Zhuang
36
12
0
03 Aug 2022
Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living
Zdravko Marinov
David Schneider
Alina Roitberg
Rainer Stiefelhagen
VGen
32
2
0
03 Aug 2022
Class-Difficulty Based Methods for Long-Tailed Visual Recognition
Saptarshi Sinha
Hiroki Ohashi
Katsuyuki Nakamura
27
31
0
29 Jul 2022
EgoEnv: Human-centric environment representations from egocentric video
Tushar Nagarajan
Santhosh Kumar Ramakrishnan
Ruta Desai
James M. Hillis
Kristen Grauman
EgoV
38
19
0
22 Jul 2022
ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network
Nikolaos Gkalelis
Dimitrios Daskalakis
Vasileios Mezaris
19
10
0
20 Jul 2022
ERA: Expert Retrieval and Assembly for Early Action Prediction
Lin Geng Foo
Tianjiao Li
Hossein Rahmani
Qiuhong Ke
Jun Liu
24
15
0
20 Jul 2022
Human-to-Robot Imitation in the Wild
Shikhar Bahl
Abhi Gupta
Deepak Pathak
30
165
0
19 Jul 2022
Deformer: Towards Displacement Field Learning for Unsupervised Medical Image Registration
Jiashun Chen
Donghuan Lu
Yu Zhang
Dong Wei
Munan Ning
Xinyu Shi
Zhe Xu
Yefeng Zheng
ViT
MedIm
27
23
0
07 Jul 2022
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang
Bin Guo
Y. Zeng
Yasan Ding
Chen Qiu
Ying Zhang
Li Yao
Zhiwen Yu
32
2
0
02 Jul 2022
Self-Supervised Learning for Videos: A Survey
Madeline Chantry Schiappa
Yogesh S Rawat
M. Shah
SSL
36
131
0
18 Jun 2022
Analysis and Extensions of Adversarial Training for Video Classification
K. A. Kinfu
René Vidal
AAML
33
13
0
16 Jun 2022
NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition
Hanting Li
Ming-Fa Sui
Zhaoqing Zhu
Feng Zhao
25
27
0
10 Jun 2022
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines
Camilo Luciano Fosco
Emilie Josephs
A. Andonian
Allen Lee
Xi Wang
A. Oliva
44
4
0
01 Jun 2022
Cross-Architecture Self-supervised Video Representation Learning
Sheng Guo
Zihua Xiong
Yujie Zhong
Limin Wang
Xiaobo Guo
Bing Han
Weilin Huang
SSL
AI4TS
76
24
0
26 May 2022
Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey
Gaoang Wang
Xiuming Zhang
Lei Li
VOT
61
15
0
22 May 2022
Support-set based Multi-modal Representation Enhancement for Video Captioning
Xiaoya Chen
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Hengtao Shen
24
4
0
19 May 2022
TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition
Haodong Duan
Nanxuan Zhao
Kai-xiang Chen
Dahua Lin
ViT
AI4TS
33
19
0
04 May 2022
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction
Alexandros Stergiou
Dima Damen
AI4TS
EgoV
EDL
17
7
0
28 Apr 2022
ClothFormer: Taming Video Virtual Try-on in All Module
Jianbin Jiang
Tan Wang
He Yan
Junhui Liu
38
24
0
26 Apr 2022
Probabilistic Representations for Video Contrastive Learning
Jungin Park
Jiyoung Lee
Ig-Jae Kim
Kwanghoon Sohn
SSL
33
43
0
08 Apr 2022
Frequency Selective Augmentation for Video Representation Learning
Jinhyung Kim
Taeoh Kim
Minho Shim
Dongyoon Han
Dongyoon Wee
Junmo Kim
AI4TS
49
3
0
08 Apr 2022
ObjectMix: Data Augmentation by Copy-Pasting Objects in Videos for Action Recognition
Jun Kimata
Tomoya Nitta
Toru Tamaki
31
10
0
01 Apr 2022
End-to-End Active Speaker Detection
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
24
27
0
27 Mar 2022
V3GAN: Decomposing Background, Foreground and Motion for Video Generation
Arti Keshari
Sonam Gupta
Sukhendu Das
24
3
0
26 Mar 2022