Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

27 November 2017

Papers citing "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?"

50 / 289 papers shown

Title
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer N. H. Phong B. Ribeiro 29 15 0 17 Feb 2023
sMRI-PatchNet: A novel explainable patch-based deep learning network for Alzheimer's disease diagnosis and discriminative atrophy localisation with Structural MRI Xin Zhang Liangxiu Han Lianghao Han Haoming Chen Darren Dancey Daoqiang Zhang MedIm 18 4 0 17 Feb 2023
Audio-Visual Contrastive Learning with Temporal Self-Supervision Simon Jenni Alexander Black John Collomosse SSL 31 15 0 15 Feb 2023
Toward Extremely Lightweight Distracted Driver Recognition With Distillation-Based Neural Architecture Search and Knowledge Transfer Dichao Liu T. Yamasaki Yu Wang K. Mase Jien Kato 28 27 0 09 Feb 2023
A deep local attention network for pre-operative lymph node metastasis prediction in pancreatic cancer via multiphase CT imaging Zhilin Zheng Xu Fang Jiawen Yao Mengmeng Zhu Le Lu ... Hong Lu Jian-Ping Lu Ling Zhang C. Shao Yun Bian MedIm 27 1 0 04 Jan 2023
Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos Xingxing Wei Songping Wang Huanqian Yan AAML 26 15 0 03 Jan 2023
Hierarchical Explanations for Video Action Recognition Sadaf Gulshad Teng Long Nanne van Noord FAtt 29 6 0 01 Jan 2023
Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning J. Denize Jaonary Rabarisoa Astrid Orcesi Romain Hérault SSL 19 6 0 21 Dec 2022
A Survey on Human Action Recognition Zhou Shuchang 29 0 0 20 Dec 2022
Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning Xian Zhong Zipeng Li Shuqin Chen Kui Jiang Chen Chen Mang Ye DiffM VGen 27 40 0 28 Nov 2022
Dynamic Appearance: A Video Representation for Action Recognition with Joint Training Guoxi Huang A. Bors 27 1 0 23 Nov 2022
Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues Shivani Kumar Ishani Mondal Md. Shad Akhtar Tanmoy Chakraborty 22 9 0 20 Nov 2022
Eat-Radar: Continuous Fine-Grained Intake Gesture Detection Using FMCW Radar and 3D Temporal Convolutional Network with Attention C. Wang T. S. Kumar W. de Raedt Guido Camps Hans Hallez Bart Vanrumste 24 12 0 08 Nov 2022
Cross-Domain Local Characteristic Enhanced Deepfake Video Detection Zihan Liu Hanyi Wang Shilin Wang ViT 29 6 0 07 Nov 2022
Unsupervised Audio-Visual Lecture Segmentation Darshan Singh Anchit Gupta C. V. Jawahar Makarand Tapaswi VOS 24 4 0 29 Oct 2022
Adversarial Domain Adaptation for Action Recognition Around the Clock Anwaar Ulhaq 22 3 0 25 Oct 2022
An efficient deep neural network to find small objects in large 3D images Jungkyu Park Jakub Chłędowski Stanislaw Jastrzebski Jan Witowski Yan Xu ... Melanie Wegener Linda Moy Laura Heacock B. Reig Krzysztof J. Geras MedIm 23 1 0 16 Oct 2022
Controllable Radiance Fields for Dynamic Face Synthesis Peiye Zhuang Liqian Ma Oluwasanmi Koyejo A. Schwing CVBM 3DH 18 11 0 11 Oct 2022
Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene Sulabh Shrestha Yimeng Li Jana Kosecka 3DPC SSL SSeg 49 2 0 04 Oct 2022
A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos Anil Batra Shreyank N. Gowda Frank Keller Laura Sevilla-Lara 44 5 0 30 Sep 2022
Thinking Hallucination for Video Captioning Nasib Ullah Partha Pratim Mohanta VLM 36 4 0 28 Sep 2022
Self-supervised Learning for Unintentional Action Prediction Olga Zatsarynna Yazan Abu Farha Juergen Gall SSL 44 8 0 24 Sep 2022
Leveraging Self-Supervised Training for Unintentional Action Recognition Enea Duka Anna Kukleva Bernt Schiele 38 1 0 23 Sep 2022
A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition Duc-Quang Vu T. Phung Jia-Ching Wang 27 9 0 03 Sep 2022
Towards cumulative race time regression in sports: I3D ConvNet transfer learning in ultra-distance running events David Freire-Obregón J. Lorenzo-Navarro Oliverio J. Santana Daniel Hernández-Sosa Modesto Castrillón-Santana 3DH 31 7 0 23 Aug 2022
Action Recognition based on Cross-Situational Action-object Statistics Satoshi Tsutsui Xizi Wang Guangyuan Weng Yayun Zhang David J. Crandall Chen Yu 43 2 0 15 Aug 2022
Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos Juncheng Billy Li Junlin Xie Linchao Zhu Long Qian Siliang Tang ... Haochen Shi Shengyu Zhang Longhui Wei Qi Tian Yueting Zhuang 36 12 0 03 Aug 2022
Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living Zdravko Marinov David Schneider Alina Roitberg Rainer Stiefelhagen VGen 32 2 0 03 Aug 2022
Class-Difficulty Based Methods for Long-Tailed Visual Recognition Saptarshi Sinha Hiroki Ohashi Katsuyuki Nakamura 27 31 0 29 Jul 2022
EgoEnv: Human-centric environment representations from egocentric video Tushar Nagarajan Santhosh Kumar Ramakrishnan Ruta Desai James M. Hillis Kristen Grauman EgoV 38 19 0 22 Jul 2022
ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network Nikolaos Gkalelis Dimitrios Daskalakis Vasileios Mezaris 19 10 0 20 Jul 2022
ERA: Expert Retrieval and Assembly for Early Action Prediction Lin Geng Foo Tianjiao Li Hossein Rahmani Qiuhong Ke Jun Liu 24 15 0 20 Jul 2022
Human-to-Robot Imitation in the Wild Shikhar Bahl Abhi Gupta Deepak Pathak 30 165 0 19 Jul 2022
Deformer: Towards Displacement Field Learning for Unsupervised Medical Image Registration Jiashun Chen Donghuan Lu Yu Zhang Dong Wei Munan Ning Xinyu Shi Zhe Xu Yefeng Zheng ViT MedIm 27 23 0 07 Jul 2022
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review Hao Wang Bin Guo Y. Zeng Yasan Ding Chen Qiu Ying Zhang Li Yao Zhiwen Yu 32 2 0 02 Jul 2022
Self-Supervised Learning for Videos: A Survey Madeline Chantry Schiappa Yogesh S Rawat M. Shah SSL 36 131 0 18 Jun 2022
Analysis and Extensions of Adversarial Training for Video Classification K. A. Kinfu René Vidal AAML 33 13 0 16 Jun 2022
NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition Hanting Li Ming-Fa Sui Zhaoqing Zhu Feng Zhao 25 27 0 10 Jun 2022
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines Camilo Luciano Fosco Emilie Josephs A. Andonian Allen Lee Xi Wang A. Oliva 44 4 0 01 Jun 2022
Cross-Architecture Self-supervised Video Representation Learning Sheng Guo Zihua Xiong Yujie Zhong Limin Wang Xiaobo Guo Bing Han Weilin Huang SSL AI4TS 76 24 0 26 May 2022
Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey Gaoang Wang Xiuming Zhang Lei Li VOT 61 15 0 22 May 2022
Support-set based Multi-modal Representation Enhancement for Video Captioning Xiaoya Chen Jingkuan Song Pengpeng Zeng Lianli Gao Hengtao Shen 24 4 0 19 May 2022
TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition Haodong Duan Nanxuan Zhao Kai-xiang Chen Dahua Lin ViT AI4TS 33 19 0 04 May 2022
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction Alexandros Stergiou Dima Damen AI4TS EgoV EDL 17 7 0 28 Apr 2022
ClothFormer: Taming Video Virtual Try-on in All Module Jianbin Jiang Tan Wang He Yan Junhui Liu 38 24 0 26 Apr 2022
Probabilistic Representations for Video Contrastive Learning Jungin Park Jiyoung Lee Ig-Jae Kim Kwanghoon Sohn SSL 33 43 0 08 Apr 2022
Frequency Selective Augmentation for Video Representation Learning Jinhyung Kim Taeoh Kim Minho Shim Dongyoon Han Dongyoon Wee Junmo Kim AI4TS 49 3 0 08 Apr 2022
ObjectMix: Data Augmentation by Copy-Pasting Objects in Videos for Action Recognition Jun Kimata Tomoya Nitta Toru Tamaki 31 10 0 01 Apr 2022
End-to-End Active Speaker Detection Juan Carlos León Alcázar M. Cordes Chen Zhao Guohao Li 24 27 0 27 Mar 2022
V3GAN: Decomposing Background, Foreground and Motion for Video Generation Arti Keshari Sonam Gupta Sukhendu Das 24 3 0 26 Mar 2022