Two-Stream Convolutional Networks for Action Recognition in Videos

Neural Information Processing Systems (NeurIPS), 2014

9 June 2014

Karen Simonyan

Andrew Zisserman

Papers citing "Two-Stream Convolutional Networks for Action Recognition in Videos"

50 / 2,340 papers shown

Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation

Yuanhao Zhai

Kevin Lin

Zhengyuan Yang

247

11 Jun 2024

ALGO: Object-Grounded Visual Commonsense Reasoning for Open-World Egocentric Action Recognition

Sanjoy Kundu

Shubham Trehan

Sathyanarayanan N. Aakur

LM&Ro LRM

159

09 Jun 2024

Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

583

09 Jun 2024

DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark

160

04 Jun 2024

RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model

293

02 Jun 2024

Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model

248

28 May 2024

Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions

271

28 May 2024

MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities

Olga Fink

218

27 May 2024

Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception

Shuangpeng Han

Ziyu Wang

Mengmi Zhang

275

26 May 2024

Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers

Chau Pham

Bryan A. Plummer

243

26 May 2024

Planted: a dataset for planted forest identification from multi-satellite time series

L. M. Pazos-Outón

Cristina Nader Vasconcelos

200

24 May 2024

ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning

Cihang Xie

216

24 May 2024

From CNNs to Transformers in Multimodal Human Action Recognition: A Survey

Muhammad Bilal Shaikh

Syed Mohammed Shamsul Islam

Douglas Chai

Naveed Akhtar

347

22 May 2024

Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding

430

21 May 2024

GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition

241

18 May 2024

A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

247

13 May 2024

Deep video representation learning: a survey

217

10 May 2024

Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation

217

09 May 2024

A Survey on Backbones for Deep Video Action Recognition

176

09 May 2024

MERIT: Multi-view evidential learning for reliable and interpretable liver fibrosis staging

360

05 May 2024

Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy

Hoang-Quan Nguyen

Thanh-Dat Truong

Khoa Luu

287

02 May 2024

Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

433

18 Apr 2024

Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD)

Yiqiao Yin

131

15 Apr 2024

A Survey on Multimodal Wearable Sensor-based Human Action Recognition

Yan Yan

298

14 Apr 2024

ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition

157

13 Apr 2024

Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation

Swati Jindal

Mohit Yadav

Roberto Manduchi

159

08 Apr 2024

Towards more realistic human motion prediction with attention to motion coordination

Pengxiang Ding

Jianqin Yin

154

04 Apr 2024

Language Model Guided Interpretable Video Action Reasoning

Computer Vision and Pattern Recognition (CVPR), 2024

226

02 Apr 2024

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu

Guozhen Zhang

Jing Tan

Gangshan Wu

Limin Wang

250

31 Mar 2024

Hypergraph-based Multi-View Action Recognition using Event Cameras

320

28 Mar 2024

OmniVid: A Generative Framework for Universal Video Understanding

Lu Yuan

Zuxuan Wu

Yu-Gang Jiang

VLM VGen

285

26 Mar 2024

Emotion Recognition from the perspective of Activity Recognition

Savinay Nagendra

Prapti Panigrahi

143

24 Mar 2024

Enhancing Video Transformers for Action Understanding with VLM-aided Training

221

24 Mar 2024

Towards Two-Stream Foveation-based Active Vision Learning

IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2024

327

24 Mar 2024

VidLA: Video-Language Alignment at Scale

Computer Vision and Pattern Recognition (CVPR), 2024

Mamshad Nayeem Rizve

Fan Fei

Jayakrishnan Unnikrishnan

Mubarak Shah

225

21 Mar 2024

Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition

253

19 Mar 2024

VideoBadminton: A Video Dataset for Badminton Action Recognition

182

19 Mar 2024

Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception

Vijay John

Yasutomo Kawanishi

178

18 Mar 2024

A Survey of IMU Based Cross-Modal Transfer Learning in Human Activity Recognition

Abhi Kamboj

Minh Do

217

17 Mar 2024

Audio-Visual Segmentation via Unlabeled Frame Exploitation

327

17 Mar 2024

On the Utility of 3D Hand Poses for Action Recognition

European Conference on Computer Vision (ECCV), 2024

Angela Yao

219

14 Mar 2024

SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net

IEEE International Conference on Robotics and Automation (ICRA), 2024

Helin Cao

Sven Behnke

3DPC 3DGS

152

13 Mar 2024

VideoMamba: State Space Model for Efficient Video Understanding

European Conference on Computer Vision (ECCV), 2024

Yu Qiao

283

390

11 Mar 2024

Deep Learning Approaches for Human Action Recognition in Video Data

Yufei Xie

145

11 Mar 2024

Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition

250

11 Mar 2024

A spatiotemporal style transfer algorithm for dynamic visual stimulus generation

Nature Computational Science (Nat. Comput. Sci.), 2024

Antonino Greco

Markus Siegel

222

07 Mar 2024

Credibility-Aware Multi-Modal Fusion Using Probabilistic Circuits

230

05 Mar 2024

Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams

Duy Tran Thanh

Yeejin Lee

Byeongkeun Kang

303

05 Mar 2024

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

Henghui Ding

344

03 Mar 2024

BEE-NET: A deep neural network to identify in-the-wild Bodily Expression of Emotions

Mohammad Mahdi Dehshibi

David Masip

200

21 Feb 2024