Two-Stream Convolutional Networks for Action Recognition in Videos

Neural Information Processing Systems (NeurIPS), 2014
9 June 2014
Karen Simonyan
Andrew Zisserman

Papers citing "Two-Stream Convolutional Networks for Action Recognition in Videos"

50 / 2,340 papers shown
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
Yuanhao Zhai
Kevin Lin
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Chung-Ching Lin
David Doermann
Junsong Yuan
Lijuan Wang
VGen, DiffM
247
26
0
11 Jun 2024
ALGO: Object-Grounded Visual Commonsense Reasoning for Open-World Egocentric Action Recognition
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LM&Ro, LRM
159
1
0
09 Jun 2024
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Thong Nguyen
Yi Bin
Junbin Xiao
Leigang Qu
Yicong Li
Jay Zhangjie Wu
Cong-Duy Nguyen
See-Kiong Ng
Luu Anh Tuan
VLM
583
26
1
09 Jun 2024
DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark
Chi-Jui Chang
Oscar Tai-Yuan Chen
Vincent S. Tseng
VLM
160
3
0
04 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm, ViT
293
26
0
02 Jun 2024
Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model
Wenbing Li
Hang Zhou
Junqing Yu
Zikai Song
Wei Yang
Mamba
248
29
0
28 May 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
271
1
0
28 May 2024
MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities
Hao Dong
Yue Zhao
Eleni Chatzi
Olga Fink
OODD
218
25
0
27 May 2024
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
Shuangpeng Han
Ziyu Wang
Mengmi Zhang
275
1
0
26 May 2024
Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers
Chau Pham
Bryan A. Plummer
243
8
0
26 May 2024
Planted: a dataset for planted forest identification from multi-satellite time series
L. M. Pazos-Outón
Cristina Nader Vasconcelos
Anton Raichuk
Anurag Arnab
Dan Morris
Maxim Neumann
200
8
0
24 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Yaoyao Liu
Cihang Xie
AI4TS, VGen, SSL
216
2
0
24 May 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
347
30
0
22 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
430
7
0
21 May 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR, ViT
241
14
0
18 May 2024
A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Matthew Korban
Peter Youngs
Scott T. Acton
ViT
247
16
0
13 May 2024
Deep video representation learning: a survey
Elham Ravanbakhsh
Yongqing Liang
J. Ramanujam
Xin Li
217
5
0
10 May 2024
Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation
Mo Guan
Yan Wang
Guangkun Ma
Jiarui Liu
Mingzu Sun
SLR
217
14
0
09 May 2024
A Survey on Backbones for Deep Video Action Recognition
Zixuan Tang
Youjun Zhao
Yuhang Wen
Mengyuan Liu
176
3
0
09 May 2024
MERIT: Multi-view evidential learning for reliable and interpretable liver fibrosis staging
Yuanye Liu
Zheyao Gao
Nannan Shi
Fuping Wu
Yuxin Shi
Qingchao Chen
Xiahai Zhuang
360
0
0
05 May 2024
Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy
Hoang-Quan Nguyen
Thanh-Dat Truong
Khoa Luu
287
1
0
02 May 2024
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
433
6
0
18 Apr 2024
Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD)
Yiqiao Yin
131
0
0
15 Apr 2024
A Survey on Multimodal Wearable Sensor-based Human Action Recognition
Jianyuan Ni
Hao Tang
Syed Tousiful Haque
Yan Yan
A. Ngu
298
22
0
14 Apr 2024
ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
157
5
0
13 Apr 2024
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal
Mohit Yadav
Roberto Manduchi
159
12
0
08 Apr 2024
Towards more realistic human motion prediction with attention to motion coordination
Pengxiang Ding
Jianqin Yin
154
20
0
04 Apr 2024
Language Model Guided Interpretable Video Action Reasoning
Computer Vision and Pattern Recognition (CVPR), 2024
Ning Wang
Guangming Zhu
HS Li
Liang Zhang
Syed Afaq Ali Shah
Mohammed Bennamoun
226
7
0
02 Apr 2024
Dual DETRs for Multi-Label Temporal Action Detection
Yuhan Zhu
Guozhen Zhang
Jing Tan
Gangshan Wu
Limin Wang
250
22
0
31 Mar 2024
Hypergraph-based Multi-View Action Recognition using Event Cameras
Yue Gao
Jiaxuan Lu
Siqi Li
Yipeng Li
Shaoyi Du
320
25
0
28 Mar 2024
OmniVid: A Generative Framework for Universal Video Understanding
Junke Wang
Dongdong Chen
Chong Luo
Bo He
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
VLM, VGen
285
29
0
26 Mar 2024
Emotion Recognition from the perspective of Activity Recognition
Savinay Nagendra
Prapti Panigrahi
143
2
0
24 Mar 2024
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Hui Lu
Hu Jian
Ronald Poppe
A. A. Salah
221
6
0
24 Mar 2024
Towards Two-Stream Foveation-based Active Vision Learning
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2024
Timur Ibrayev
Amitangshu Mukherjee
Sai Aparna Aketi
Kaushik Roy
327
5
0
24 Mar 2024
VidLA: Video-Language Alignment at Scale
Computer Vision and Pattern Recognition (CVPR), 2024
Mamshad Nayeem Rizve
Fan Fei
Jayakrishnan Unnikrishnan
Son Tran
Benjamin Z. Yao
Belinda Zeng
Mubarak Shah
Trishul Chilimbi
VLM, AI4TS
225
8
0
21 Mar 2024
Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
Filip Ilic
Henghui Zhao
Thomas Pock
Richard P. Wildes
PICV, AAML
253
3
0
19 Mar 2024
VideoBadminton: A Video Dataset for Badminton Action Recognition
Qi Li
Tzu-Chen Chiu
Hsiang-Wei Huang
Minmin Sun
Wei-Shinn Ku
182
1
0
19 Mar 2024
Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception
Vijay John
Yasutomo Kawanishi
178
0
0
18 Mar 2024
A Survey of IMU Based Cross-Modal Transfer Learning in Human Activity Recognition
Abhi Kamboj
Minh Do
217
5
0
17 Mar 2024
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu
Yikun Liu
Fei Zhang
Chen Ju
Ya Zhang
Yanfeng Wang
327
27
0
17 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
European Conference on Computer Vision (ECCV), 2024
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
219
13
0
14 Mar 2024
SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net
IEEE International Conference on Robotics and Automation (ICRA), 2024
Helin Cao
Sven Behnke
3DPC, 3DGS
152
11
0
13 Mar 2024
VideoMamba: State Space Model for Efficient Video Understanding
European Conference on Computer Vision (ECCV), 2024
Kunchang Li
Xinhao Li
Yi Wang
Yinan He
Yali Wang
Limin Wang
Yu Qiao
Mamba
283
390
0
11 Mar 2024
Deep Learning Approaches for Human Action Recognition in Video Data
Yufei Xie
145
1
0
11 Mar 2024
Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition
Erkut Akdag
Zeqi Zhu
Egor Bondarev
Peter H. N. de With
ViT
250
8
0
11 Mar 2024
A spatiotemporal style transfer algorithm for dynamic visual stimulus generation
Nature Computational Science (Nat. Comput. Sci.), 2024
Antonino Greco
Markus Siegel
222
7
0
07 Mar 2024
Credibility-Aware Multi-Modal Fusion Using Probabilistic Circuits
Sahil Sidheekh
Pranuthi Tenali
Saurabh Mathur
Erik Blasch
Kristian Kersting
S. Natarajan
230
4
0
05 Mar 2024
Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams
Duy Tran Thanh
Yeejin Lee
Byeongkeun Kang
303
5
0
05 Mar 2024
Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition
Kun-Yu Lin
Henghui Ding
Jiaming Zhou
Yu-Ming Tang
Yi-Xing Peng
Zhilin Zhao
Chen Change Loy
Wei-Shi Zheng
VLM
344
21
0
03 Mar 2024
BEE-NET: A deep neural network to identify in-the-wild Bodily Expression of Emotions
Mohammad Mahdi Dehshibi
David Masip
200
2
0
21 Feb 2024