1406.2199
Two-Stream Convolutional Networks for Action Recognition in Videos
Neural Information Processing Systems (NeurIPS), 2014
9 June 2014
Karen Simonyan
Andrew Zisserman
Papers citing
"Two-Stream Convolutional Networks for Action Recognition in Videos"
50 / 2,340 papers shown
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
Yuanhao Zhai
Kevin Lin
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Chung-Ching Lin
David Doermann
Junsong Yuan
Lijuan Wang
VGen
DiffM
247
26
0
11 Jun 2024
ALGO: Object-Grounded Visual Commonsense Reasoning for Open-World Egocentric Action Recognition
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LM&Ro
LRM
159
1
0
09 Jun 2024
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Thong Nguyen
Yi Bin
Junbin Xiao
Leigang Qu
Yicong Li
Jay Zhangjie Wu
Cong-Duy Nguyen
See-Kiong Ng
Luu Anh Tuan
VLM
583
26
1
09 Jun 2024
DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark
Chi-Jui Chang
Oscar Tai-Yuan Chen
Vincent S. Tseng
VLM
160
3
0
04 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
293
26
0
02 Jun 2024
Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model
Wenbing Li
Hang Zhou
Junqing Yu
Zikai Song
Wei Yang
Mamba
248
29
0
28 May 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
271
1
0
28 May 2024
MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities
Hao Dong
Yue Zhao
Eleni Chatzi
Olga Fink
OODD
218
25
0
27 May 2024
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
Shuangpeng Han
Ziyu Wang
Mengmi Zhang
275
1
0
26 May 2024
Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers
Chau Pham
Bryan A. Plummer
243
8
0
26 May 2024
Planted: a dataset for planted forest identification from multi-satellite time series
L. M. Pazos-Outón
Cristina Nader Vasconcelos
Anton Raichuk
Anurag Arnab
Dan Morris
Maxim Neumann
200
8
0
24 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Yaoyao Liu
Cihang Xie
AI4TS
VGen
SSL
216
2
0
24 May 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
347
30
0
22 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
430
7
0
21 May 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
ViT
241
14
0
18 May 2024
A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Matthew Korban
Peter Youngs
Scott T. Acton
ViT
247
16
0
13 May 2024
Deep video representation learning: a survey
Elham Ravanbakhsh
Yongqing Liang
J. Ramanujam
Xin Li
217
5
0
10 May 2024
Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation
Mo Guan
Yan Wang
Guangkun Ma
Jiarui Liu
Mingzu Sun
SLR
217
14
0
09 May 2024
A Survey on Backbones for Deep Video Action Recognition
Zixuan Tang
Youjun Zhao
Yuhang Wen
Mengyuan Liu
176
3
0
09 May 2024
MERIT: Multi-view evidential learning for reliable and interpretable liver fibrosis staging
Yuanye Liu
Zheyao Gao
Nannan Shi
Fuping Wu
Yuxin Shi
Qingchao Chen
Xiahai Zhuang
360
0
0
05 May 2024
Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy
Hoang-Quan Nguyen
Thanh-Dat Truong
Khoa Luu
287
1
0
02 May 2024
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
433
6
0
18 Apr 2024
Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD)
Yiqiao Yin
131
0
0
15 Apr 2024
A Survey on Multimodal Wearable Sensor-based Human Action Recognition
Jianyuan Ni
Hao Tang
Syed Tousiful Haque
Yan Yan
A. Ngu
298
22
0
14 Apr 2024
ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
157
5
0
13 Apr 2024
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal
Mohit Yadav
Roberto Manduchi
159
12
0
08 Apr 2024
Towards more realistic human motion prediction with attention to motion coordination
Pengxiang Ding
Jianqin Yin
154
20
0
04 Apr 2024
Language Model Guided Interpretable Video Action Reasoning
Computer Vision and Pattern Recognition (CVPR), 2024
Ning Wang
Guangming Zhu
HS Li
Liang Zhang
Syed Afaq Ali Shah
Mohammed Bennamoun
226
7
0
02 Apr 2024
Dual DETRs for Multi-Label Temporal Action Detection
Yuhan Zhu
Guozhen Zhang
Jing Tan
Gangshan Wu
Limin Wang
250
22
0
31 Mar 2024
Hypergraph-based Multi-View Action Recognition using Event Cameras
Yue Gao
Jiaxuan Lu
Siqi Li
Yipeng Li
Shaoyi Du
320
25
0
28 Mar 2024
OmniVid: A Generative Framework for Universal Video Understanding
Junke Wang
Dongdong Chen
Chong Luo
Bo He
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
VLM
VGen
285
29
0
26 Mar 2024
Emotion Recognition from the perspective of Activity Recognition
Savinay Nagendra
Prapti Panigrahi
143
2
0
24 Mar 2024
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Hui Lu
Hu Jian
Ronald Poppe
A. A. Salah
221
6
0
24 Mar 2024
Towards Two-Stream Foveation-based Active Vision Learning
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2024
Timur Ibrayev
Amitangshu Mukherjee
Sai Aparna Aketi
Kaushik Roy
327
5
0
24 Mar 2024
VidLA: Video-Language Alignment at Scale
Computer Vision and Pattern Recognition (CVPR), 2024
Mamshad Nayeem Rizve
Fan Fei
Jayakrishnan Unnikrishnan
Son Tran
Benjamin Z. Yao
Belinda Zeng
Mubarak Shah
Trishul Chilimbi
VLM
AI4TS
225
8
0
21 Mar 2024
Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
Filip Ilic
Henghui Zhao
Thomas Pock
Richard P. Wildes
PICV
AAML
253
3
0
19 Mar 2024
VideoBadminton: A Video Dataset for Badminton Action Recognition
Qi Li
Tzu-Chen Chiu
Hsiang-Wei Huang
Minmin Sun
Wei-Shinn Ku
182
1
0
19 Mar 2024
Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception
Vijay John
Yasutomo Kawanishi
178
0
0
18 Mar 2024
A Survey of IMU Based Cross-Modal Transfer Learning in Human Activity Recognition
Abhi Kamboj
Minh Do
217
5
0
17 Mar 2024
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu
Yikun Liu
Fei Zhang
Chen Ju
Ya Zhang
Yanfeng Wang
327
27
0
17 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
European Conference on Computer Vision (ECCV), 2024
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
219
13
0
14 Mar 2024
SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net
IEEE International Conference on Robotics and Automation (ICRA), 2024
Helin Cao
Sven Behnke
3DPC
3DGS
152
11
0
13 Mar 2024
VideoMamba: State Space Model for Efficient Video Understanding
European Conference on Computer Vision (ECCV), 2024
Kunchang Li
Xinhao Li
Yi Wang
Yinan He
Yali Wang
Limin Wang
Yu Qiao
Mamba
283
390
0
11 Mar 2024
Deep Learning Approaches for Human Action Recognition in Video Data
Yufei Xie
145
1
0
11 Mar 2024
Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition
Erkut Akdag
Zeqi Zhu
Egor Bondarev
Peter H. N. de With
ViT
250
8
0
11 Mar 2024
A spatiotemporal style transfer algorithm for dynamic visual stimulus generation
Nature Computational Science (Nat. Comput. Sci.), 2024
Antonino Greco
Markus Siegel
222
7
0
07 Mar 2024
Credibility-Aware Multi-Modal Fusion Using Probabilistic Circuits
Sahil Sidheekh
Pranuthi Tenali
Saurabh Mathur
Erik Blasch
Kristian Kersting
S. Natarajan
230
4
0
05 Mar 2024
Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams
Duy Tran Thanh
Yeejin Lee
Byeongkeun Kang
303
5
0
05 Mar 2024
Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition
Kun-Yu Lin
Henghui Ding
Jiaming Zhou
Yu-Ming Tang
Yi-Xing Peng
Zhilin Zhao
Chen Change Loy
Wei-Shi Zheng
VLM
344
21
0
03 Mar 2024
BEE-NET: A deep neural network to identify in-the-wild Bodily Expression of Emotions
Mohammad Mahdi Dehshibi
David Masip
200
2
0
21 Feb 2024