arXiv: 1406.2199
Two-Stream Convolutional Networks for Action Recognition in Videos

Neural Information Processing Systems (NeurIPS), 2014
9 June 2014
Karen Simonyan
Andrew Zisserman

Papers citing "Two-Stream Convolutional Networks for Action Recognition in Videos"

50 / 2,340 papers shown
Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition
Novanto Yudistira
88
0
0
04 Dec 2025
Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
Chenshuang Zhang
Kang Zhang
Joon Son Chung
In So Kweon
Junmo Kim
Chengzhi Mao
DiffM
212
0
0
02 Dec 2025
Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos
Xavier Thomas
Youngsun Lim
Ananya Srinivasan
Audrey Zheng
Deepti Ghadiyaram
EGVM
VGen
312
0
0
01 Dec 2025
Beyond Real versus Fake Towards Intent-Aware Video Analysis
Saurabh Atreya
Nabyl Quignon
Baptiste Chopin
Abhijit Das
A. Dantcheva
AAML
56
0
0
27 Nov 2025
Towards an Effective Action-Region Tracking Framework for Fine-grained Video Action Recognition
Baoli Sun
Y. X. R. Wang
Xinzhu Ma
Zhihui Wang
Kun Lu
Zhiyong Wang
190
0
0
26 Nov 2025
Smooth regularization for efficient video recognition
Gil Goldman
Raja Giryes
Mahadev Satyanarayanan
AI4TS
171
0
0
25 Nov 2025
Auto-US: An Ultrasound Video Diagnosis Agent Using Video Classification Framework and LLMs
Yuezhe Yang
Yiyue Guo
Wenjie Cai
Qingqing Ruan
Siying Wang
Xingbo Dong
Zhe Jin
Yong Dai
104
0
0
11 Nov 2025
Grounding Foundational Vision Models with 3D Human Poses for Robust Action Recognition
Nicholas Babey
Tiffany Gu
Yiheng Li
Cristian Meo
Kevin Zhu
92
0
0
06 Nov 2025
Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition
Jongseo Lee
Wooil Lee
Gyeong-Moon Park
Seong Tae Kim
J. Choi
136
0
0
05 Nov 2025
A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential
Mehdi Sefidgar Dilmaghani
Francis Fowley
Peter Corcoran
112
0
0
05 Nov 2025
M3PD Dataset: Dual-view Photoplethysmography (PPG) Using Front-and-rear Cameras of Smartphones in Lab and Clinical Settings
Jiankai Tang
Tao Zhang
Jia Li
Y. Zhang
Mingyu Zhang
...
Haiyang Li
X. Wang
Yuanchun Shi
Y. Wang
Sichong Qian
216
1
0
04 Nov 2025
FLoC: Facility Location-Based Efficient Visual Token Compression for Long Video Understanding
Janghoon Cho
Jungsoo Lee
Munawar Hayat
Kyuwoong Hwang
Fatih Porikli
Sungha Choi
52
0
0
31 Oct 2025
A Renaissance of Explicit Motion Information Mining from Transformers for Action Recognition
Peiqin Zhuang
Wenlong Zhang
Yichao Wu
Ding Liang
Luping Zhou
Yali Wang
Wanli Ouyang
175
0
0
21 Oct 2025
MAVR-Net: Robust Multi-View Learning for MAV Action Recognition with Cross-View Attention
Nengbo Zhang
Hann Woei Ho
132
1
0
17 Oct 2025
Learning to Recognize Correctly Completed Procedure Steps in Egocentric Assembly Videos through Spatio-Temporal Modeling
Computer Vision and Image Understanding (CVIU), 2025
Tim J. Schoonbeek
Shao-Hsuan Hung
Dan Lehman
H. Onvlee
Jacek Kustra
Peter H. N. de With
Fons van der Sommen
108
0
0
14 Oct 2025
Two-stream network-driven vision-based tactile sensor for object feature extraction and fusion perception
Muxing Huang
Zibin Chen
Weiliang Xu
Zilan Li
Yuanzhi Zhou
Guoyuan Zhou
Wenjing Chen
Xinming Li
104
0
0
14 Oct 2025
Mixup Helps Understanding Multimodal Video Better
Xiaoyu Ma
Ding Ding
Hao Chen
116
0
0
13 Oct 2025
SAM2-3dMed: Empowering SAM2 for 3D Medical Image Segmentation
Yeqing Yang
Le Xu
Lixia Tian
MedIm
64
0
0
10 Oct 2025
Q-Router: Agentic Video Quality Assessment with Expert Model Routing and Artifact Localization
Shuo Xing
Soumik Dey
Mingyang Wu
Ashirbad Mishra
Naveen Ravipati
Binbin Li
Hansi Wu
Zhengzhong Tu
171
1
0
09 Oct 2025
Flow4Agent: Long-form Video Understanding via Motion Prior from Optical Flow
Ruyang Liu
Shangkun Sun
Haoran Tang
Ge Li
Wei-Nan Gao
VGen
VLM
88
3
0
07 Oct 2025
REALIGN: Regularized Procedure Alignment with Matching Video Embeddings via Partial Gromov-Wasserstein Optimal Transport
Soumyadeep Chandra
Kaushik Roy
101
0
0
29 Sep 2025
Prompt-guided Disentangled Representation for Action Recognition
Tianci Wu
Guangming Zhu
Jiang Lu
Siyuan Wang
Ning Wang
Nuoye Xiong
Zhang Liang
210
0
0
26 Sep 2025
Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis
Sai Varun Kodathala
Rakesh Vunnam
84
0
0
25 Sep 2025
Six Sigma For Neural Networks: Taguchi-based optimization
Sai Varun Kodathala
84
0
0
22 Sep 2025
MoCrop: Training Free Motion Guided Cropping for Efficient Video Action Recognition
Binhua Huang
Wendong Yao
Shaowu Chen
Guoxin Wang
Qingyuan Wang
Soumyabrata Dev
60
0
0
22 Sep 2025
MoCLIP-Lite: Efficient Video Recognition by Fusing CLIP with Motion Vectors
Binhua Huang
Nan Wang
Arjun Parakash
Soumyabrata Dev
CLIP
VLM
85
0
0
21 Sep 2025
LSTC-MDA: A Unified Framework for Long-Short Term Temporal Convolution and Mixed Data Augmentation in Skeleton-Based Action Recognition
Feng Ding
H. Fu
Soroush Oraki
Jie Liang
56
0
0
18 Sep 2025
Vi-SAFE: A Spatial-Temporal Framework for Efficient Violence Detection in Public Surveillance
Ligang Chang
Shengkai Xu
Liangchang Shen
Binhan Xu
Junqiao Wang
Lewei He
Yanhui Du
56
0
0
16 Sep 2025
Enhancing Video Large Language Models with Structured Multi-Video Collaborative Reasoning
Zhihao He
Tianyao He
Yun Xu
Yun Xu
Huabin Liu
Chaofan Gan
Gui Zou
W. Lin
152
2
0
16 Sep 2025
Video Understanding by Design: How Datasets Shape Architectures and Insights
Lei Wang
Piotr Koniusz
Yongsheng Gao
3DV
VGen
AI4TS
233
0
0
11 Sep 2025
Dual-Model Weight Selection and Self-Knowledge Distillation for Medical Image Classification
Ayaka Tsutsumi
Guang Li
Ren Togo
Takahiro Ogawa
Satoshi Kondo
Miki Haseyama
92
0
0
28 Aug 2025
A Novel Deep Hybrid Framework with Ensemble-Based Feature Optimization for Robust Real-Time Human Activity Recognition
Wasi Ullah
Yasir Noman Khalid
Saddam Hussain Khan
158
2
0
26 Aug 2025
Why Relational Graphs Will Save the Next Generation of Vision Foundation Models?
Social Science Research Network (SSRN), 2025
Fatemeh Ziaeetabar
100
0
0
25 Aug 2025
Aligning Moments in Time using Video Queries
Yogesh Kumar
Uday Agarwal
Manish Gupta
Anand Mishra
259
1
0
21 Aug 2025
Generative Model-Based Feature Attention Module for Video Action Analysis
G. Wang
Peng Zhao
Cong Zhao
Jing Huang
Siyan Guo
Shusen Yang
112
0
0
19 Aug 2025
ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning
Jongseo Lee
Kyungho Bae
Kyle Min
Gyeong-Moon Park
J. Choi
CLL
VLM
175
0
0
14 Aug 2025
Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition
Pulkit Kumar
Shuaiyi Huang
Matthew Walmer
Sai Saketh Rambhatla
Abhinav Shrivastava
ViT
163
2
0
05 Aug 2025
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
Shijie Zhou
Alexander Vilesov
Xuehai He
Ziyu Wan
Shuwang Zhang
Aditya Nagachandra
Di Chang
DongDong Chen
Xin Eric Wang
A. Kadambi
VLM
174
14
0
04 Aug 2025
Efficient Spatial-Temporal Modeling for Real-Time Video Analysis: A Unified Framework for Action Recognition and Object Tracking
Shahla John
111
1
0
30 Jul 2025
Dual Guidance Semi-Supervised Action Detection
Ankit Singh
E. Gavves
Cees G. M. Snoek
Hilde Kuehne
143
0
0
28 Jul 2025
HumanSAM: Classifying Human-centric Forgery Videos in Human Spatial, Appearance, and Motion Anomaly
Chang Liu
Yunfan Ye
Fan Zhang
Q. Zhou
Yuchuan Luo
Zhiping Cai
235
1
0
26 Jul 2025
SPACT18: Spiking Human Action Recognition Benchmark Dataset with Complementary RGB and Thermal Modalities
Yasser Ashraf
Ahmed Sharshar
V. Bojkovic
Bin Gu
122
0
0
22 Jul 2025
Procedure Learning via Regularized Gromov-Wasserstein Optimal Transport
Syed Ahmed Mahmood
Ali Shah Ali
Umer Ahmed
Fawad Javed Fateh
M. Zia
Quoc-Huy Tran
163
2
0
21 Jul 2025
DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding
Xiaoyi Bao
Chenwei Xie
Hao Tang
Tingyu Weng
Xiaofeng Wang
Yun Zheng
Xingang Wang
VGen
135
1
0
21 Jul 2025
Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges
Sanjeda Akter
Ibne Farabi Shihab
Anuj Sharma
VLM
293
2
0
02 Jul 2025
Zero-Shot Skeleton-Based Action Recognition With Prototype-Guided Feature Alignment
IEEE Transactions on Image Processing (IEEE TIP), 2025
Kai Zhou
Shuhai Zhang
Zeng You
Jinwu Hu
Mingkui Tan
Fei Liu
219
0
0
01 Jul 2025
D$^2$ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition
Wenjie Pei
Qizhong Tan
Guangming Lu
Jiandong Tian
Jun Yu
468
2
0
01 Jul 2025
ActAlign: Zero-Shot Fine-Grained Video Classification via Language-Guided Sequence Alignment
Amir Aghdam
Vincent Tao Hu
Bjorn Ommer
VLM
251
2
0
28 Jun 2025
Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
Xiaodan Hu
Chuhang Zou
Suchen Wang
Jaechul Kim
Narendra Ahuja
LRM
167
0
0
20 Jun 2025
An Effective End-to-End Solution for Multimodal Action Recognition
International Conference on Pattern Recognition (ICPR), 2025
Songping Wang
Xiantao Hu
Yueming Lyu
Caifeng Shan
223
2
0
11 Jun 2025