ResearchTrend.AI
Two-Stream Convolutional Networks for Action Recognition in Videos

9 June 2014
Karen Simonyan
Andrew Zisserman

Papers citing "Two-Stream Convolutional Networks for Action Recognition in Videos"

50 / 2,275 papers shown
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Haibo Wang
Zhiyang Xu
Yu Cheng
Shizhe Diao
Yufan Zhou
Yixin Cao
Qifan Wang
Weifeng Ge
Lifu Huang
24
21
0
04 Oct 2024
LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case
Mohammad Mahdavian
Mohammad Loni
Mo Chen
28
0
0
02 Oct 2024
REST-HANDS: Rehabilitation with Egocentric Vision Using Smartglasses for Treatment of Hands after Surviving Stroke
Wiktor Mucha
Kentaro Tanaka
M. Kampel
42
0
0
30 Sep 2024
Egocentric zone-aware action recognition across environments
Simone Alberto Peirone
Gabriele Goletto
M. Planamente
A. Bottino
Barbara Caputo
Giuseppe Averta
EgoV
33
0
0
21 Sep 2024
BurstM: Deep Burst Multi-scale SR using Fourier Space with Optical Flow
EungGu Kang
Byeonghun Lee
Sunghoon Im
Kyong Hwan Jin
SupR
33
4
0
21 Sep 2024
Generating Event-oriented Attribution for Movies via Two-Stage Prefix-Enhanced Multimodal LLM
Yuanjie Lyu
Tong Xu
Zihan Niu
Bo Peng
Jing Ke
Enhong Chen
28
0
0
14 Sep 2024
2D bidirectional gated recurrent unit convolutional Neural networks for end-to-end violence detection In videos
Abdarahmane Traoré
M. Akhloufi
24
13
0
11 Sep 2024
Real-Time Human Action Recognition on Embedded Platforms
Ruiqi Wang
Zichen Wang
Peiqi Gao
Mingzhen Li
Jaehwan Jeong
Yihang Xu
Yejin Lee
Carolyn M. Baum
Lisa Connor
Chenyang Lu
46
2
0
09 Sep 2024
HMAFlow: Learning More Accurate Optical Flow via Hierarchical Motion Field Alignment
Dianbo Ma
Kousuke Imamura
Ziyan Gao
Xiangjie Wang
Satoshi Yamane
40
0
0
09 Sep 2024
Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment
Keyne Oei
Amr Gomaa
Anna Maria Feit
João Belo
33
0
0
06 Sep 2024
MVTN: A Multiscale Video Transformer Network for Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
ViT
38
1
0
05 Sep 2024
Ig3D: Integrating 3D Face Representations in Facial Expression Inference
Lu Dong
Xiao Wang
S. Setlur
Venu Govindaraju
Ifeoma Nwogu
3DH
34
0
0
29 Aug 2024
MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder
Pavan Uttej Ravva
Behdokht Kiafar
Pinar Kullu
Jicheng Li
Anjana Bhat
R. Barmaki
43
0
0
27 Aug 2024
Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification
Mahrukh Awan
Asmar Nadeem
Muhammad Junaid Awan
Armin Mustafa
Syed Sameed Husain
28
1
0
26 Aug 2024
HabitAction: A Video Dataset for Human Habitual Behavior Recognition
Hongwu Li
Zhenliang Zhang
Wei Wang
33
0
0
24 Aug 2024
TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning
Bin Wang
Wenqian Wang
VLM
42
1
0
20 Aug 2024
Flatten: Video Action Recognition is an Image Classification task
Junlin Chen
Chengcheng Xu
Yangfan Xu
Jian Yang
Jun Yu Li
Zhiping Shi
39
1
0
17 Aug 2024
Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts
Peng Wu
Xuerong Zhou
Guansong Pang
Zhiwei Yang
Qingsen Yan
Peng Wang
Yanning Zhang
41
9
0
12 Aug 2024
A Methodological and Structural Review of Hand Gesture Recognition Across Diverse Data Modalities
Jungpil Shin
Abu Saleh Musa Miah
Md. Humaun Kabir
M. Rahim
Abdullah Al Shiam
44
12
0
10 Aug 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Asem Ali
3DPC
31
0
0
10 Aug 2024
MU-MAE: Multimodal Masked Autoencoders-Based One-Shot Learning
Rex Liu
Xin Liu
40
1
0
08 Aug 2024
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
Xin Liu
Chao Hao
Zitong Yu
Huanjing Yue
Jingyu Yang
41
1
0
05 Aug 2024
YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition
Duc Manh Nguyen Dang
Viet-Hang Duong
Jia Ching Wang
Nhan Bui Duc
28
3
0
05 Aug 2024
MPT-PAR:Mix-Parameters Transformer for Panoramic Activity Recognition
Wenqing Gan
Yaoyu Li
Jian Li
Zhangang Lin
ViT
32
0
0
01 Aug 2024
Segment Anything for Videos: A Systematic Survey
Chunhui Zhang
Yawen Cui
Weilin Lin
Guanjie Huang
Yan Rong
Li Liu
Shiguang Shan
VLM
52
6
0
31 Jul 2024
Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?
Habib Hajimolahoseini
Walid Ahmed
Austin Wen
Yang Liu
29
0
0
23 Jul 2024
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
Mingze Xu
Mingfei Gao
Zhe Gan
Hong-You Chen
Zhengfeng Lai
Haiming Gang
Kai Kang
Afshin Dehghan
66
49
0
22 Jul 2024
Semi-Supervised Pipe Video Temporal Defect Interval Localization
Zhu Huang
Gang Pan
Chao Kang
Yaozhi Lv
31
0
0
21 Jul 2024
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
80
3
0
20 Jul 2024
MLMT-CNN for Object Detection and Segmentation in Multi-layer and Multi-spectral Images
Majedaldein Almahasneh
A. Paiement
Xianghua Xie
Jean Aboudarham
33
4
0
19 Jul 2024
Pose-guided multi-task video transformer for driver action recognition
Ricardo Pizarro
Roberto Valle
L. Bergasa
J. M. Buenaposada
Luis Baumela
ViT
44
0
0
18 Jul 2024
Improved Esophageal Varices Assessment from Non-Contrast CT Scans
Chunli Li
Xiaoming Zhang
Yuan Gao
Xiaoli Yin
Le Lu
Ling Zhang
Ke Yan
Yu Shi
51
0
0
18 Jul 2024
MaskVD: Region Masking for Efficient Video Object Detection
Sreetama Sarkar
Gourav Datta
Souvik Kundu
Kai Zheng
Chirayata Bhattacharyya
P. Beerel
33
3
0
16 Jul 2024
Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding
Minghui Wu
Chenxu Zhao
Anyang Su
Donglin Di
Tianyu Fu
...
Min He
Ya Gao
Meng Ma
Kun Yan
Ping Wang
35
0
0
11 Jul 2024
Computer Vision for Clinical Gait Analysis: A Gait Abnormality Video Dataset
Rahm Ranjan
David Ahmedt-Aristizabal
M. Armin
Juno Kim
42
4
0
05 Jul 2024
Expressive Keypoints for Skeleton-based Action Recognition via Skeleton Transformation
Yijie Yang
Jinlu Zhang
Jiaxu Zhang
Zhigang Tu
35
5
0
26 Jun 2024
Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting
Zhengqi Zhao
Xiaohu Huang
Hao Zhou
Kun Yao
Errui Ding
Jingdong Wang
Xinggang Wang
Wenyu Liu
Bin Feng
31
1
0
13 Jun 2024
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
Chenyu Yang
Xizhou Zhu
Jinguo Zhu
Weijie Su
Junjie Wang
...
Lewei Lu
Bin Li
Jie Zhou
Yu Qiao
Jifeng Dai
VLM
CLIP
47
5
0
11 Jun 2024
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
Yuanhao Zhai
Kevin Lin
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Chung-Ching Lin
David Doermann
Junsong Yuan
Lijuan Wang
VGen
DiffM
46
9
0
11 Jun 2024
ALGO: Object-Grounded Visual Commonsense Reasoning for Open-World Egocentric Action Recognition
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LM&Ro
LRM
43
0
0
09 Jun 2024
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Thong Nguyen
Yi Bin
Junbin Xiao
Leigang Qu
Yicong Li
Jay Zhangjie Wu
Cong-Duy Nguyen
See-Kiong Ng
Luu Anh Tuan
VLM
61
10
1
09 Jun 2024
DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark
Chi-Jui Chang
Oscar Tai-Yuan Chen
Vincent S. Tseng
VLM
36
2
0
04 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
48
7
0
02 Jun 2024
Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model
Wenbing Li
Hang Zhou
Junqing Yu
Zikai Song
Wei Yang
Mamba
59
3
0
28 May 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
39
0
0
28 May 2024
MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities
Hao Dong
Yue Zhao
Eleni Chatzi
Olga Fink
OODD
43
11
0
27 May 2024
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
Shuangpeng Han
Ziyu Wang
Mengmi Zhang
38
0
0
26 May 2024
Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers
Chau Pham
Bryan A. Plummer
45
3
0
26 May 2024
Planted: a dataset for planted forest identification from multi-satellite time series
L. M. Pazos-Outón
Cristina Nader Vasconcelos
Anton Raichuk
Anurag Arnab
Dan Morris
Maxim Neumann
47
4
0
24 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan Yuille
Cihang Xie
AI4TS
VGen
SSL
59
1
0
24 May 2024