arXiv:1811.01549
StNet: Local and Global Spatial-Temporal Modeling for Action Recognition

5 November 2018
Dongliang He
Zhichao Zhou
Chuang Gan
Fu Li
Xiao-Chang Liu
Yandong Li
Limin Wang
Shilei Wen

Papers citing "StNet: Local and Global Spatial-Temporal Modeling for Action Recognition"

50 / 50 papers shown
Video-STAR: Reinforcing Open-Vocabulary Action Recognition with Tools
Zhenlong Yuan
Xiangyan Qu
Chengxuan Qian
Rui Chen
Jing Tang
...
Xiangxiang Chu
Dapeng Zhang
Yiwei Wang
Y. Cai
Shuo Li
VLM LRM
203
21
0
09 Oct 2025
VT-LVLM-AR: A Video-Temporal Large Vision-Language Model Adapter for Fine-Grained Action Recognition in Long-Term Videos
Kaining Li
Shuwei He
Zihan Xu
VLM
131
1
0
21 Aug 2025
Dynamic and Compressive Adaptation of Transformers From Images to Videos
Guozhen Zhang
Jingyu Liu
Shengming Cao
Xiaotong Zhao
Kevin Zhao
Kai Ma
Limin Wang
ViT
521
2
0
13 Aug 2024
Brain-inspired Computational Modeling of Action Recognition with Recurrent Spiking Neural Networks Equipped with Reinforcement Delay Learning
Alireza Nadafian
Milad Mozafari
T. Masquelier
M. Ganjtabesh
144
1
0
17 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm ViT
336
46
0
02 Jun 2024
Deep video representation learning: a survey
Elham Ravanbakhsh
Yongqing Liang
J. Ramanujam
Xin Li
280
7
0
10 May 2024
What Can Simple Arithmetic Operations Do for Temporal Modeling?
IEEE International Conference on Computer Vision (ICCV), 2023
Wenhao Wu
Yuxin Song
Zhun Sun
Jingdong Wang
Chang Xu
Wanli Ouyang
264
17
0
18 Jul 2023
Deep set conditioned latent representations for action recognition
VISIGRAPP (VISIGRAPP), 2022
Akash Singh
Tom De Schepper
Kevin Mets
P. Hellinckx
José Oramas
Steven Latré
BDL
240
2
0
21 Dec 2022
Dynamic Appearance: A Video Representation for Action Recognition with Joint Training
Guoxi Huang
A. Bors
261
1
0
23 Nov 2022
A Unified Multimodal De- and Re-coupling Framework for RGB-D Motion Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
251
29
0
16 Nov 2022
FuTH-Net: Fusing Temporal Relations and Holistic Features for Aerial Video Classification
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022
P. Jin
Lichao Mou
Yuansheng Hua
Gui-Song Xia
Xiao Xiang Zhu
AI4TS
392
17
0
22 Sep 2022
MAiVAR: Multimodal Audio-Image and Video Action Recognizer
Visual Communications and Image Processing (VCIP), 2022
Muhammad Bilal Shaikh
Douglas Chai
S. Islam
Naveed Akhtar
240
6
0
11 Sep 2022
Adversarial Feature Augmentation for Cross-domain Few-shot Classification
European Conference on Computer Vision (ECCV), 2022
Yan Hu
A. J. Ma
334
86
0
23 Aug 2022
Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework
Journal of Imaging (JI), 2022
Hayat Ullah
Arslan Munir
HAI
235
44
0
09 Aug 2022
VidConv: A modernized 2D ConvNet for Efficient Video Recognition
Chuong H. Nguyen
Su Huynh
Vinh Nguyen
Ngoc-Khanh Nguyen
ViT
216
3
0
08 Jul 2022
Behavior Recognition Based on the Integration of Multigranular Motion Features
Lizong Zhang
Yiming Wang
Bei Hui
Xiu Zhang
Sijuan Liu
Shuxin Feng
137
0
0
07 Mar 2022
Attention-Based Sensor Fusion for Human Activity Recognition Using IMU Signals
Wenjin Tao
Haodong Chen
Md Moniruzzaman
M. C. Leu
Zhaozheng Yi
Ruwen Qin
131
16
0
20 Dec 2021
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition
Benjia Zhou
Pichao Wang
Jun Wan
Yanyan Liang
Fan Wang
Du Zhang
Zhen Lei
Hao Li
Rong Jin
257
37
0
16 Dec 2021
Temporal Transformer Networks with Self-Supervision for Action Recognition
Yongkang Zhang
Jun Li
Guoming Wu
Hanjie Zhang
Zhiping Shi
Zhaoxun Liu
Zizhang Wu
ViT
313
9
0
14 Dec 2021
STSM: Spatio-Temporal Shift Module for Efficient Action Recognition
Zhaoqilin Yang
Gaoyun An
292
6
0
05 Dec 2021
Stacked Temporal Attention: Improving First-person Action Recognition by Emphasizing Discriminative Clips
Lijin Yang
Yifei Huang
Yusuke Sugano
Yoichi Sato
273
6
0
02 Dec 2021
GTM: Gray Temporal Model for Video Recognition
Yanping Zhang
Yongxin Yu
152
1
0
20 Oct 2021
Video Is Graph: Structured Graph Module for Video Action Recognition
Rongjie Li
Xiaojun Wu
Tianyang Xu
493
15
0
12 Oct 2021
Long-Short Temporal Modeling for Efficient Action Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Liyu Wu
Yuexian Zou
Can Zhang
123
1
0
30 Jun 2021
TSI: Temporal Saliency Integration for Video Action Recognition
Haisheng Su
Kunchang Li
Jinyuan Feng
Dongliang Wang
Weihao Gan
Wei Wu
Yu Qiao
272
5
0
02 Jun 2021
DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
ACM Multimedia (ACM MM), 2021
Wenhao Wu
Yuxiang Zhao
Yanwu Xu
Xiao Tan
Dongliang He
...
Jinxing Ye
Yingying Li
Mingde Yao
Zichao Dong
Yifeng Shi
AI4TS
310
34
0
25 May 2021
Busy-Quiet Video Disentangling for Video Classification
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Guoxi Huang
A. Bors
339
10
0
29 Mar 2021
Dynamic Neural Networks: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yizeng Han
Gao Huang
Shiji Song
Le Yang
Honghui Wang
Yulin Wang
3DH AI4TS AI4CE
625
863
0
09 Feb 2021
TDN: Temporal Difference Networks for Efficient Action Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Limin Wang
Zhan Tong
Bin Ji
Gangshan Wu
609
477
0
18 Dec 2020
Recent Progress in Appearance-based Action Recognition
J. Humphreys
Zhe Chen
Dacheng Tao
222
0
0
25 Nov 2020
Actor and Action Modular Network for Text-based Video Segmentation
IEEE Transactions on Image Processing (TIP), 2020
Jianhua Yang
Yan Huang
K. Niu
Linjiang Huang
Zhanyu Ma
Liang Wang
335
15
0
02 Nov 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Yikang Shen
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
392
118
0
22 Oct 2020
Video Action Understanding
Matthew Hutchinson
V. Gadepally
414
27
0
13 Oct 2020
Approximated Bilinear Modules for Temporal Modeling
IEEE International Conference on Computer Vision (ICCV), 2019
Xinqi Zhu
Chang Xu
Langwen Hui
Cewu Lu
Dacheng Tao
179
27
0
25 Jul 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
European Conference on Computer Vision (ECCV), 2020
Xiaofang Wang
Xuehan Xiong
Maxim Neumann
A. Piergiovanni
Michael S. Ryoo
A. Angelova
Kris Kitani
Wei Hua
358
53
0
23 Jul 2020
Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition
Sudhakar Kumawat
Manisha Verma
Yuta Nakashima
Shanmuganathan Raman
357
50
0
22 Jul 2020
Spatiotemporal Fusion in 3D CNNs: A Probabilistic View
Computer Vision and Pattern Recognition (CVPR), 2020
Yizhou Zhou
Xiaoyan Sun
Chong Luo
Zhengjun Zha
Wenjun Zeng
3DPC
212
24
0
10 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Yan-Ran Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
447
554
0
03 Apr 2020
STH: Spatio-Temporal Hybrid Convolution for Efficient Action Recognition
Xu Li
Jingwen Wang
Lin Ma
Kaihao Zhang
Fengzong Lian
Zhanhui Kang
Jinjun Wang
187
5
0
18 Mar 2020
Knowledge Integration Networks for Action Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2020
Shiwen Zhang
Sheng Guo
Limin Wang
Weilin Huang
Matthew R. Scott
275
20
0
18 Feb 2020
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Yi Yang
Shilei Wen
201
37
0
09 Feb 2020
CTM: Collaborative Temporal Modeling for Action Recognition
Li-Yu Daisy Liu
Tao Wang
Jie Liu
Yang Guan
Qi Bu
Longfei Yang
TTA
131
0
0
08 Feb 2020
iqiyi Submission to ActivityNet Challenge 2019 Kinetics-700 challenge: Hierarchical Group-wise Attention
Li-Yu Daisy Liu
Dongyang Cai
Jie Liu
Nan Ding
Tao Wang
112
0
0
07 Feb 2020
TEINet: Towards an Efficient Architecture for Video Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2019
Zhaoyang Liu
Donghao Luo
Yabiao Wang
Limin Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Tong Lu
ViT
212
267
0
21 Nov 2019
Multi-Label Classification with Label Graph Superimposing
AAAI Conference on Artificial Intelligence (AAAI), 2019
Ya Wang
Dongliang He
Fu Li
Xiang Long
Zhichao Zhou
Jinwen Ma
Shilei Wen
248
191
0
21 Nov 2019
STM: SpatioTemporal and Motion Encoding for Action Recognition
IEEE International Conference on Computer Vision (ICCV), 2019
Boyuan Jiang
Mengmeng Wang
Weihao Gan
Wei Wu
Junjie Yan
531
442
0
07 Aug 2019
Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition
IEEE International Conference on Computer Vision (ICCV), 2019
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Shilei Wen
266
134
0
31 Jul 2019
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2019
Laura Sevilla-Lara
Shengxin Cindy Zha
Zhicheng Yan
Vedanuj Goswami
Matt Feiszli
Lorenzo Torresani
381
90
0
19 Jul 2019
Towards Real-Time Action Recognition on Mobile Devices Using Deep Models
Chen-Da Liu-Zhang
Xin-Xin Liu
Jianxin Wu
HAI
197
9
0
17 Jun 2019
Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification
Yang Du
Chunfen Yuan
Bing Li
Lili Zhao
Yangxi Li
Weiming Hu
380
86
0
03 Aug 2018