Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

2 August 2016
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
    ViT

Papers citing "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition"

50 / 1,449 papers shown
Refining Action Boundaries for One-stage Detection
Advanced Video and Signal Based Surveillance (AVSS), 2022
Hanyuan Wang
Majid Mirmehdi
Dima Damen
Toby Perrett
ObjD
153
1
0
25 Oct 2022
GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Samrudhdhi B. Rangrej
Kevin J. Liang
Tal Hassner
James J. Clark
288
4
0
24 Oct 2022
Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Zeyun Zhong
David Schneider
Michael Voit
Rainer Stiefelhagen
Jürgen Beyerer
181
61
0
23 Oct 2022
Grounded Video Situation Recognition
Neural Information Processing Systems (NeurIPS), 2022
Zeeshan Khan
C. V. Jawahar
Makarand Tapaswi
192
16
0
19 Oct 2022
FedForgery: Generalized Face Forgery Detection with Residual Federated Learning
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2022
Decheng Liu
Zhan Dang
Chunlei Peng
Yu Zheng
Shuang Li
N. Wang
Xinbo Gao
FedML
319
57
0
18 Oct 2022
Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows
Anyi Rao
Xuekun Jiang
Sichen Wang
Yuwei Guo
Zihao Liu
Bo Dai
Long Pang
Xiaoyu Wu
Dahua Lin
Libiao Jin
183
9
0
17 Oct 2022
Semantic Video Moments Retrieval at Scale: A New Task and a Baseline
Na Li
240
0
0
15 Oct 2022
MMTSA: Multimodal Temporal Segment Attention Network for Efficient Human Activity Recognition
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2022
Ziqi Gao
Yuntao Wang
Jianguo Chen
Junliang Xing
Shwetak N. Patel
Xin Liu
Yuanchun Shi
214
8
0
14 Oct 2022
LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Jielin Qiu
Franck Dernoncourt
Trung Bui
Zhaowen Wang
Ding Zhao
Hailin Jin
AI4TS
149
7
0
12 Oct 2022
Students taught by multimodal teachers are superior action recognizers
Gorjan Radevski
Dusan Grujicic
Matthew Blaschko
Marie-Francine Moens
Tinne Tuytelaars
211
2
0
09 Oct 2022
Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling
British Machine Vision Conference (BMVC), 2022
Hsin-Ying Lee
Hung-Ting Su
Bing-Chen Tsai
Tsung-Han Wu
Jia-Fong Yeh
Winston H. Hsu
312
2
0
08 Oct 2022
Multi-Scale Wavelet Transformer for Face Forgery Detection
Asian Conference on Computer Vision (ACCV), 2022
Jie Liu
Jingjing Wang
Peng Zhang
Chunmao Wang
Di Xie
Shiliang Pu
ViT
CVBM
234
13
0
08 Oct 2022
Alignment-guided Temporal Attention for Video Action Recognition
Neural Information Processing Systems (NeurIPS), 2022
Yizhou Zhao
Zhenyang Li
Xun Guo
Yan Lu
155
19
0
30 Sep 2022
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Computer Vision and Pattern Recognition (CVPR), 2022
Ziyun Zeng
Yuying Ge
Xihui Liu
Bin Chen
Ping Luo
Shutao Xia
Yixiao Ge
AI4TS
213
9
0
30 Sep 2022
AdaFocusV3: On Unified Spatial-temporal Dynamic Video Recognition
European Conference on Computer Vision (ECCV), 2022
Yulin Wang
Yang Yue
Xin-Wen Xu
Ali Hassani
V. Kulikov
Nikita Orlov
Qing Xiao
Humphrey Shi
Gao Huang
269
20
0
27 Sep 2022
EgoSpeed-Net: Forecasting Speed-Control in Driver Behavior from Egocentric Video Data
Yichen Ding
Ziming Zhang
Jun Luo
Xun Zhou
192
3
0
27 Sep 2022
Rethinking Resolution in the Context of Efficient Video Recognition
Neural Information Processing Systems (NeurIPS), 2022
Chuofan Ma
Qiushan Guo
Yi Jiang
Zehuan Yuan
Ping Luo
Xiaojuan Qi
220
16
0
26 Sep 2022
Multi-modal Video Chapter Generation
Xiao Cao
Zitan Chen
Canyu Le
Lei Meng
VGen
190
3
0
26 Sep 2022
Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks
Haodong Duan
Yue Zhao
Kai-xiang Chen
Yu Xiong
Dahua Lin
115
9
0
20 Sep 2022
MSA-GCN: Multiscale Adaptive Graph Convolution Network for Gait Emotion Recognition
Pattern Recognition (Pattern Recogn.), 2022
Yunfei Yin
Li Jing
Faliang Huang
Guangchao Yang
Zhuowei Wang
CVBM
178
29
0
19 Sep 2022
MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain
Computer Vision and Image Understanding (CVIU), 2022
Francesco Ragusa
Antonino Furnari
G. Farinella
EgoV
236
41
0
19 Sep 2022
Action-based Early Autism Diagnosis Using Contrastive Feature Learning
Multimedia Systems (Multimed. Syst.), 2022
Asha Rani
Pankaj Yadav
Yashaswi Verma
210
5
0
12 Sep 2022
Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations
International Symposium on Visual Computing (ISVC), 2022
Victoria Manousaki
K. Papoutsakis
Antonis Argyros
145
4
0
12 Sep 2022
Predicting the Next Action by Modeling the Abstract Goal
International Conference on Pattern Recognition (ICPR), 2022
Debaditya Roy
Basura Fernando
EgoV
368
22
0
12 Sep 2022
MAiVAR: Multimodal Audio-Image and Video Action Recognizer
Visual Communications and Image Processing (VCIP), 2022
Muhammad Bilal Shaikh
Douglas Chai
S. Islam
Naveed Akhtar
160
6
0
11 Sep 2022
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Computer Vision and Pattern Recognition (CVPR), 2022
Tsu-Jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
William Yang Wang
Lijuan Wang
Zicheng Liu
VLM
633
83
0
04 Sep 2022
Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition
European Conference on Computer Vision (ECCV), 2022
Tianjiao Li
Lin Geng Foo
Qiuhong Ke
Hossein Rahmani
Anran Wang
Jinghua Wang
Jing Liu
218
30
0
03 Sep 2022
Attentive pooling for Group Activity Recognition
Ding Li
Yuan Xie
Wensheng Zhang
Yongqiang Tang
Zhizhong Zhang
185
0
0
31 Aug 2022
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
192
6
0
30 Aug 2022
Actor-identified Spatiotemporal Action Detection -- Detecting Who Is Doing What in Videos
Fan Yang
Norimichi Ukita
S. Sakti
Satoshi Nakamura
205
0
0
27 Aug 2022
Adaptive Perception Transformer for Temporal Action Localization
Yizheng Ouyang
Tianjin Zhang
Weibo Gu
Hongfa Wang
240
3
0
25 Aug 2022
Modality Mixer for Multi-modal Action Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Sumin Lee
Sangmin Woo
Yeonju Park
Muhammad Adi Nugroho
Changick Kim
177
12
0
24 Aug 2022
Hierarchically Decomposed Graph Convolutional Networks for Skeleton-Based Action Recognition
IEEE International Conference on Computer Vision (ICCV), 2022
Junghoon Lee
Minhyeok Lee
Dogyoon Lee
Sangyoon Lee
BDL
282
213
0
23 Aug 2022
Hierarchical Compositional Representations for Few-shot Action Recognition
Computer Vision and Image Understanding (CVIU), 2022
Chang-bo Li
Jie Zhang
Shuzhe Wu
Xin Jin
Shiguang Shan
269
26
0
19 Aug 2022
Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition
Lianyu Hu
Sheng Liu
Wei Feng
ViT
175
12
0
18 Aug 2022
Progressive Cross-modal Knowledge Distillation for Human Action Recognition
ACM Multimedia (ACM MM), 2022
Jianyuan Ni
A. Ngu
Yan Yan
HAI
204
33
0
17 Aug 2022
UAV-CROWD: Violent and non-violent crowd activity simulator from the perspective of UAV
Mahieyin Rahmun
Tonmoay Deb
Shahriar Ali Bijoy
M. Raha
119
2
0
13 Aug 2022
Sports Video Analysis on Large-Scale Data
European Conference on Computer Vision (ECCV), 2022
Dekun Wu
Henghui Zhao
Xingce Bao
Richard P. Wildes
146
23
0
09 Aug 2022
BabyNet: A Lightweight Network for Infant Reaching Action Recognition in Unconstrained Environments to Support Future Pediatric Rehabilitation Applications
IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2021
Amel Dechemi
Vikarn Bhakri
Ipsita Sahin
Arjun Modi
Julya Mestas
Pamodya Peiris
Dannya Enriquez Barrundia
Elena Kokkoni
Konstantinos Karydis
167
10
0
09 Aug 2022
Video-based Human Action Recognition using Deep Learning: A Review
Hieu H. Pham
L. Khoudour
Alain Crouzil
Pablo Zegers
S. Velastín
174
43
0
07 Aug 2022
Frozen CLIP Models are Efficient Video Learners
European Conference on Computer Vision (ECCV), 2022
Ziyi Lin
Shijie Geng
Renrui Zhang
Shiyang Feng
Gerard de Melo
Xiaogang Wang
Jifeng Dai
Yu Qiao
Jiaming Song
CLIP
VLM
260
254
0
06 Aug 2022
Expanding Language-Image Pretrained Models for General Video Recognition
European Conference on Computer Vision (ECCV), 2022
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
337
433
0
04 Aug 2022
Uncertainty-Driven Action Quality Assessment
Caixia Zhou
Yaping Huang
330
13
0
29 Jul 2022
Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition
European Conference on Computer Vision (ECCV), 2022
Wangmeng Xiang
Chong Li
Biao Wang
Xihan Wei
Xiangpei Hua
Lei Zhang
ViT
161
43
0
27 Jul 2022
Bodily Behaviors in Social Interaction: Novel Annotations and State-of-the-Art Evaluation
ACM Multimedia (ACM MM), 2022
Michal Balazia
Philippe Muller
Ákos Levente Tánczos
A. V. Liechtenstein
François Brémond
285
35
0
26 Jul 2022
P2ANet: A Dataset and Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos
Jiang Bian
Xuhong Li
Tao Wang
Qingzhong Wang
Jun Huang
Chen Liu
Jun Zhao
Feixiang Lu
Dejing Dou
Haoyi Xiong
195
18
0
26 Jul 2022
Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yang Liu
Guanbin Li
Liang Lin
LRM
575
148
0
26 Jul 2022
MAR: Masked Autoencoders for Efficient Action Recognition
IEEE Transactions on Multimedia (IEEE TMM), 2022
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Xiang Wang
Yuehuang Wang
Yiliang Lv
Changxin Gao
Nong Sang
248
59
0
24 Jul 2022
EgoEnv: Human-centric environment representations from egocentric video
Neural Information Processing Systems (NeurIPS), 2022
Tushar Nagarajan
Santhosh Kumar Ramakrishnan
Ruta Desai
James M. Hillis
Kristen Grauman
EgoV
311
25
0
22 Jul 2022
NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition
European Conference on Computer Vision (ECCV), 2022
Boyang Xia
Wenhao Wu
Haoran Wang
Rui Su
Dongliang He
Haosen Yang
Xiaoran Fan
Wanli Ouyang
230
24
0
21 Jul 2022