Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

2 August 2016
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
    ViT

Papers citing "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition"

50 / 1,449 papers shown
EEA: Exploration-Exploitation Agent for Long Video Understanding
Te Yang
Xiangyu Zhu
Bo Wang
Quan Chen
Peng Jiang
Zhen Lei
60
0
0
03 Dec 2025
Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
Chenshuang Zhang
Kang Zhang
Joon Son Chung
In So Kweon
Junmo Kim
Chengzhi Mao
DiffM
230
0
0
02 Dec 2025
Hear What Matters! Text-conditioned Selective Video-to-Audio Generation
Junwon Lee
Juhan Nam
Jiyoung Lee
DiffM
VGen
107
0
0
02 Dec 2025
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision
Chenting Wang
Yuhan Zhu
Yicheng Xu
Jiange Yang
Ziang Yan
Yali Wang
Yi Wang
Limin Wang
VGen
165
0
0
01 Dec 2025
Beyond Real versus Fake Towards Intent-Aware Video Analysis
Saurabh Atreya
Nabyl Quignon
Baptiste Chopin
Abhijit Das
A. Dantcheva
AAML
80
0
0
27 Nov 2025
ARPGNet: Appearance- and Relation-aware Parallel Graph Attention Fusion Network for Facial Expression Recognition
IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2025
Yan Li
Yong Zhao
Xiaohan Xia
Dongmei Jiang
CVBM
3DH
142
0
0
27 Nov 2025
Towards an Effective Action-Region Tracking Framework for Fine-grained Video Action Recognition
Baoli Sun
Y. X. R. Wang
Xinzhu Ma
Zhihui Wang
Kun Lu
Zhiyong Wang
190
0
0
26 Nov 2025
Smooth regularization for efficient video recognition
Gil Goldman
Raja Giryes
Mahadev Satyanarayanan
AI4TS
203
0
0
25 Nov 2025
EgoEMS: A High-Fidelity Multimodal Egocentric Dataset for Cognitive Assistance in Emergency Medical Services
Keshara Weerasinghe
Xueren Ge
Tessa Heick
L. Wijayasingha
Anthony Cortez
Abhishek Satpathy
John A. Stankovic
H. Alemzadeh
194
0
0
13 Nov 2025
Otter: Mitigating Background Distractions of Wide-Angle Few-Shot Action Recognition with Enhanced RWKV
Wenbo Huang
Jinghui Zhang
Zhenghao Chen
Guang Li
Lei Zhang
Yang Cao
Fang Dong
Takahiro Ogawa
Miki Haseyama
222
0
0
10 Nov 2025
Grounding Foundational Vision Models with 3D Human Poses for Robust Action Recognition
Nicholas Babey
Tiffany Gu
Yiheng Li
Cristian Meo
Kevin Zhu
108
0
0
06 Nov 2025
Pose-Aware Multi-Level Motion Parsing for Action Quality Assessment
Shuaikang Zhu
Yang Yang
Chen Sun
106
0
0
06 Nov 2025
A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential
Mehdi Sefidgar Dilmaghani
Francis Fowley
Peter Corcoran
132
0
0
05 Nov 2025
FOCUS: Efficient Keyframe Selection for Long Video Understanding
Zirui Zhu
Hailun Xu
Yang Luo
Yong Liu
Kanchan Sarkar
Zhenheng Yang
Yang You
152
0
0
31 Oct 2025
FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network
Fangtong Sun
Congyu Li
Ke Yang
Yuchen Pan
Hanwen Yu
Xichuan Zhang
Yiying Li
192
0
0
27 Oct 2025
DGME-T: Directional Grid Motion Encoding for Transformer-Based Historical Camera Movement Classification
Tingyu Lin
Armin Dadras
Florian Kleber
Robert Sablatnig
VGen
89
0
0
17 Oct 2025
Camera Movement Classification in Historical Footage: A Comparative Study of Deep Video Models
Tingyu Lin
Armin Dadras
Florian Kleber
Robert Sablatnig
56
0
0
16 Oct 2025
Learning to Recognize Correctly Completed Procedure Steps in Egocentric Assembly Videos through Spatio-Temporal Modeling
Computer Vision and Image Understanding (CVIU), 2025
Tim J. Schoonbeek
Shao-Hsuan Hung
Dan Lehman
H. Onvlee
Jacek Kustra
Peter H. N. de With
Fons van der Sommen
120
0
0
14 Oct 2025
MSF-Mamba: Motion-aware State Fusion Mamba for Efficient Micro-Gesture Recognition
Deng Li
Jun Shao
Bohao Xing
Rong Gao
Bihan Wen
Heikki Kälviäinen
Xin Liu
Mamba
304
0
0
12 Oct 2025
VA-Adapter: Adapting Ultrasound Foundation Model to Echocardiography Probe Guidance
Teng Wang
Haojun Jiang
Yuxuan Wang
Zhenguo Sun
Shiji Song
Gao Huang
117
0
0
08 Oct 2025
EvoStruggle: A Dataset Capturing the Evolution of Struggle across Activities and Skill Levels
Shijia Feng
Michael Wray
Walterio W. Mayol-Cuevas
130
2
0
01 Oct 2025
POVQA: Preference-Optimized Video Question Answering with Rationales for Data Efficiency
Ashim Dahal
Ankit Ghimire
Saydul Akbar Murad
Nick Rahimi
142
0
0
01 Oct 2025
Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis
Sai Varun Kodathala
Rakesh Vunnam
123
0
0
25 Sep 2025
Six Sigma For Neural Networks: Taguchi-based optimization
Sai Varun Kodathala
104
0
0
22 Sep 2025
The SAGES Critical View of Safety Challenge: A Global Benchmark for AI-Assisted Surgical Quality Assessment
Deepak Alapatt
J. Eckhoff
Zhiliang Lyu
Yutong Ban
J. Mazellier
...
Pietro Mascagni
Daniel A. Hashimoto
Guy Rosman
O. Meireles
N. Padoy
ELM
120
0
0
21 Sep 2025
ResidualViT for Efficient Temporally Dense Video Encoding
Mattia Soldan
Fabian Caba Heilbron
Bernard Ghanem
Josef Sivic
Bryan C. Russell
171
0
0
16 Sep 2025
Video Understanding by Design: How Datasets Shape Architectures and Insights
Lei Wang
Piotr Koniusz
Yongsheng Gao
3DV
VGen
AI4TS
237
0
0
11 Sep 2025
Diffusion-Based Action Recognition Generalizes to Untrained Domains
Rogério Guimarães
Frank Xiao
Pietro Perona
Markus Marks
269
0
0
10 Sep 2025
Probabilistic Temporal Masked Attention for Cross-view Online Action Detection
IEEE Transactions on Multimedia (TMM), 2025
Liping Xie
Yang Tan
Shicheng Jing
Huimin Lu
Kanjian Zhang
151
1
0
23 Aug 2025
GaitSnippet: Gait Recognition Beyond Unordered Sets and Ordered Sequences
Saihui Hou
Chenye Wang
Wenpeng Lang
Zhengxiang Lan
Yongzhen Huang
CVBM
100
0
0
11 Aug 2025
Localizing Audio-Visual Deepfakes via Hierarchical Boundary Modeling
Xuanjun Chen
Shih-Peng Cheng
Jiawei Du
Lin Zhang
Xiaoxiao Miao
Chung-Che Wang
Haibin Wu
Hung-yi Lee
Jyh-Shing Roger Jang
220
1
0
04 Aug 2025
Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition
Jihao Gu
Kun Li
Fei Wang
Yanyan Wei
Zhiliang Wu
Hehe Fan
Meng Wang
386
8
0
29 Jul 2025
SPACT18: Spiking Human Action Recognition Benchmark Dataset with Complementary RGB and Thermal Modalities
Yasser Ashraf
Ahmed Sharshar
V. Bojkovic
Bin Gu
146
0
0
22 Jul 2025
DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding
Xiaoyi Bao
Chenwei Xie
Hao Tang
Tingyu Weng
Xiaofeng Wang
Yun Zheng
Xingang Wang
VGen
147
1
0
21 Jul 2025
LeAdQA: LLM-Driven Context-Aware Temporal Grounding for Video Question Answering
Xinxin Dong
Baoyun Peng
H. Ma
Y. Wang
Zixuan Dong
Fei Hu
Xiaodong Wang
193
0
0
20 Jul 2025
Multi-Focus Temporal Shifting for Precise Event Spotting in Sports Videos
Hao Xu
Sam Wells
Mohamed Reda Bouadjenek
Richard Dazeley
316
1
0
10 Jul 2025
Effort-Optimized, Accuracy-Driven Labelling and Validation of Test Inputs for DL Systems: A Mixed-Integer Linear Programming Approach
Mohammad Hossein Amini
M. Sabetzadeh
S. Nejati
VLM
175
0
0
07 Jul 2025
Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges
Sanjeda Akter
Ibne Farabi Shihab
Anuj Sharma
VLM
297
2
0
02 Jul 2025
D$^2$ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition
Wenjie Pei
Qizhong Tan
Guangming Lu
Jiandong Tian
Jun Yu
492
3
0
01 Jul 2025
Zero-Shot Skeleton-Based Action Recognition With Prototype-Guided Feature Alignment
IEEE Transactions on Image Processing (IEEE TIP), 2025
Kai Zhou
Shuhai Zhang
Zeng You
Jinwu Hu
Mingkui Tan
Fei Liu
235
0
0
01 Jul 2025
ActAlign: Zero-Shot Fine-Grained Video Classification via Language-Guided Sequence Alignment
Amir Aghdam
Vincent Tao Hu
Bjorn Ommer
VLM
287
2
0
28 Jun 2025
An Effective End-to-End Solution for Multimodal Action Recognition
International Conference on Pattern Recognition (ICPR), 2025
Songping Wang
Xiantao Hu
Yueming Lyu
Caifeng Shan
231
2
0
11 Jun 2025
Data-Efficient Challenges in Visual Inductive Priors: A Retrospective
Robert-Jan Bruintjes
A. Lengyel
O. Kayhan
Davide Zambrano
Nergis Tomen
Hadi Jamali Rad
Jan van Gemert
VLM
185
0
0
10 Jun 2025
Super Encoding Network: Recursive Association of Multi-Modal Encoders for Video Understanding
Boyu Chen
Siran Chen
Kunchang Li
Qinglin Xu
Yu Qiao
Yali Wang
VOS
370
7
0
09 Jun 2025
Robustness Evaluation for Video Models with Reinforcement Learning
Ashwin Ramesh Babu
Sajad Mousavi
Vineet Gundecha
Sahand Ghorbanpour
Avisek Naug
Antonio Guillen
Ricardo Luna Gutierrez
Soumyendu Sarkar
AAML
147
0
0
05 Jun 2025
VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning
Liyun Zhu
Qixiang Chen
Xi Shen
Xiaodong Cun
AI4TS
LRM
226
6
0
29 May 2025
PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion
Jaehyun Choi
Jiwan Hur
Gyojin Han
Jaemyung Yu
Junmo Kim
VGen
199
0
0
28 May 2025
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
Computer Vision and Pattern Recognition (CVPR), 2025
Ho-Joong Kim
Y. E. Lee
Jung-Ho Hong
Seong-Whan Lee
351
1
0
09 May 2025
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
496
3
0
28 Apr 2025
ResNetVLLM -- Multi-modal Vision LLM for the Video Understanding Task
Ahmad Khalil
Mahmoud Khalil
A. Ngom
VLM
296
1
0
20 Apr 2025