arXiv:1608.00859

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

2 August 2016

Zhe Wang

Yu Qiao

Luc Van Gool

Papers citing "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition"

50 / 1,449 papers shown

EEA: Exploration-Exploitation Agent for Long Video Understanding

64

0

0

03 Dec 2025

Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision

Chenshuang Zhang

235

0

0

02 Dec 2025

Hear What Matters! Text-conditioned Selective Video-to-Audio Generation

109

0

0

02 Dec 2025

InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision

170

0

0

01 Dec 2025

Beyond Real versus Fake Towards Intent-Aware Video Analysis

Baptiste Chopin

84

0

0

27 Nov 2025

ARPGNet: Appearance- and Relation-aware Parallel Graph Attention Fusion Network for Facial Expression Recognition

IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2025

149

0

0

27 Nov 2025

Towards an Effective Action-Region Tracking Framework for Fine-grained Video Action Recognition

190

0

0

26 Nov 2025

Smooth regularization for efficient video recognition

Mahadev Satyanarayanan

218

0

0

25 Nov 2025

EgoEMS: A High-Fidelity Multimodal Egocentric Dataset for Cognitive Assistance in Emergency Medical Services

Keshara Weerasinghe

L. Wijayasingha

Abhishek Satpathy

John A. Stankovic

199

0

0

13 Nov 2025

Otter: Mitigating Background Distractions of Wide-Angle Few-Shot Action Recognition with Enhanced RWKV

225

0

0

10 Nov 2025

Grounding Foundational Vision Models with 3D Human Poses for Robust Action Recognition

115

0

0

06 Nov 2025

Pose-Aware Multi-Level Motion Parsing for Action Quality Assessment

118

0

0

06 Nov 2025

A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential

Mehdi Sefidgar Dilmaghani

137

0

0

05 Nov 2025

FOCUS: Efficient Keyframe Selection for Long Video Understanding

159

0

0

31 Oct 2025

FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network

195

0

0

27 Oct 2025

DGME-T: Directional Grid Motion Encoding for Transformer-Based Historical Camera Movement Classification

Robert Sablatnig

89

0

0

17 Oct 2025

Camera Movement Classification in Historical Footage: A Comparative Study of Deep Video Models

Robert Sablatnig

57

0

0

16 Oct 2025

Learning to Recognize Correctly Completed Procedure Steps in Egocentric Assembly Videos through Spatio-Temporal Modeling

Computer Vision and Image Understanding (CVIU), 2025

Tim J. Schoonbeek

Shao-Hsuan Hung

Peter H. N. de With

Fons van der Sommen

121

0

0

14 Oct 2025

MSF-Mamba: Motion-aware State Fusion Mamba for Efficient Micro-Gesture Recognition

Heikki Kälviäinen

307

0

0

12 Oct 2025

VA-Adapter: Adapting Ultrasound Foundation Model to Echocardiography Probe Guidance

123

0

0

08 Oct 2025

EvoStruggle: A Dataset Capturing the Evolution of Struggle across Activities and Skill Levels

Walterio W. Mayol-Cuevas

141

2

0

01 Oct 2025

POVQA: Preference-Optimized Video Question Answering with Rationales for Data Efficiency

Saydul Akbar Murad

147

0

0

01 Oct 2025

Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis

Sai Varun Kodathala

129

0

0

25 Sep 2025

Six Sigma For Neural Networks: Taguchi-based optimization

Sai Varun Kodathala

109

0

0

22 Sep 2025

The SAGES Critical View of Safety Challenge: A Global Benchmark for AI-Assisted Surgical Quality Assessment

...

Pietro Mascagni

Daniel A. Hashimoto

123

1

0

21 Sep 2025

ResidualViT for Efficient Temporally Dense Video Encoding

Fabian Caba Heilbron

Bryan C. Russell

171

0

0

16 Sep 2025

Video Understanding by Design: How Datasets Shape Architectures and Insights

238

0

0

11 Sep 2025

Diffusion-Based Action Recognition Generalizes to Untrained Domains

Rogério Guimarães

270

0

0

10 Sep 2025

Probabilistic Temporal Masked Attention for Cross-view Online Action Detection

IEEE Transactions on Multimedia (TMM), 2025

158

1

0

23 Aug 2025

GaitSnippet: Gait Recognition Beyond Unordered Sets and Ordered Sequences

104

0

0

11 Aug 2025

Localizing Audio-Visual Deepfakes via Hierarchical Boundary Modeling

Shih-Peng Cheng

Jyh-Shing Roger Jang

227

1

0

04 Aug 2025

Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition

391

8

0

29 Jul 2025

SPACT18: Spiking Human Action Recognition Benchmark Dataset with Complementary RGB and Thermal Modalities

147

0

0

22 Jul 2025

DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding

149

1

0

21 Jul 2025

LeAdQA: LLM-Driven Context-Aware Temporal Grounding for Video Question Answering

193

1

0

20 Jul 2025

Multi-Focus Temporal Shifting for Precise Event Spotting in Sports Videos

Mohamed Reda Bouadjenek

Richard Dazeley

319

1

0

10 Jul 2025

Effort-Optimized, Accuracy-Driven Labelling and Validation of Test Inputs for DL Systems: A Mixed-Integer Linear Programming Approach

Mohammad Hossein Amini

184

0

0

07 Jul 2025

Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges

Ibne Farabi Shihab

303

2

0

02 Jul 2025

D$^2$ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition

498

3

0

01 Jul 2025

Zero-Shot Skeleton-Based Action Recognition With Prototype-Guided Feature Alignment

IEEE Transactions on Image Processing (IEEE TIP), 2025

246

0

0

01 Jul 2025

ActAlign: Zero-Shot Fine-Grained Video Classification via Language-Guided Sequence Alignment

293

2

0

28 Jun 2025

An Effective End-to-End Solution for Multimodal Action Recognition

International Conference on Pattern Recognition (ICPR), 2025

236

2

0

11 Jun 2025

Data-Efficient Challenges in Visual Inductive Priors: A Retrospective

Robert-Jan Bruintjes

Davide Zambrano

Hadi Jamali Rad

186

0

0

10 Jun 2025

Super Encoding Network: Recursive Association of Multi-Modal Encoders for Video Understanding

382

7

0

09 Jun 2025

Robustness Evaluation for Video Models with Reinforcement Learning

Ashwin Ramesh Babu

Vineet Gundecha

Sahand Ghorbanpour

Antonio Guillen

Ricardo Luna Gutierrez

Soumyendu Sarkar

154

0

0

05 Jun 2025

VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning

227

7

0

29 May 2025

PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion

205

0

0

28 May 2025

DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer

Computer Vision and Pattern Recognition (CVPR), 2025

355

1

0

09 May 2025

Learning Streaming Video Representation via Multitask Training

503

3

0

28 Apr 2025

ResNetVLLM -- Multi-modal Vision LLM for the Video Understanding Task

306

1

0

20 Apr 2025
