Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

2 August 2016

Limin Wang

Yuanjun Xiong

Zhe Wang

Yu Qiao

Luc Van Gool

Papers citing "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition"

50 / 1,449 papers shown

Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

433

18 Apr 2024

O-TALC: Steps Towards Combating Oversegmentation within Online Action Segmentation

190

10 Apr 2024

An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video

European Conference on Artificial Intelligence (ECAI), 2024

268

10 Apr 2024

TIM: A Time Interval Machine for Audio-Visual Action Recognition

Dima Damen

298

08 Apr 2024

SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos

Tao Wu

Runyu He

Gangshan Wu

Limin Wang

3DH

305

06 Apr 2024

Learning Correlation Structures for Vision Transformers

298

05 Apr 2024

LongVLM: Efficient Long Video Understanding via Large Language Models

European Conference on Computer Vision (ECCV), 2024

Yuetian Weng

Mingfei Han

Haoyu He

Xiaojun Chang

Bohan Zhuang

VLM

372

127

04 Apr 2024

TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression

Computer Vision and Pattern Recognition (CVPR), 2024

219

03 Apr 2024

ASTRA: An Action Spotting TRAnsformer for Soccer Videos

352

02 Apr 2024

LoSA: Long-Short-range Adapter for Scaling End-to-End Temporal Action Localization

338

01 Apr 2024

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu

Guozhen Zhang

Jing Tan

Gangshan Wu

Limin Wang

250

31 Mar 2024

LLMs are Good Action Recognizers

Haoxuan Qu

Yujun Cai

Jun Liu

301

31 Mar 2024

Hypergraph-based Multi-View Action Recognition using Event Cameras

321

28 Mar 2024

Emotion Recognition from the perspective of Activity Recognition

Savinay Nagendra

Prapti Panigrahi

144

24 Mar 2024

Enhancing Video Transformers for Action Understanding with VLM-aided Training

225

24 Mar 2024

Convection-Diffusion Equation: A Theoretically Certified Framework for Neural Networks

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

248

23 Mar 2024

Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion

Computer Vision and Pattern Recognition (CVPR), 2024

257

22 Mar 2024

Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition

233

21 Mar 2024

Intention Action Anticipation Model with Guide-Feedback Loop Mechanism

225

19 Mar 2024

Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes

Wei Tang

230

17 Mar 2024

MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations

568

16 Mar 2024

Don't Judge by the Look: Towards Motion Coherent Video Representation

International Conference on Learning Representations (ICLR), 2024

Huan Wang

259

14 Mar 2024

BID: Boundary-Interior Decoding for Unsupervised Temporal Action Localization Pre-Training

182

12 Mar 2024

Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling

W. G. C. Bandara

Vishal M. Patel

VPVLM VLM

260

11 Mar 2024

VideoMamba: State Space Model for Efficient Video Understanding

European Conference on Computer Vision (ECCV), 2024

Yu Qiao

286

398

11 Mar 2024

Density-Guided Label Smoothing for Temporal Localization of Driving Actions

203

11 Mar 2024

Coherent Temporal Synthesis for Incremental Action Segmentation

Computer Vision and Pattern Recognition (CVPR), 2024

Guodong Ding

Hans Golong

Angela Yao

CLL

256

10 Mar 2024

POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-View World

ACM Multimedia (ACM MM), 2023

Boshen Xu

Sipeng Zheng

Qin Jin

192

09 Mar 2024

Learning Expressive And Generalizable Motion Features For Face Forgery Detection

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

254

08 Mar 2024

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

Henghui Ding

348

03 Mar 2024

Efficient Action Counting with Dynamic Queries

355

03 Mar 2024

BEE-NET: A deep neural network to identify in-the-wild Bodily Expression of Emotions

Mohammad Mahdi Dehshibi

David Masip

212

21 Feb 2024

LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs

265

21 Feb 2024

Advancing Human Action Recognition with Foundation Models trained on Unlabeled Public Videos

315

14 Feb 2024

Advancing Video Anomaly Detection: A Concise Review and a New Dataset

Tom Gedeon

292

07 Feb 2024

FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition

International Conference on Learning Representations (ICLR), 2024

253

05 Feb 2024

Knowledge Guided Entity-aware Video Captioning and A Basketball Benchmark

207

25 Jan 2024

GTAutoAct: An Automatic Datasets Generation Framework Based on Game Engine Redevelopment for Action Recognition

273

24 Jan 2024

On the Efficacy of Text-Based Input Modalities for Action Anticipation

Apoorva Beedu

Karan Samel

Irfan Essa

403

23 Jan 2024

Deep Learning for Computer Vision based Activity Recognition and Fall Detection of the Elderly: a Systematic Review

F. X. Gaya-Morey

Cristina Manresa-Yee

Jose Maria Buades Rubio

163

22 Jan 2024

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition

305

22 Jan 2024

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition

AAAI Conference on Artificial Intelligence (AAAI), 2024

Mengmeng Wang

Jun Chen

Guang Dai

Jingdong Wang

Yong-Jin Liu

VLM

208

22 Jan 2024

GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition

432

18 Jan 2024

Multi-view Distillation based on Multi-modal Fusion for Few-shot Action Recognition (CLIP-$\mathrm{M^2}$DF)

206

16 Jan 2024

Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification

Wentao Zhu

277

08 Jan 2024

Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification

Wentao Zhu

165

08 Jan 2024

Video Understanding with Large Language Models: A Survey

...

720

170

29 Dec 2023

Video Recognition in Portrait Mode

Mingfei Han

Linjie Yang

Xiaojie Jin

Jiashi Feng

Xiaojun Chang

Heng Wang

229

21 Dec 2023

InstructVideo: Instructing Video Diffusion Models with Human Feedback

266

19 Dec 2023

Deep Learning Approaches for Seizure Video Analysis: A Review

David Ahmedt-Aristizabal

M. Armin

Zeeshan Hayder

Norberto Garcia-Cairasco

355

18 Dec 2023