Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

2 August 2016
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
    ViT

Papers citing "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition"

50 / 1,449 papers shown
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
433
6
0
18 Apr 2024
O-TALC: Steps Towards Combating Oversegmentation within Online Action Segmentation
Matthew Kent Myers
Nick Wright
A. Mcgough
Nicholas Martin
190
1
0
10 Apr 2024
An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video
European Conference on Artificial Intelligence (ECAI), 2024
Xingyu Song
Zhan Li
Shi Chen
Xin-Qiang Cai
K. Demachi
268
2
0
10 Apr 2024
TIM: A Time Interval Machine for Audio-Visual Action Recognition
Jacob Chalk
Jaesung Huh
Evangelos Kazakos
Andrew Zisserman
Dima Damen
298
27
0
08 Apr 2024
SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos
Tao Wu
Runyu He
Gangshan Wu
Limin Wang
3DH
305
14
0
06 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
298
25
0
05 Apr 2024
LongVLM: Efficient Long Video Understanding via Large Language Models
European Conference on Computer Vision (ECCV), 2024
Yuetian Weng
Mingfei Han
Haoyu He
Xiaojun Chang
Bohan Zhuang
VLM
372
127
0
04 Apr 2024
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression
Computer Vision and Pattern Recognition (CVPR), 2024
Ho-Joong Kim
Jung-Ho Hong
Heejo Kong
Seong-Whan Lee
219
17
0
03 Apr 2024
ASTRA: An Action Spotting TRAnsformer for Soccer Videos
Artur Xarles
Sergio Escalera
T. Moeslund
Albert Clapés
352
15
0
02 Apr 2024
LoSA: Long-Short-range Adapter for Scaling End-to-End Temporal Action Localization
Akshita Gupta
Gaurav Mittal
Ahmed Magooda
Ye Yu
Graham W. Taylor
Mei Chen
338
4
0
01 Apr 2024
Dual DETRs for Multi-Label Temporal Action Detection
Yuhan Zhu
Guozhen Zhang
Jing Tan
Gangshan Wu
Limin Wang
250
22
0
31 Mar 2024
LLMs are Good Action Recognizers
Haoxuan Qu
Yujun Cai
Jun Liu
301
42
0
31 Mar 2024
Hypergraph-based Multi-View Action Recognition using Event Cameras
Yue Gao
Jiaxuan Lu
Siqi Li
Yipeng Li
Shaoyi Du
321
26
0
28 Mar 2024
Emotion Recognition from the perspective of Activity Recognition
Savinay Nagendra
Prapti Panigrahi
144
2
0
24 Mar 2024
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Hui Lu
Hu Jian
Ronald Poppe
A. A. Salah
225
6
0
24 Mar 2024
Convection-Diffusion Equation: A Theoretically Certified Framework for Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Tangjun Wang
Chenglong Bao
Zuoqiang Shi
DiffM
248
3
0
23 Mar 2024
Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion
Computer Vision and Pattern Recognition (CVPR), 2024
S. Casarin
C. Ugwu
Sergio Escalera
Oswald Lanz
257
0
0
22 Mar 2024
Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition
Sumin Lee
Yooseung Wang
Sangmin Woo
Changick Kim
233
2
0
21 Mar 2024
Intention Action Anticipation Model with Guide-Feedback Loop Mechanism
Zongnan Ma
Fuchun Zhang
Zhixiong Nan
Yao Ge
225
5
0
19 Mar 2024
Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes
Kun Xia
Le Wang
Sanping Zhou
Gang Hua
Wei Tang
230
3
0
17 Mar 2024
MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
Hanlei Zhang
Xin Wang
Hua Xu
Qianrui Zhou
Kai Gao
Jianhua Su
Jinyue Zhao
Wenrui Li
Yanting Chen
568
20
0
16 Mar 2024
Don't Judge by the Look: Towards Motion Coherent Video Representation
International Conference on Learning Representations (ICLR), 2024
Yitian Zhang
Yue Bai
Huan Wang
Yizhou Wang
Yun Fu
259
3
0
14 Mar 2024
BID: Boundary-Interior Decoding for Unsupervised Temporal Action Localization Pre-Training
Qihang Fang
Chengcheng Tang
Shugao Ma
Yanchao Yang
182
3
0
12 Mar 2024
Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling
W. G. C. Bandara
Vishal M. Patel
VPVLM
VLM
260
3
0
11 Mar 2024
VideoMamba: State Space Model for Efficient Video Understanding
European Conference on Computer Vision (ECCV), 2024
Kunchang Li
Xinhao Li
Yi Wang
Yinan He
Yali Wang
Limin Wang
Yu Qiao
Mamba
286
398
0
11 Mar 2024
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Tunç Alkanat
Erkut Akdag
Egor Bondarev
Peter H. N. de With
203
5
0
11 Mar 2024
Coherent Temporal Synthesis for Incremental Action Segmentation
Computer Vision and Pattern Recognition (CVPR), 2024
Guodong Ding
Hans Golong
Angela Yao
CLL
256
7
0
10 Mar 2024
POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-View World
ACM Multimedia (ACM MM), 2023
Boshen Xu
Sipeng Zheng
Qin Jin
192
14
0
09 Mar 2024
Learning Expressive And Generalizable Motion Features For Face Forgery Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jingyi Zhang
Peng Zhang
Jingjing Wang
Di Xie
Shiliang Pu
254
3
0
08 Mar 2024
Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition
Kun-Yu Lin
Henghui Ding
Jiaming Zhou
Yu-Ming Tang
Yi-Xing Peng
Zhilin Zhao
Chen Change Loy
Wei-Shi Zheng
VLM
348
21
0
03 Mar 2024
Efficient Action Counting with Dynamic Queries
Zishi Li
Xiaoxuan Ma
Qiuyan Shang
Wentao Zhu
Hai Ci
Yu Qiao
Yizhou Wang
355
4
0
03 Mar 2024
BEE-NET: A deep neural network to identify in-the-wild Bodily Expression of Emotions
Mohammad Mahdi Dehshibi
David Masip
212
2
0
21 Feb 2024
LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs
Yunxin Li
Xinyu Chen
Baotian Hu
Min Zhang
265
4
0
21 Feb 2024
Advancing Human Action Recognition with Foundation Models trained on Unlabeled Public Videos
Yang Qian
Yinan Sun
A. Kargarandehkordi
Parnian Azizian
O. Mutlu
Saimourya Surabhi
Pingyi Chen
Zain Jabbar
Dennis Paul Wall
Peter Washington
OffRL
315
5
0
14 Feb 2024
Advancing Video Anomaly Detection: A Concise Review and a New Dataset
Liyun Zhu
Lei Wang
Arjun Raj
Tom Gedeon
Chen Chen
292
41
0
07 Feb 2024
FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition
International Conference on Learning Representations (ICLR), 2024
Xiaohui Huang
Hao Zhou
Kun Yao
Kai Han
VLM
253
48
0
05 Feb 2024
Knowledge Guided Entity-aware Video Captioning and A Basketball Benchmark
Zeyu Xi
Ge Shi
Xuefen Li
Junchi Yan
Zun Li
Lifang Wu
Zilin Liu
Liang Wang
207
1
0
25 Jan 2024
GTAutoAct: An Automatic Datasets Generation Framework Based on Game Engine Redevelopment for Action Recognition
Xingyu Song
Zhan Li
Shi Chen
K. Demachi
273
1
0
24 Jan 2024
On the Efficacy of Text-Based Input Modalities for Action Anticipation
Apoorva Beedu
Karan Samel
Irfan Essa
403
4
0
23 Jan 2024
Deep Learning for Computer Vision based Activity Recognition and Fall Detection of the Elderly: a Systematic Review
F. X. Gaya-Morey
Cristina Manresa-Yee
Jose Maria Buades Rubio
163
49
0
22 Jan 2024
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
Jiaming Zhou
Junwei Liang
Kun-Yu Lin
Jinrui Yang
Wei-Shi Zheng
VLM
305
13
0
22 Jan 2024
M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2024
Mengmeng Wang
Jiazheng Xing
Boyuan Jiang
Jun Chen
Jianbiao Mei
Xingxing Zuo
Guang Dai
Jingdong Wang
Yong-Jin Liu
VLM
208
8
0
22 Jan 2024
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
Guangzhao Dai
Xiangbo Shu
Wenhao Wu
Rui Yan
Jiachao Zhang
VLM
432
9
0
18 Jan 2024
Multi-view Distillation based on Multi-modal Fusion for Few-shot Action Recognition(CLIP-$\mathrm{M^2}$DF)
Fei-Yu Guo
YiKang Wang
Han Qi
WenPing Jin
Li Zhu
206
3
0
16 Jan 2024
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
277
7
0
08 Jan 2024
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
165
5
0
08 Jan 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Chenliang Xu
Jiebo Luo
VLM
720
170
0
29 Dec 2023
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
229
6
0
21 Dec 2023
InstructVideo: Instructing Video Diffusion Models with Human Feedback
Hangjie Yuan
Shiwei Zhang
Xiang Wang
Yujie Wei
Tao Feng
Yining Pan
Yingya Zhang
Ziwei Liu
Samuel Albanie
Dong Ni
VGen
266
80
0
19 Dec 2023
Deep Learning Approaches for Seizure Video Analysis: A Review
David Ahmedt-Aristizabal
M. Armin
Zeeshan Hayder
Norberto Garcia-Cairasco
Lars Petersson
Clinton Fookes
Akila Pemasiri
A. McGonigal
355
37
0
18 Dec 2023