1705.02953
Temporal Segment Networks for Action Recognition in Videos
8 May 2017
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
Papers citing "Temporal Segment Networks for Action Recognition in Videos"
50 / 298 papers shown
Chain-of-Modality: Learning Manipulation Programs from Multimodal Human Videos with Vision-Language-Models
Chen Wang
Fei Xia
Wenhao Yu
Tingnan Zhang
Ruohan Zhang
Ce Liu
Li Fei-Fei
Jie Tan
Jacky Liang
33
0
0
17 Apr 2025
F^3Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Zhaoyu Liu
Kan Jiang
Murong Ma
Zhé Hóu
Yun Lin
J. Dong
37
0
0
11 Apr 2025
CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition
Jongseo Lee
Joohyun Chang
Dongho Lee
Jinwoo Choi
53
0
0
30 Mar 2025
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
Stefan Stojanov
David Wendt
Seungwoo Kim
R. Venkatesh
Kevin T. Feigelis
Jiajun Wu
Daniel L. K. Yamins
SSL
71
0
0
25 Mar 2025
Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition
Siyuan Yang
Shilin Lu
Shizheng Wang
Meng Hwa Er
Zengwei Zheng
Alex C. Kot
42
0
0
21 Mar 2025
CLAD: Constrained Latent Action Diffusion for Vision-Language Procedure Planning
Lei Shi
Andreas Bulling
DiffM
52
1
0
09 Mar 2025
Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information
Edoardo Bianchi
Oswald Lanz
3DH
68
1
0
06 Mar 2025
SHADE-AD: An LLM-Based Framework for Synthesizing Activity Data of Alzheimer's Patients
Heming Fu
Hongkai Chen
Shan Lin
Guoliang Xing
72
1
0
03 Mar 2025
Online Meta-learning for AutoML in Real-time (OnMAR)
Mia Gerber
Anna Sergeevna Bosman
J. D. Villiers
OffRL
41
0
0
27 Feb 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
74
2
0
24 Jan 2025
Video Quality Assessment for Online Processing: From Spatial to Temporal Sampling
Jiebin Yan
Lei Wu
Yuming Fang
Xuelin Liu
Xue Xia
Weide Liu
107
2
0
13 Jan 2025
EdgeOAR: Real-time Online Action Recognition On Edge Devices
Wei Luo
Deyu Zhang
Ying Tang
Fan Wu
Yaoxue Zhang
72
0
0
02 Dec 2024
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
75
0
0
24 Nov 2024
When Spatial meets Temporal in Action Recognition
H. Chen
Lei Wang
Y. Chen
Tom Gedeon
Piotr Koniusz
97
2
0
22 Nov 2024
AM Flow: Adapters for Temporal Processing in Action Recognition
Tanay Agrawal
Abid Ali
A. Dantcheva
François Brémond
39
0
0
04 Nov 2024
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
Xiangyu Zeng
Kunchang Li
Chenting Wang
Xinhao Li
Tianxiang Jiang
...
Zhengrong Yue
Yi Wang
Yali Wang
Yu Qiao
Limin Wang
MLLM
VLM
AI4TS
71
14
0
25 Oct 2024
An Evaluation of Large Pre-Trained Models for Gesture Recognition using Synthetic Videos
Arun V. Reddy
Ketul Shah
Corban Rivera
William Paul
Celso M. De Melo
Rama Chellappa
SLR
36
0
0
03 Oct 2024
LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case
Mohammad Mahdavian
Mohammad Loni
Mo Chen
25
0
0
02 Oct 2024
Data Collection-free Masked Video Modeling
Yuchi Ishikawa
Masayoshi Kondo
Yoshimitsu Aoki
ViT
19
1
0
10 Sep 2024
Enhancing Long Video Understanding via Hierarchical Event-Based Memory
Dingxin Cheng
Mingda Li
Jingyu Liu
Yongxin Guo
Bin Jiang
Qingbin Liu
Xi Chen
Bo Zhao
35
4
0
10 Sep 2024
Text-Guided Video Masked Autoencoder
D. Fan
Jue Wang
Shuai Liao
Zhikang Zhang
Vimal Bhat
Xinyu Li
VGen
30
3
0
01 Aug 2024
Motion Capture from Inertial and Vision Sensors
Xiaodong Chen
Wu Liu
Qian Bao
Xinchen Liu
Quanwei Yang
Ruoli Dai
Tao Mei
50
3
0
23 Jul 2024
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
74
3
0
20 Jul 2024
MMAD: Multi-label Micro-Action Detection in Videos
Kun Li
Pengyu Liu
Guoliang Chen
Zhiliang Wu
Hehe Fan
Meng Wang
42
7
0
07 Jul 2024
TransferAttn: Transferable-guided Attention Is All You Need for Video Domain Adaptation
Andre Sacilotti
Samuel Felipe dos Santos
N. Sebe
Jurandy Almeida
ViT
44
1
0
01 Jul 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
40
7
0
02 Jun 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
34
0
0
28 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan L. Yuille
Cihang Xie
AI4TS
VGen
SSL
56
1
0
24 May 2024
Computer-Vision-Enabled Worker Video Analysis for Motion Amount Quantification
Hari Iyer
Neel Macwan
Shenghan Guo
Heejin Jeong
20
4
0
22 May 2024
BIMM: Brain Inspired Masked Modeling for Video Representation Learning
Zhifan Wan
Jie Zhang
Chang-bo Li
Shiguang Shan
69
0
0
21 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
57
3
0
21 May 2024
Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy
Hoang-Quan Nguyen
Thanh-Dat Truong
Khoa Luu
34
1
0
02 May 2024
Learning text-to-video retrieval from image captioning
Lucas Ventura
Cordelia Schmid
Gül Varol
3DV
44
3
0
26 Apr 2024
CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation
Lianyu Hu
Wei Feng
Liqing Gao
Zekang Liu
Liang Wan
SLR
32
4
0
17 Apr 2024
IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic
Chirag Parikh
Rohit Saluja
C. V. Jawahar
Ravi Kiran Sarvadevabhatla
35
2
0
12 Apr 2024
Guided Masked Self-Distillation Modeling for Distributed Multimedia Sensor Event Analysis
Masahiro Yasuda
Noboru Harada
Yasunori Ohishi
Shoichiro Saito
Akira Nakayama
Nobutaka Ono
36
3
0
12 Apr 2024
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Bo He
Hengduo Li
Young Kyun Jang
Menglin Jia
Xuefei Cao
Ashish Shah
Abhinav Shrivastava
Ser-Nam Lim
MLLM
83
88
0
08 Apr 2024
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal
Mohit Yadav
Roberto Manduchi
31
5
0
08 Apr 2024
LongVLM: Efficient Long Video Understanding via Large Language Models
Yuetian Weng
Mingfei Han
Haoyu He
Xiaojun Chang
Bohan Zhuang
VLM
68
56
0
04 Apr 2024
Language Model Guided Interpretable Video Action Reasoning
Ning Wang
Guangming Zhu
HS Li
Liang Zhang
Syed Afaq Ali Shah
Mohammed Bennamoun
51
3
0
02 Apr 2024
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Hui Lu
Hu Jian
Ronald Poppe
A. A. Salah
39
1
0
24 Mar 2024
VidLA: Video-Language Alignment at Scale
Mamshad Nayeem Rizve
Fan Fei
Jayakrishnan Unnikrishnan
Son Tran
Benjamin Z. Yao
Belinda Zeng
Mubarak Shah
Trishul Chilimbi
VLM
AI4TS
58
4
0
21 Mar 2024
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding
Guo Chen
Yifei Huang
Jilan Xu
Baoqi Pei
Zhe Chen
Zhiqi Li
Jiahao Wang
Kunchang Li
Tong Lu
Limin Wang
Mamba
64
73
0
14 Mar 2024
ActionDiffusion: An Action-aware Diffusion Model for Procedure Planning in Instructional Videos
Lei Shi
Paul-Christian Bürkner
Andreas Bulling
DiffM
VGen
39
4
0
13 Mar 2024
Benchmarking Micro-action Recognition: Dataset, Methods, and Applications
Dan Guo
Kun Li
Bin Hu
Yan Zhang
Meng Wang
57
38
0
08 Mar 2024
Fast Low-parameter Video Activity Localization in Collaborative Learning Environments
Venkatesh Jatla
Sravani Teeparthi
Ugesh Egala
Sylvia Celedón-Pattichis
Marios S. Pattichis
19
2
0
02 Mar 2024
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Jianxiong Li
Jinliang Zheng
Yinan Zheng
Liyuan Mao
Xiaoming Hu
...
Jihao Liu
Yu Liu
Jingjing Liu
Ya-Qin Zhang
Xianyuan Zhan
LM&Ro
OffRL
37
8
0
28 Feb 2024
ViSTec: Video Modeling for Sports Technique Recognition and Tactical Analysis
Yuchen He
Zeqing Yuan
Yihong Wu
Liqi Cheng
Dazhen Deng
Yingcai Wu
40
4
0
25 Feb 2024
What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection
Sourabh Vasant Gothe
Vibhav Agarwal
Sourav Ghosh
Jayesh Rajkumar Vachhani
Pranay Kashyap
Barath Raj Kandur
25
2
0
15 Feb 2024
Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
Lei Wang
Jun Liu
Liang Zheng
Tom Gedeon
Piotr Koniusz
30
9
0
07 Feb 2024