2210.07503
Cited By
STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
14 October 2022
Dasom Ahn
Sangwon Kim
H. Hong
ByoungChul Ko
ViT
Papers citing
"STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition"
36 / 36 papers shown
Heatmap Pooling Network for Action Recognition from RGB Videos
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Mengyuan Liu
Jinfu Liu
Yongkang Jiang
Bin He
93
0
0
03 Dec 2025
Towards an Effective Action-Region Tracking Framework for Fine-grained Video Action Recognition
Baoli Sun
Y. X. R. Wang
Xinzhu Ma
Zhihui Wang
Kun Lu
Zhiyong Wang
191
0
0
26 Nov 2025
T-MASK: Temporal Masking for Probing Foundation Models across Camera Views in Driver Monitoring
Thinesh Thiyakesan Ponbagavathi
Kunyu Peng
Alina Roitberg
204
0
0
22 Aug 2025
CarGait: Cross-Attention based Re-ranking for Gait recognition
Gavriel Habib
Noa Barzilay
O. Shimshi
Rami Ben-Ari
N. Darshan
CVBM
282
1
0
01 Jul 2025
3D Skeleton-Based Action Recognition: A Review
Mengyuan Liu
Hong Liu
Qianshuo Hu
Bin Ren
Junsong Yuan
Jiaying Lin
Jiajun Wen
245
2
0
01 Jun 2025
ADLGen: Synthesizing Symbolic, Event-Triggered Sensor Sequences for Human Activity Modeling
Weihang You
Hanqi Jiang
Zishuai Liu
Zihang Xie
Tianming Liu
Jin Lu
Fei Dou
207
0
0
23 May 2025
Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized?
Computer Vision and Pattern Recognition (CVPR), 2025
Jianyang Xie
Yitian Zhao
Y. Meng
He Zhao
Anh Nguyen
Yalin Zheng
254
3
0
15 May 2025
LongDiff: Training-Free Long Video Generation in One Go
Computer Vision and Pattern Recognition (CVPR), 2025
Zhuoling Li
Hossein Rahmani
Qiuhong Ke
Jing Liu
DiffM
VGen
VLM
244
5
0
23 Mar 2025
MoFM: A Large-Scale Human Motion Foundation Model
Mohammadreza Baharani
Ghazal Alinezhad Noghre
Armin Danesh Pazho
Gabriel Maldonado
Hamed Tabkhi
AI4CE
1.1K
2
0
08 Feb 2025
LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case
Mohammad Mahdavian
Mohammad Loni
Mo Chen
287
0
0
02 Oct 2024
Pose-Guided Fine-Grained Sign Language Video Generation
European Conference on Computer Vision (ECCV), 2024
Tongkai Shi
Lianyu Hu
Fanhua Shang
Jichao Feng
Peidong Liu
Wei Feng
VGen
SLR
DiffM
336
6
0
25 Sep 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
3DPC
336
4
0
10 Aug 2024
Pose-guided multi-task video transformer for driver action recognition
Ricardo Pizarro
Roberto Valle
L. Bergasa
J. M. Buenaposada
Luis Baumela
ViT
195
1
0
18 Jul 2024
NODER: Image Sequence Regression Based on Neural Ordinary Differential Equations
Hao Bai
Yi Hong
3DH
MedIm
160
6
0
18 Jul 2024
Expressive Keypoints for Skeleton-based Action Recognition via Skeleton Transformation
Yijie Yang
Jinlu Zhang
Jiaxu Zhang
Zhigang Tu
214
11
0
26 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
294
26
0
02 Jun 2024
Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition
Yihang Dong
Xuhang Chen
Yanyan Shen
Michael Kwok-Po Ng
Tao Qian
Shuqiang Wang
169
12
0
28 May 2024
PitcherNet: Powering the Moneyball Evolution in Baseball Video Analytics
Jerrin Bright
Bavesh Balaji
Yuhao Chen
David A Clausi
John S. Zelek
155
3
0
13 May 2024
VG4D: Vision-Language Model Goes 4D Video Recognition
Zhichao Deng
Xiangtai Li
Xia Li
Yunhai Tong
Shen Zhao
Mengyuan Liu
3DPC
202
11
0
17 Apr 2024
HumMUSS: Human Motion Understanding using State Space Models
Arnab Kumar Mondal
Stefano Alletto
Denis Tome
211
8
0
16 Apr 2024
Skeleton-Based Human Action Recognition with Noisy Labels
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Yi Xu
Kunyu Peng
Di Wen
Ruiping Liu
Junwei Zheng
Yufan Chen
Kailai Li
Alina Roitberg
Kailun Yang
Rainer Stiefelhagen
NoLa
209
14
0
15 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
European Conference on Computer Vision (ECCV), 2024
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
219
13
0
14 Mar 2024
Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition
Yuke Li
Guangyi Chen
Ben Abramowitz
Stefano Anzellotti
Donglai Wei
TTA
300
3
0
20 Feb 2024
Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
Lei Wang
Jun Liu
Liang Zheng
Tom Gedeon
Piotr Koniusz
273
18
0
07 Feb 2024
SignVTCL: Multi-Modal Continuous Sign Language Recognition Enhanced by Visual-Textual Contrastive Learning
British Machine Vision Conference (BMVC), 2024
Hao Chen
Jiaze Wang
Ziyu Guo
Jinpeng Li
Donghao Zhou
Bian Wu
Chenyong Guan
Guangyong Chen
Pheng-Ann Heng
260
10
0
22 Jan 2024
Explore Human Parsing Modality for Action Recognition
CAAI Transactions on Intelligence Technology (CAAI-TIT), 2024
Jinfu Liu
Runwei Ding
Yuhang Wen
Nan Dai
Fanyang Meng
Shen Zhao
Mengyuan Liu
200
13
0
04 Jan 2024
Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living
Computer Vision and Pattern Recognition (CVPR), 2023
Dominick Reilly
Srijan Das
ViT
300
27
0
30 Nov 2023
Context-aware Session-based Recommendation with Graph Neural Networks
Zhihui Zhang
Jianxiang Yu
Xiang Li
218
2
0
14 Oct 2023
Position and Orientation-Aware One-Shot Learning for Medical Action Recognition from Signal Data
IEEE Transactions on Multimedia (IEEE TMM), 2023
Leiyu Xie
Yuxing Yang
Zeyu Fu
S. M. Naqvi
ViT
338
4
0
27 Sep 2023
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai-Nguyen Nguyen
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
328
22
0
23 Sep 2023
Unified Contrastive Fusion Transformer for Multimodal Human Action Recognition
Kyoung Ok Yang
Junho Koh
Jun-Won Choi
200
0
0
10 Sep 2023
Multi-stage Factorized Spatio-Temporal Representation for RGB-D Action and Gesture Recognition
ACM Multimedia (ACM MM), 2023
Yujun Ma
Benjia Zhou
Ruili Wang
Pichao Wang
SLR
258
13
0
23 Aug 2023
One-Shot Action Recognition via Multi-Scale Spatial-Temporal Skeleton Matching
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Siyuan Yang
Jun Liu
Shijian Lu
Er Meng Hwa
Alex C. Kot
295
18
0
14 Jul 2023
Multi-Dimensional Refinement Graph Convolutional Network with Robust Decouple Loss for Fine-Grained Skeleton-Based Action Recognition
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Shengyuan Liu
Yuanyuan Ding
Jin-Rong Zhang
Kaiyuan Liu
Sihan Zhang
Feilong Wang
Gao Huang
142
3
0
27 Jun 2023
Towards Continual Egocentric Activity Recognition: A Multi-modal Egocentric Activity Dataset for Continual Learning
IEEE Transactions on Multimedia (IEEE TMM), 2023
Linfeng Xu
Qingbo Wu
Lili Pan
Fanman Meng
Hongliang Li
Chiyuan He
Hanxin Wang
Shaoxu Cheng
Yunshu Dai
EgoV
HAI
165
39
0
26 Jan 2023
Cross-Modal Learning with 3D Deformable Attention for Action Recognition
IEEE International Conference on Computer Vision (ICCV), 2022
Sangwon Kim
Dasom Ahn
ByoungChul Ko
ViT
3DPC
327
44
0
12 Dec 2022