VPN: Learning Video-Pose Embedding for Activities of Daily Living

6 July 2020
Srijan Das
Saurav Sharma
Rui Dai
Francois Bremond
Monique Thonnat
    ViT
arXiv:2007.03056

Papers citing "VPN: Learning Video-Pose Embedding for Activities of Daily Living"

50 / 54 papers shown
Heatmap Pooling Network for Action Recognition from RGB Videos
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Mengyuan Liu
Jinfu Liu
Yongkang Jiang
Bin He
163
0
0
03 Dec 2025
Probabilistic Temporal Masked Attention for Cross-view Online Action Detection
IEEE Transactions on Multimedia (TMM), 2025
Liping Xie
Yang Tan
Shicheng Jing
Huimin Lu
Kanjian Zhang
193
2
0
23 Aug 2025
Dual-view Spatio-Temporal Feature Fusion with CNN-Transformer Hybrid Network for Chinese Isolated Sign Language Recognition
Siyuan Jing
G. Wang
Haoyang Zhai
Qin Tao
Jun Yang
Bing Wang
Peng Jin
SLR
253
1
0
08 Jun 2025
Geometric Visual Fusion Graph Neural Networks for Multi-Person Human-Object Interaction Recognition in Videos
Expert Systems with Applications (ESWA), 2025
Tanqiu Qiao
Ruochen Li
Frederick W. B. Li
Yoshiki Kubotani
Shigeo Morishima
Hubert P. H. Shum
354
6
0
03 Jun 2025
Just Dance with $π$! A Poly-modal Inductor for Weakly-supervised Video Anomaly Detection
Computer Vision and Pattern Recognition (CVPR), 2025
Snehashis Majhi
Giacomo D'Amicantonio
A. Dantcheva
Quan Kong
Lorenzo Garattoni
Gianpiero Francesca
Egor Bondarev
Francois Bremond
284
0
0
19 May 2025
Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized?
Computer Vision and Pattern Recognition (CVPR), 2025
Jianyang Xie
Yitian Zhao
Y. Meng
He Zhao
Anh Nguyen
Yalin Zheng
279
5
0
15 May 2025
AM Flow: Adapters for Temporal Processing in Action Recognition
Tanay Agrawal
Abid Ali
A. Dantcheva
François Brémond
312
0
0
04 Nov 2024
LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case
Mohammad Mahdavian
Mohammad Loni
Mo Chen
376
0
0
02 Oct 2024
Pose-Guided Fine-Grained Sign Language Video Generation
European Conference on Computer Vision (ECCV), 2024
Tongkai Shi
Lianyu Hu
Fanhua Shang
Jichao Feng
Peidong Liu
Wei Feng
VGen, SLR, DiffM
361
7
0
25 Sep 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
3DPC
376
9
0
10 Aug 2024
Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition
Jinfu Liu
Chong Chen
Mengyuan Liu
616
31
0
22 Jul 2024
Geometric Features Enhanced Human-Object Interaction Detection
Manli Zhu
Edmond S. L. Ho
Shuang Chen
Longzhi Yang
Hubert P. H. Shum
331
7
0
26 Jun 2024
LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living
Rajatsubhra Chakraborty
Arkaprava Sinha
Dominick Reilly
Manish Kumar Govind
Pu Wang
Francois Bremond
Srijan Das
232
2
0
13 Jun 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
439
39
0
22 May 2024
HDBN: A Novel Hybrid Dual-branch Network for Robust Skeleton-based Action Recognition
Jinfu Liu
Baiqiao Yin
Jiaying Lin
Jiajun Wen
Yue Li
Mengyuan Liu
332
11
0
24 Apr 2024
VG4D: Vision-Language Model Goes 4D Video Recognition
Zhichao Deng
Xiangtai Li
Xia Li
Yunhai Tong
Shen Zhao
Mengyuan Liu
3DPC
241
13
0
17 Apr 2024
On the Utility of 3D Hand Poses for Action Recognition
European Conference on Computer Vision (ECCV), 2024
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
317
15
0
14 Mar 2024
MV2MAE: Multi-View Video Masked Autoencoders
Ketul Shah
Robert Crandall
Jie Xu
Peng Zhou
Marian George
Mayank Bansal
Rama Chellappa
339
8
0
29 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
516
3
0
15 Jan 2024
Explore Human Parsing Modality for Action Recognition
CAAI Transactions on Intelligence Technology (CAAI-TIT), 2024
Jinfu Liu
Runwei Ding
Yuhang Wen
Nan Dai
Fanyang Meng
Shen Zhao
Mengyuan Liu
247
17
0
04 Jan 2024
DVANet: Disentangling View and Action Features for Multi-View Action Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2023
Nyle Siddiqui
Praveen Tirupattur
Mubarak Shah
ViT
283
38
0
10 Dec 2023
Just Add $π$! Pose Induced Video Transformers for Understanding Activities of Daily Living
Computer Vision and Pattern Recognition (CVPR), 2023
Dominick Reilly
Srijan Das
ViT
357
31
0
30 Nov 2023
Modality Mixer Exploiting Complementary Information for Multi-modal Action Recognition
Sumin Lee
Sangmin Woo
Muhammad Adi Nugroho
Changick Kim
286
0
0
21 Nov 2023
ViFiT: Reconstructing Vision Trajectories from IMU and Wi-Fi Fine Time Measurements
Bryan Bo Cao
Abrar Alali
Hansi Liu
Nicholas Meegan
Marco Gruteser
Kristin J. Dana
A. Ashok
Shubham Jain
342
1
0
04 Oct 2023
Unified Contrastive Fusion Transformer for Multimodal Human Action Recognition
Kyoung Ok Yang
Junho Koh
Jun-Won Choi
250
0
0
10 Sep 2023
Vision-Based Human Pose Estimation via Deep Learning: A Survey
IEEE Transactions on Human-Machine Systems (IEEE Trans. Hum.-Mach. Syst.), 2023
Gongjin Lan
Yuehua Wu
Fei Hu
Qi Hao
3DH
334
82
0
26 Aug 2023
Multi-stage Factorized Spatio-Temporal Representation for RGB-D Action and Gesture Recognition
ACM Multimedia (ACM MM), 2023
Yujun Ma
Benjia Zhou
Ruili Wang
Pichao Wang
SLR
294
14
0
23 Aug 2023
Integrating Human Parsing and Pose Network for Human Action Recognition
CAAI International Conference on Artificial Intelligence (ICCAI), 2023
Runwei Ding
Yuhang Wen
Jinfu Liu
Nan Dai
Fanyang Meng
Mengyuan Liu
3DH
254
10
0
16 Jul 2023
Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers
Dominick Reilly
Vasu Sharma
Srijan Das
ViT
300
4
0
15 Jun 2023
Learning by Aligning 2D Skeleton Sequences and Multi-Modality Fusion
European Conference on Computer Vision (ECCV), 2023
Quoc-Huy Tran
Muhammad Ahmed
Murad Popattia
M. Hassan Ahmed
Andrey Konin
M. Zeeshan Zia
AI4TS
802
5
0
31 May 2023
Self-Supervised Video Representation Learning via Latent Time Navigation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Di Yang
Yaohui Wang
Quan Kong
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
SSL, AI4TS
316
17
0
10 May 2023
Temporal-Channel Topology Enhanced Network for Skeleton-Based Action Recognition
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023
Jinzhao Luo
Lu Zhou
Guibo Zhu
Guojing Ge
Beiying Yang
Jinqiao Wang
241
2
0
25 Feb 2023
Understanding Policy and Technical Aspects of AI-Enabled Smart Video Surveillance to Address Public Safety
Computational Urban Science (CUS), 2023
B. R. Ardabili
Armin Danesh Pazho
Ghazal Alinezhad Noghre
Christopher Neff
Sai Datta Bhaskararayuni
Arun K. Ravindran
Shannon Reid
Hamed Tabkhi
336
40
0
08 Feb 2023
Transformers in Action Recognition: A Review on Temporal Modeling
Elham Shabaninia
Hossein Nezamabadi-pour
Fatemeh Shafizadegan
ViT
301
14
0
29 Dec 2022
Cross-Modal Learning with 3D Deformable Attention for Action Recognition
IEEE International Conference on Computer Vision (ICCV), 2022
Sangwon Kim
Dasom Ahn
ByoungChul Ko
ViT, 3DPC
389
47
0
12 Dec 2022
STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Dasom Ahn
Sangwon Kim
H. Hong
ByoungChul Ko
ViT
320
158
0
14 Oct 2022
Modality Mixer for Multi-modal Action Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Sumin Lee
Sangmin Woo
Yeonju Park
Muhammad Adi Nugroho
Changick Kim
207
13
0
24 Aug 2022
ModSelect: Automatic Modality Selection for Synthetic-to-Real Domain Generalization
Zdravko Marinov
Alina Roitberg
David Schneider
Rainer Stiefelhagen
350
6
0
19 Aug 2022
Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Zdravko Marinov
David Schneider
Alina Roitberg
Rainer Stiefelhagen
VGen
282
3
0
03 Aug 2022
Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos
European Conference on Computer Vision (ECCV), 2022
Tanqiu Qiao
Qianhui Men
Frederick W. B. Li
Yoshiki Kubotani
Shigeo Morishima
Hubert P. H. Shum
227
27
0
19 Jul 2022
Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space
Neural Information Processing Systems (NeurIPS), 2022
Jinghuan Shang
Srijan Das
Michael S. Ryoo
398
15
0
23 Jun 2022
Quantification of Occlusion Handling Capability of a 3D Human Pose Estimation Framework
IEEE Transactions on Multimedia (IEEE TMM), 2022
Mehwish Ghafoor
Arif Mahmood
3DH
177
22
0
08 Mar 2022
Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving
Jingxiao Zheng
X. Shi
Alexander N. Gorban
Junhua Mao
Yang Song
...
Visesh Chari
Andre Cornman
Yin Zhou
Congcong Li
Drago Anguelov
3DH
297
65
0
22 Dec 2021
Cross-modal Manifold Cutmix for Self-supervised Video Representation Learning
Srijan Das
Michael S. Ryoo
SSL
334
1
0
07 Dec 2021
ViewCLR: Learning Self-supervised Video Representation for Unseen Viewpoints
Srijan Das
Michael S. Ryoo
SSL
291
34
0
07 Dec 2021
Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition
IEEE Transactions on Multimedia (IEEE Trans. Multimedia), 2021
Liang Xu
Cuiling Lan
Wenjun Zeng
Cewu Lu
297
36
0
28 Oct 2021
Unsupervised View-Invariant Human Posture Representation
Faegheh Sardari
Bjorn Ommer
Majid Mirmehdi
3DH
289
4
0
17 Sep 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
British Machine Vision Conference (BMVC), 2021
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
259
60
0
19 Jul 2021
Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games
Alina Roitberg
David Schneider
Aulia Djamal
C. Seibold
Simon Reiß
Rainer Stiefelhagen
301
36
0
12 Jul 2021
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Srijan Das
Rui Dai
Di Yang
Francois Bremond
ViT
479
90
0
17 May 2021