Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition

IEEE Signal Processing Letters (IEEE SPL), 2025
3 January 2025
Mallika Garg
Debashis Ghosh
P. M. Pradhan
    SLR

Papers citing "Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition"

22 / 22 papers shown
MVTN: A Multiscale Video Transformer Network for Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
ViT
330
2
0
05 Sep 2024
A Methodological and Structural Review of Hand Gesture Recognition Across Diverse Data Modalities
IEEE Access (IEEE Access), 2024
Jungpil Shin
Abu Saleh Musa Miah
Md. Humaun Kabir
M. Rahim
Abdullah Al Shiam
285
52
0
10 Aug 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
ViT
360
21
0
18 May 2024
End-to-end Video Gaze Estimation via Capturing Head-face-eye Spatial-temporal Interaction Context
IEEE Signal Processing Letters (IEEE SPL), 2023
Yiran Guan
Zhuoguang Chen
Wenzheng Zeng
Zhiguo Cao
Yang Xiao
CVBM
426
22
0
27 Oct 2023
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
587
882
0
02 Dec 2021
Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition
IEEE Sensors Journal (IEEE Sens. J.), 2021
Dinghao Fan
Hengjie Lu
Shugong Xu
Shan Cao
258
22
0
29 Oct 2021
Multiscale Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
595
1,578
0
22 Apr 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
IEEE International Conference on Computer Vision (ICCV), 2021
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
472
2,043
0
27 Mar 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
1.3K
486
0
01 Feb 2021
Training data-efficient image transformers & distillation through attention
International Conference on Machine Learning (ICML), 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Edouard Grave
ViT
783
8,831
0
23 Dec 2020
Multi-modal Fusion for Single-Stage Continuous Gesture Recognition
Harshala Gammulle
Akila Pemasiri
Sridha Sridharan
Clinton Fookes
SLR
388
40
0
10 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
1.6K
60,084
0
22 Oct 2020
Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition
Zitong Yu
Benjia Zhou
Jun Wan
Pichao Wang
Zhaodong Sun
Xin Liu
Stan Z. Li
Guoying Zhao
3DPC
285
114
0
21 Aug 2020
Res3ATN -- Deep 3D Residual Attention Network for Hand Gesture Recognition in Videos
International Conference on 3D Vision (3DV), 2019
Naina Dhingra
A. Kunz
3DPC
SLR
326
42
0
04 Jan 2020
Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2019
Okan Kopuklu
Ahmet Gunduz
Neslihan Köse
Gerhard Rigoll
546
235
0
29 Jan 2019
Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition with Multimodal Training
Mahdi Abavisani
Hamid Reza Vaezi Joze
Vishal M. Patel
306
157
0
14 Dec 2018
Motion Fused Frames: Data Level Fusion Strategy for Hand Gesture Recognition
Okan Kopuklu
Neslihan Köse
Gerhard Rigoll
334
122
0
19 Apr 2018
Exploiting Recurrent Neural Networks and Leap Motion Controller for Sign Language and Semaphoric Gesture Recognition
D. Avola
Marco Bernardi
Luigi Cinque
G. Foresti
Cristiano Massaroni
SLR
300
164
0
28 Mar 2018
Attention Is All You Need
Neural Information Processing Systems (NeurIPS), 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
8.3K
172,602
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
940
9,315
0
22 May 2017
A robust and efficient video representation for action recognition
Heng Wang
Dan Oneaţă
Jakob Verbeek
Cordelia Schmid
240
338
0
21 Apr 2015
Two-Stream Convolutional Networks for Action Recognition in Videos
Neural Information Processing Systems (NeurIPS), 2014
Karen Simonyan
Andrew Zisserman
1.1K
8,116
0
09 Jun 2014