Video Action Transformer Network

6 December 2018
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
    ViT

Papers citing "Video Action Transformer Network"

50 / 119 papers shown
Title
Automated ARAT Scoring Using Multimodal Video Analysis, Multi-View Fusion, and Hierarchical Bayesian Models: A Clinician Study
Tamim Ahmed
Thanassis Rikakis
24
0
0
03 May 2025
Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment
Keyne Oei
Amr Gomaa
Anna Maria Feit
João Belo
26
0
0
06 Sep 2024
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
63
3
0
20 Jul 2024
MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD
Ioanna Ntinou
Enrique Sanchez
Georgios Tzimiropoulos
34
0
0
11 Jun 2024
Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization
Chanyeon Kim
Jongwoon Park
Hyun-sool Bae
Woo Chang Kim
42
3
0
03 Apr 2024
Modality Mixer Exploiting Complementary Information for Multi-modal Action Recognition
Sumin Lee
Sangmin Woo
Muhammad Adi Nugroho
Changick Kim
25
0
0
21 Nov 2023
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding
Yue Xu
Yong-Lu Li
Zhemin Huang
Michael Xu Liu
Cewu Lu
Yu-Wing Tai
Chi-Keung Tang
EgoV
20
9
0
05 Sep 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shen Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
22
53
0
21 Aug 2023
A Survey on Deep Learning-based Spatio-temporal Action Detection
Peng Wang
Fanwei Zeng
Yu Qian
26
5
0
03 Aug 2023
End-to-End Spatio-Temporal Action Localisation with Video Transformers
A. Gritsenko
Xuehan Xiong
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
Anurag Arnab
ViT
32
13
0
24 Apr 2023
Efficient Video Action Detection with Token Dropout and Context Refinement
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
36
14
0
17 Apr 2023
ChiroDiff: Modelling chirographic data with Diffusion Models
Ayan Das
Yongxin Yang
Timothy M. Hospedales
Tao Xiang
Yi-Zhe Song
DiffM
24
10
0
07 Apr 2023
DOAD: Decoupled One Stage Action Detection Network
Shuning Chang
Pichao Wang
Fan Wang
Jiashi Feng
Mike Zheng Shou
13
4
0
01 Apr 2023
YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection
Jianhua Yang
Kun Dai
ObjD
16
17
0
14 Feb 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
15
25
0
05 Jan 2023
Inductive Attention for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
O. Lanz
31
1
0
17 Dec 2022
MIMO Is All You Need : A Strong Multi-In-Multi-Out Baseline for Video Prediction
Shuliang Ning
Mengcheng Lan
Yanran Li
Chaofeng Chen
Qian Chen
Xunlai Chen
Xiaoguang Han
Shuguang Cui
28
20
0
09 Dec 2022
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
32
16
0
08 Dec 2022
MGFN: Magnitude-Contrastive Glance-and-Focus Network for Weakly-Supervised Video Anomaly Detection
Y. Chen
Zhengzhe Liu
Baoheng Zhang
W. Fok
Xiaojuan Qi
Yik-Chung Wu
10
109
0
28 Nov 2022
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao
Shuming Liu
K. Mangalam
Bernard Ghanem
19
17
0
25 Nov 2022
Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions
Yong-Lu Li
Hongwei Fan
Zuoyu Qiu
Yiming Dou
Liang Xu
...
Peiyang Guo
Haisheng Su
Dongliang Wang
Wei Yu Wu
Cewu Lu
22
7
0
14 Nov 2022
Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data
Huy Hoang Nguyen
Matthew B. Blaschko
S. Saarakkala
A. Tiulpin
MedIm
AI4CE
48
15
0
25 Oct 2022
Exploring Self-Attention for Crop-type Classification Explainability
Ivica Obadic
R. Roscher
Dario Augusto Borges Oliveira
Xiao Xiang Zhu
22
7
0
24 Oct 2022
Holistic Interaction Transformer Network for Action Detection
Gueter Josmy Faure
Min-Hung Chen
S. Lai
33
37
0
23 Oct 2022
Rethinking Learning Approaches for Long-Term Action Anticipation
Megha Nawhal
Akash Abdu Jyothi
Greg Mori
AI4TS
34
26
0
20 Oct 2022
Grounded Video Situation Recognition
Zeeshan Khan
C. V. Jawahar
Makarand Tapaswi
22
13
0
19 Oct 2022
STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition
Dasom Ahn
Sangwon Kim
H. Hong
ByoungChul Ko
ViT
26
96
0
14 Oct 2022
On the Learning Mechanisms in Physical Reasoning
Shiqian Li
Ke Wu
Chi Zhang
Yixin Zhu
AI4CE
44
13
0
05 Oct 2022
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
Erica K. Shimomoto
Edison Marrese-Taylor
Hiroya Takamura
Ichiro Kobayashi
Hideki Nakayama
Yusuke Miyao
21
7
0
26 Sep 2022
Vision Transformers for Action Recognition: A Survey
Anwaar Ulhaq
Naveed Akhtar
Ganna Pogrebna
Ajmal Saeed Mian
ViT
19
44
0
13 Sep 2022
Is an Object-Centric Video Representation Beneficial for Transfer?
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
ViT
31
26
0
20 Jul 2022
Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal
Xuhao Jiang
Weimin Tan
Ri Cheng
Shili Zhou
Bo Yan
ViT
11
6
0
15 Jul 2022
Beyond Transfer Learning: Co-finetuning for Action Localisation
Anurag Arnab
Xuehan Xiong
A. Gritsenko
Rob Romijnders
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
25
8
0
08 Jul 2022
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar
Alaaeldin El-Nouby
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
ViT
27
97
0
16 Jun 2022
Seeing the forest and the tree: Building representations of both individual and collective dynamics with transformers
Ran Liu
Mehdi Azabou
M. Dabagia
Jingyun Xiao
Eva L. Dyer
AI4CE
27
19
0
10 Jun 2022
Do we really need temporal convolutions in action segmentation?
Dazhao Du
Bing-Huang Su
Yu Li
Zhongang Qi
Lingyu Si
Ying Shan
ViT
21
16
0
26 May 2022
VTP: Volumetric Transformer for Multi-view Multi-person 3D Pose Estimation
Yuxing Chen
Renshu Gu
Ouhan Huang
Gangyong Jia
3DH
33
11
0
25 May 2022
Model-agnostic Multi-Domain Learning with Domain-Specific Adapters for Action Recognition
Kazuki Omi
Jun Kimata
Toru Tamaki
21
7
0
15 Apr 2022
A Transformer-Based Contrastive Learning Approach for Few-Shot Sign Language Recognition
Silvan Ferreira
Esdras Costa
M. Dahia
J. Rocha
SLR
9
1
0
05 Apr 2022
Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos
Shao-Wei Liu
Subarna Tripathi
Somdeb Majumdar
Xiaolong Wang
EgoV
22
93
0
04 Apr 2022
Give Me Your Attention: Dot-Product Attention Considered Harmful for Adversarial Patch Robustness
Giulio Lovisotto
Nicole Finnie
Mauricio Muñoz
Chaithanya Kumar Mummadi
J. H. Metzen
AAML
ViT
17
32
0
25 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
F. Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViT
MedIm
22
28
0
24 Mar 2022
Point3D: tracking actions as moving points with 3D CNNs
Shentong Mo
Jingfei Xia
Xiaoqing Ellen Tan
Bhiksha Raj
3DPC
18
5
0
20 Mar 2022
CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity Prediction
Zhuoran Song
Yihong Xu
Zhezhi He
Li Jiang
Naifeng Jing
Xiaoyao Liang
ViT
18
39
0
09 Mar 2022
Parallel Training of GRU Networks with a Multi-Grid Solver for Long Sequences
G. Moon
E. Cyr
20
5
0
07 Mar 2022
Motion-driven Visual Tempo Learning for Video-based Action Recognition
Yuanzhong Liu
Junsong Yuan
Zhigang Tu
19
58
0
24 Feb 2022
When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs
Oana Ignat
Santiago Castro
Yuhang Zhou
Jiajun Bao
Dandan Shan
Rada Mihalcea
18
3
0
16 Feb 2022
ActionFormer: Localizing Moments of Actions with Transformers
Chen-Lin Zhang
Jianxin Wu
Yin Li
ViT
23
328
0
16 Feb 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
103
0
16 Jan 2022
Video Joint Modelling Based on Hierarchical Transformer for Co-summarization
Haopeng Li
Qiuhong Ke
Mingming Gong
Zhang Rui
ViT
26
22
0
27 Dec 2021