2112.01514
Cited By
Self-supervised Video Transformer
2 December 2021
Kanchana Ranasinghe
Muzammal Naseer
Salman Khan
F. Khan
Michael S. Ryoo
ViT
Papers citing
"Self-supervised Video Transformer"
50 / 63 papers shown
A Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning
Akash Kumar
Ashlesha Kumar
Vibhav Vineet
Y. S. Rawat
SSL
105
0
0
08 Apr 2025
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Bernard Ghanem
53
0
0
01 Apr 2025
SIGMA: Sinkhorn-Guided Masked Video Modeling
Mohammadreza Salehi
Michael Dorkenwald
Fida Mohammad Thoker
E. Gavves
Cees G. M. Snoek
Yuki M. Asano
47
3
0
22 Jul 2024
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
...
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
59
23
0
28 Jun 2024
Open-Vocabulary Temporal Action Localization using Multimodal Guidance
Akshita Gupta
Aditya Arora
Sanath Narayan
Salman Khan
F. Khan
Graham W. Taylor
21
3
0
21 Jun 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan L. Yuille
Cihang Xie
AI4TS
VGen
SSL
44
1
0
24 May 2024
EchoPT: A Pretrained Transformer Architecture that Predicts 2D In-Air Sonar Images for Mobile Robotics
Jan Steckel
W. Jansen
Nico Huebel
MDE
25
0
0
21 May 2024
A Survey of Generative Techniques for Spatial-Temporal Data Mining
Qianru Zhang
Haixin Wang
Cheng Long
Liangcai Su
Xingwei He
...
Tailin Wu
Hongzhi Yin
S. Yiu
Qi Tian
Christian S. Jensen
AI4TS
49
7
0
15 May 2024
Understanding Video Transformers via Universal Concept Discovery
M. Kowal
Achal Dave
Rares Ambrus
Adrien Gaidon
Konstantinos G. Derpanis
P. Tokmakov
ViT
27
8
0
19 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie M. Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
52
0
0
15 Jan 2024
SVFAP: Self-supervised Video Facial Affect Perceiver
Licai Sun
Zheng Lian
Kexin Wang
Yu He
Ming Xu
Haiyang Sun
Bin Liu
Jianhua Tao
38
14
0
31 Dec 2023
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision
I. Dave
Simon Jenni
Mubarak Shah
25
7
0
20 Dec 2023
REACT: Recognize Every Action Everywhere All At Once
N. V. R. Chappa
Pha Nguyen
P. Dobbs
Khoa Luu
27
6
0
27 Nov 2023
Multi-entity Video Transformers for Fine-Grained Video Representation Learning
Matthew Walmer
Rose Kanjirathinkal
Kai Sheng Tai
Keyur Muzumdar
Taipeng Tian
Abhinav Shrivastava
ViT
11
0
0
17 Nov 2023
CycleCL: Self-supervised Learning for Periodic Videos
Matteo Destro
Michael Gygli
SSL
19
1
0
05 Nov 2023
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders
Srijan Das
Tanmay Jain
Dominick Reilly
P. Balaji
Soumyajit Karmakar
Shyam Marjit
Xiang Li
Abhijit Das
Michael S. Ryoo
22
16
0
31 Oct 2023
Self-Supervised Video Transformers for Isolated Sign Language Recognition
Marcelo Sandoval-Castaneda
Yanhong Li
D. Brentari
Karen Livescu
Gregory Shakhnarovich
SLR
8
1
0
02 Sep 2023
LAC: Latent Action Composition for Skeleton-based Action Segmentation
Di Yang
Yaohui Wang
A. Dantcheva
Quan Kong
Lorenzo Garattoni
Gianpiero Francesca
F. Brémond
27
9
0
28 Aug 2023
Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning
P. Balaji
Abhijit Das
Srijan Das
A. Dantcheva
CVBM
9
4
0
25 Aug 2023
Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations
Mohammadreza Salehi
E. Gavves
Cees G. M. Snoek
Yuki M. Asano
VOS
30
19
0
22 Aug 2023
Language-based Action Concept Spaces Improve Video Self-Supervised Learning
Kanchana Ranasinghe
Michael S. Ryoo
SSL
VLM
18
12
0
20 Jul 2023
Foundation Model for Endoscopy Video Analysis via Large-scale Self-supervised Pre-train
Zhao Wang
Chang Liu
Shaoting Zhang
Qi Dou
MedIm
13
54
0
29 Jun 2023
A Large-Scale Analysis on Self-Supervised Video Representation Learning
Akash Kumar
Ashlesha Kumar
Vibhav Vineet
Y. S. Rawat
SSL
14
3
0
09 Jun 2023
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Qiangchang Wang
Yilong Yin
18
0
0
02 Jun 2023
Modulate Your Spectrum in Self-Supervised Learning
Xi Weng
Yu-Li Ni
Tengwei Song
Jie Luo
Rao Muhammad Anwer
Salman Khan
F. Khan
Lei Huang
32
5
0
26 May 2023
TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale
Ziyun Zeng
Yixiao Ge
Zhan Tong
Xihui Liu
Shutao Xia
Ying Shan
21
9
0
23 May 2023
Self-Supervised Video Representation Learning via Latent Time Navigation
Di Yang
Yaohui Wang
Quan Kong
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
F. Brémond
SSL
AI4TS
33
10
0
10 May 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim
Muzammal Naseer
Salman Khan
F. Khan
M. Shah
VLM
VPVLM
22
73
0
06 Apr 2023
SVT: Supertoken Video Transformer for Efficient Video Understanding
Chen-Ming Pan
Rui Hou
Hanchao Yu
Qifan Wang
Senem Velipasalar
Madian Khabsa
ViT
13
0
0
01 Apr 2023
3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition
Lei Wang
Piotr Koniusz
ViT
21
44
0
25 Mar 2023
Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization
Fida Mohammad Thoker
Hazel Doughty
Cees G. M. Snoek
ViT
30
9
0
20 Mar 2023
SPARTAN: Self-supervised Spatiotemporal Transformers Approach to Group Activity Recognition
N. V. R. Chappa
Pha Nguyen
Alec Nelson
Han-Seok Seo
Xin Li
P. Dobbs
Khoa Luu
ViT
40
15
0
06 Mar 2023
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer
N. H. Phong
B. Ribeiro
19
15
0
17 Feb 2023
Offline-to-Online Knowledge Distillation for Video Instance Segmentation
H. Kim
Seunghun Lee
Sunghoon Im
OffRL
36
3
0
15 Feb 2023
Anatomical Invariance Modeling and Semantic Alignment for Self-supervised Learning in 3D Medical Image Analysis
Yankai Jiang
Ming Sun
Heng Guo
Xiaoyu Bai
K. Yan
Le Lu
Minfeng Xu
MedIm
19
19
0
11 Feb 2023
ResFormer: Scaling ViTs with Multi-Resolution Training
Rui Tian
Zuxuan Wu
Qiuju Dai
Hang-Rui Hu
Yu Qiao
Yu-Gang Jiang
ViT
11
30
0
01 Dec 2022
Spatio-Temporal Crop Aggregation for Video Representation Learning
Sepehr Sameni
Simon Jenni
Paolo Favaro
13
3
0
30 Nov 2022
TransVisDrone: Spatio-Temporal Transformer for Vision-based Drone-to-Drone Detection in Aerial Videos
Tushar Sangam
I. Dave
Waqas Sultani
M. Shah
ViT
AI4TS
14
27
0
16 Oct 2022
How to Train Vision Transformer on Small-scale Datasets?
Hanan Gani
Muzammal Naseer
Mohammad Yaqub
ViT
12
49
0
13 Oct 2022
Masked Motion Encoding for Self-Supervised Video Representation Learning
Xinyu Sun
Peihao Chen
Liang-Chieh Chen
Chan Li
Thomas H. Li
Mingkui Tan
Chuang Gan
27
28
0
12 Oct 2022
It Takes Two: Masked Appearance-Motion Modeling for Self-supervised Video Transformer Pre-training
Yuxin Song
Min Yang
Wenhao Wu
Dongliang He
Fu Li
Jingdong Wang
ViT
92
8
0
11 Oct 2022
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Ziyun Zeng
Yuying Ge
Xihui Liu
Bin Chen
Ping Luo
Shutao Xia
Yixiao Ge
AI4TS
29
8
0
30 Sep 2022
ViT-ReT: Vision and Recurrent Transformer Neural Networks for Human Activity Recognition in Videos
James Wensel
Hayat Ullah
Arslan Munir
ViT
14
41
0
16 Aug 2022
Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework
Hayat Ullah
Arslan Munir
HAI
11
27
0
09 Aug 2022
Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space
Jinghuan Shang
Srijan Das
Michael S. Ryoo
36
13
0
23 Jun 2022
Self-Supervised Learning for Videos: A Survey
Madeline Chantry Schiappa
Y. S. Rawat
M. Shah
SSL
22
130
0
18 Jun 2022
Does Self-supervised Learning Really Improve Reinforcement Learning from Pixels?
Xiang Li
Jinghuan Shang
Srijan Das
Michael S. Ryoo
SSL
17
31
0
10 Jun 2022
SPAct: Self-supervised Privacy Preservation for Action Recognition
I. Dave
C. L. P. Chen
M. Shah
PICV
11
55
0
29 Mar 2022
Self-supervised Video-centralised Transformer for Video Face Clustering
Yujiang Wang
Mingzhi Dong
Jie Shen
Yi-Si Luo
Yiming Lin
Pingchuan Ma
Stavros Petridis
M. Pantic
ViT
10
3
0
24 Mar 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
101
0
16 Jan 2022