2308.12447
MOFO: MOtion FOcused Self-Supervision for Video Understanding
23 August 2023
Mona Ahmadian
Frank Guerin
Andrew Gilbert
Papers citing
"MOFO: MOtion FOcused Self-Supervision for Video Understanding"
14 / 14 papers shown
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
Mona Ahmadian
Frank Guerin
Andrew Gilbert
37
1
0
05 Jun 2024
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
34
5
0
21 Dec 2023
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
94
110
0
23 Jun 2022
Cross-Architecture Self-supervised Video Representation Learning
Sheng Guo
Zihua Xiong
Yujie Zhong
Limin Wang
Xiaobo Guo
Bing Han
Weilin Huang
SSL
AI4TS
58
24
0
26 May 2022
Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion
Subhabrata Choudhury
Laurynas Karazija
Iro Laina
Andrea Vedaldi
Christian Rupprecht
OCL
VOS
100
39
0
16 May 2022
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
L. V. D. van der Maaten
Armand Joulin
Ishan Misra
209
222
0
20 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
225
2,404
0
04 Jan 2021
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
198
304
0
19 Oct 2020
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
401
594
0
21 Jul 2020
Symbiotic Attention with Privileged Information for Egocentric Action Recognition
Xiaohan Wang
Yu Wu
Linchao Zhu
Yi Yang
22
63
0
08 Feb 2020
UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking
Jonathon Luiten
Idil Esen Zulfikar
Bastian Leibe
VOS
122
62
0
15 Jan 2020