ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2308.12447
  4. Cited By
MOFO: MOtion FOcused Self-Supervision for Video Understanding

MOFO: MOtion FOcused Self-Supervision for Video Understanding

23 August 2023
Mona Ahmadian
Frank Guerin
Andrew Gilbert
ArXivPDFHTML

Papers citing "MOFO: MOtion FOcused Self-Supervision for Video Understanding"

14 / 14 papers shown
Title
FILS: Self-Supervised Video Feature Prediction In Semantic Language
  Space
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
Mona Ahmadian
Frank Guerin
Andrew Gilbert
37
1
0
05 Jun 2024
Bootstrap Masked Visual Modeling via Hard Patches Mining
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
34
5
0
21 Dec 2023
MaskViT: Masked Visual Pre-Training for Video Prediction
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
97
110
0
23 Jun 2022
Cross-Architecture Self-supervised Video Representation Learning
Cross-Architecture Self-supervised Video Representation Learning
Sheng Guo
Zihua Xiong
Yujie Zhong
Limin Wang
Xiaobo Guo
Bing Han
Weilin Huang
SSL
AI4TS
58
24
0
26 May 2022
Guess What Moves: Unsupervised Video and Image Segmentation by
  Anticipating Motion
Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion
Subhabrata Choudhury
Laurynas Karazija
Iro Laina
Andrea Vedaldi
Christian Rupprecht
OCL
VOS
100
39
0
16 May 2022
Omnivore: A Single Model for Many Visual Modalities
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
L. V. D. van der Maaten
Armand Joulin
Ishan Misra
209
222
0
20 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
Transformers in Vision: A Survey
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
225
2,404
0
04 Jan 2021
Self-supervised Co-training for Video Representation Learning
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
198
304
0
19 Oct 2020
Multi-modal Transformer for Video Retrieval
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
401
594
0
21 Jul 2020
Symbiotic Attention with Privileged Information for Egocentric Action
  Recognition
Symbiotic Attention with Privileged Information for Egocentric Action Recognition
Xiaohan Wang
Yu Wu
Linchao Zhu
Yi Yang
22
63
0
08 Feb 2020
UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking
UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking
Jonathon Luiten
Idil Esen Zulfikar
Bastian Leibe
VOS
122
62
0
15 Jan 2020
1