2402.19082
VideoMAC: Video Masked Autoencoders Meet ConvNets
29 February 2024
Gensheng Pei
Tao Chen
XiRuo Jiang
Huafeng Liu
Zeren Sun
Yazhou Yao
VGen
Papers citing
"VideoMAC: Video Masked Autoencoders Meet ConvNets"
12 / 12 papers shown
Title
Efficient Token Compression for Vision Transformer with Spatial Information Preserved
Junzhu Mao
Yang Shen
Jinyang Guo
Yazhou Yao
Xiansheng Hua
ViT
31
0
0
30 Mar 2025
Semi-supervised Semantic Segmentation with Multi-Constraint Consistency Learning
Jianjian Yin
Tao Chen
Gensheng Pei
Yazhou Yao
Liqiang Nie
Xiansheng Hua
47
1
0
23 Mar 2025
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
Gensheng Pei
Tao Chen
Yujia Wang
Xinhao Cai
Xiangbo Shu
Tianfei Zhou
Yazhou Yao
VLM
48
1
0
21 Mar 2025
VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining
Yunze Liu
Peiran Wu
C. Liang
Junxiao Shen
Limin Wang
Li Yi
Mamba
44
0
0
16 Mar 2025
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
59
6
0
13 Aug 2024
Relating CNN-Transformer Fusion Network for Change Detection
Yuhao Gao
Gensheng Pei
Mengmeng Sheng
Zeren Sun
Tao Chen
Yazhou Yao
ViT
18
0
0
03 Jul 2024
Anti-Collapse Loss for Deep Metric Learning Based on Coding Rate Metric
XiRuo Jiang
Yazhou Yao
Xili Dai
Fumin Shen
Xian-Sheng Hua
Heng-Tao Shen
20
0
0
03 Jul 2024
Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation
Tao Chen
XiRuo Jiang
Gensheng Pei
Zeren Sun
Yucheng Wang
Yazhou Yao
29
8
0
03 Jul 2024
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
94
110
0
23 Jun 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021