ResearchTrend.AI
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
arXiv: 2205.14949

30 May 2022
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian

Papers citing "HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling"

26 / 26 papers shown
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
36
0
0
06 May 2025
Structured-Noise Masked Modeling for Video, Audio and Beyond
Aritra Bhowmik
Fida Mohammad Thoker
Carlos Hinojosa
Bernard Ghanem
Cees G. M. Snoek
VGen
54
0
0
20 Mar 2025
Personalized Large Vision-Language Models
Chau Pham
Hoang Phan
David Doermann
Yunjie Tian
VLM
39
3
0
23 Dec 2024
GG-SSMs: Graph-Generating State Space Models
Nikola Zubić
Davide Scaramuzza
Mamba
74
1
0
17 Dec 2024
Mamba YOLO: SSMs-Based YOLO For Object Detection
Zeyu Wang
Chen Li
Huiying Xu
Xinzhong Zhu
Mamba
34
13
0
09 Jun 2024
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection
Qiang Chen
Xiangbo Su
Xinyu Zhang
Jian Wang
Jiahui Chen
...
Shan Zhang
Kun Yao
Errui Ding
Gang Zhang
Jingdong Wang
ViT
29
5
0
05 Jun 2024
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao
Xinggang Wang
Lianghui Zhu
Qian Zhang
Chang Huang
37
3
0
28 May 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun-Xiong Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
21
13
0
31 Dec 2023
Hierarchical Side-Tuning for Vision Transformers
Weifeng Lin
Ziheng Wu
Wentao Yang
Mingxin Huang
Jun Huang
Lianwen Jin
13
3
0
09 Oct 2023
Spatial Transform Decoupling for Oriented Object Detection
Hongtian Yu
Yunjie Tian
QiXiang Ye
Yunfan Liu
14
26
0
21 Aug 2023
Self-Calibrated Cross Attention Network for Few-Shot Segmentation
Qianxiong Xu
Wenting Zhao
Guosheng Lin
Cheng Long
10
13
0
18 Aug 2023
Diffusion Models as Masked Autoencoders
Chen Wei
K. Mangalam
Po-Yao (Bernie) Huang
Yanghao Li
Haoqi Fan
Hu Xu
Huiyu Wang
Cihang Xie
Alan Yuille
Christoph Feichtenhofer
DiffM
SyDa
18
47
0
06 Apr 2023
CAE v2: Context Autoencoder with CLIP Target
Xinyu Zhang
Jiahui Chen
Junkun Yuan
Qiang Chen
Jian Wang
...
Jimin Pi
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
VLM
CLIP
19
24
0
17 Nov 2022
Rethinking Hierarchies in Pre-trained Plain Vision Transformer
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
9
1
0
03 Nov 2022
SimpleClick: Interactive Image Segmentation with Simple Vision Transformers
Qin Liu
Zhenlin Xu
Gedas Bertasius
Marc Niethammer
14
79
0
20 Oct 2022
A Unified View of Masked Image Modeling
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
VLM
42
35
0
19 Oct 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
26
70
0
30 Jul 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
11
360
0
07 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
7,337
0
11 Nov 2021
UFO-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song
ViT
104
16
0
29 Sep 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
276
1,490
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
262
955
0
27 Jan 2021