ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.01697
  4. Cited By
MaxViT: Multi-Axis Vision Transformer

MaxViT: Multi-Axis Vision Transformer

4 April 2022
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
    ViT
ArXivPDFHTML

Papers citing "MaxViT: Multi-Axis Vision Transformer"

38 / 88 papers shown
Title
SwiftFormer: Efficient Additive Attention for Transformer-based
  Real-time Mobile Vision Applications
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming Yang
F. Khan
ViT
35
83
0
27 Mar 2023
Joint Person Identity, Gender and Age Estimation from Hand Images using Deep Multi-Task Representation Learning
Joint Person Identity, Gender and Age Estimation from Hand Images using Deep Multi-Task Representation Learning
N. L. Baisa
CVBM
24
4
0
27 Mar 2023
FastViT: A Fast Hybrid Vision Transformer using Structural
  Reparameterization
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
Pavan Kumar Anasosalu Vasu
J. Gabriel
Jeff J. Zhu
Oncel Tuzel
Anurag Ranjan
ViT
26
149
0
24 Mar 2023
DwinFormer: Dual Window Transformers for End-to-End Monocular Depth
  Estimation
DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation
Md Awsafur Rahman
S. Fattah
ViT
MDE
28
4
0
06 Mar 2023
Efficiency 360: Efficient Vision Transformers
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
19
6
0
16 Feb 2023
Efficient Attention via Control Variates
Efficient Attention via Control Variates
Lin Zheng
Jianbo Yuan
Chong-Jun Wang
Lingpeng Kong
19
18
0
09 Feb 2023
Out of Distribution Performance of State of Art Vision Model
Out of Distribution Performance of State of Art Vision Model
Salman Rahman
W. Lee
18
2
0
25 Jan 2023
Rethinking Vision Transformers for MobileNet Size and Speed
Rethinking Vision Transformers for MobileNet Size and Speed
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
21
157
0
15 Dec 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at
  Scale
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
54
673
0
14 Nov 2022
Rethinking Hierarchies in Pre-trained Plain Vision Transformer
Rethinking Hierarchies in Pre-trained Plain Vision Transformer
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
13
1
0
03 Nov 2022
State-of-the-art Models for Object Detection in Various Fields of
  Application
State-of-the-art Models for Object Detection in Various Fields of Application
S. A. G. Naqvi
Syed Shahnawaz Ali
ObjD
OOD
22
0
0
01 Nov 2022
Contextual Learning in Fourier Complex Field for VHR Remote Sensing
  Images
Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images
Yan Zhang
Xiyuan Gao
Qingyan Duan
Jiaxu Leng
Xiao Pu
Xinbo Gao
ViT
16
1
0
28 Oct 2022
Domain Adaptive Object Detection for Autonomous Driving under Foggy
  Weather
Domain Adaptive Object Detection for Autonomous Driving under Foggy Weather
Jinlong Li
Runsheng Xu
Jin Ma
Q. Zou
Jiaqi Ma
Hongkai Yu
TTA
29
67
0
27 Oct 2022
MetaFormer Baselines for Vision
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
23
156
0
24 Oct 2022
S2WAT: Image Style Transfer via Hierarchical Vision Transformer using
  Strips Window Attention
S2WAT: Image Style Transfer via Hierarchical Vision Transformer using Strips Window Attention
Chi Zhang
Xiaogang Xu
Lei Wang
Zaiyan Dai
Jun Yang
ViT
22
23
0
22 Oct 2022
Centralized Feature Pyramid for Object Detection
Centralized Feature Pyramid for Object Detection
Yu Quan
Dong Zhang
Liyan Zhang
Jinhui Tang
ObjD
19
143
0
05 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision
  Models
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
24
58
0
04 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech
  recognition
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
50
105
0
30 Sep 2022
Dilated Neighborhood Attention Transformer
Dilated Neighborhood Attention Transformer
Ali Hassani
Humphrey Shi
ViT
MedIm
23
67
0
29 Sep 2022
Image as a Foreign Language: BEiT Pretraining for All Vision and
  Vision-Language Tasks
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang
Hangbo Bao
Li Dong
Johan Bjorck
Zhiliang Peng
...
Kriti Aggarwal
O. Mohammed
Saksham Singhal
Subhojit Som
Furu Wei
MLLM
VLM
ViT
25
628
0
22 Aug 2022
Transformer Vs. MLP-Mixer: Exponential Expressive Gap For NLP Problems
Transformer Vs. MLP-Mixer: Exponential Expressive Gap For NLP Problems
D. Navon
A. Bronstein
MoE
31
0
0
17 Aug 2022
Dual Vision Transformer
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
141
75
0
11 Jul 2022
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse
  Transformers
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
Runsheng Xu
Zhengzhong Tu
Hao Xiang
Wei Shao
Bolei Zhou
Jiaqi Ma
28
216
0
05 Jul 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary
  Algorithm
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
28
32
0
19 Jun 2022
Rethinking Generalization in Few-Shot Classification
Rethinking Generalization in Few-Shot Classification
Markus Hiller
Rongkai Ma
Mehrtash Harandi
Tom Drummond
OCL
VLM
17
55
0
15 Jun 2022
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Muning Wen
J. Kuba
Runji Lin
Weinan Zhang
Ying Wen
J. Wang
Yaodong Yang
21
177
0
30 May 2022
Inception Transformer
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
22
187
0
25 May 2022
Pik-Fix: Restoring and Colorizing Old Photos
Pik-Fix: Restoring and Colorizing Old Photos
Runsheng Xu
Zhengzhong Tu
Yuanqi Du
Xiaoyu Dong
Jinlong Li
Zibo Meng
Jiaqi Ma
A. Bovik
Hongkai Yu
27
13
0
04 May 2022
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision
  Transformer
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
Runsheng Xu
Hao Xiang
Zhengzhong Tu
Xin Xia
Ming-Hsuan Yang
Jiaqi Ma
ViT
101
361
0
20 Mar 2022
MUSIQ: Multi-scale Image Quality Transformer
MUSIQ: Multi-scale Image Quality Transformer
Junjie Ke
Qifei Wang
Yilin Wang
P. Milanfar
Feng Yang
154
622
0
12 Aug 2021
MLP-Mixer: An all-MLP Architecture for Vision
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,592
0
04 May 2021
Transformer in Transformer
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,518
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction
  without Convolutions
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,604
0
24 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
220
510
0
11 Feb 2021
Transformers in Vision: A Survey
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
225
2,427
0
04 Jan 2021
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision
  Applications
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,471
0
17 Apr 2017
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
263
10,196
0
16 Nov 2016
Real-Time Single Image and Video Super-Resolution Using an Efficient
  Sub-Pixel Convolutional Neural Network
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
Wenzhe Shi
Jose Caballero
Ferenc Huszár
J. Totz
Andrew P. Aitken
Rob Bishop
Daniel Rueckert
Zehan Wang
SupR
190
5,163
0
16 Sep 2016
Previous
12