Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.01697
Cited By
MaxViT: Multi-Axis Vision Transformer
4 April 2022
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MaxViT: Multi-Axis Vision Transformer"
50 / 88 papers shown
Title
EventDiff: A Unified and Efficient Diffusion Model Framework for Event-based Video Frame Interpolation
Hanle Zheng
Xujie Han
Zegang Peng
Shangbin Zhang
Guangxun Du
Zhuo Zou
X. Wang
Jibin Wu
Hao Guo
Lei Deng
DiffM
VGen
43
0
0
13 May 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency
Qingyuan Wang
Guoxin Wang
B. Cardiff
Deepu John
31
0
0
07 May 2025
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling
Xiao Wang
Jong Youl Choi
Takuya Kurihaya
Isaac Lyngaas
Hong-Jun Yoon
...
Dali Wang
Peter Thornton
Prasanna Balaprakash
M. Ashfaq
Dan Lu
28
0
0
07 May 2025
A Spatially-Aware Multiple Instance Learning Framework for Digital Pathology
H. Keshvarikhojasteh
Mihail Tifrea
Sibylle Hess
J. Pluim
M. Veta
49
0
0
24 Apr 2025
Advanced Deep Learning and Large Language Models: Comprehensive Insights for Cancer Detection
Yassine Habchi
Hamza Kheddar
Yassine Himeur
Adel Belouchrani
Erchin Serpedin
Fouad Khelifi
Muhammad E.H. Chowdhury
LM&MA
41
0
0
30 Mar 2025
CADRef: Robust Out-of-Distribution Detection via Class-Aware Decoupled Relative Feature Leveraging
Zhiwei Ling
Yachen Chang
Hailiang Zhao
Xinkui Zhao
Kingsum Chow
Shuiguang Deng
OODD
58
0
0
01 Mar 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
110
1
0
27 Feb 2025
Max360IQ: Blind Omnidirectional Image Quality Assessment with Multi-axis Attention
Jiebin Yan
Ziwen Tan
Yuming Fang
Jiale Rao
Yifan Zuo
41
1
0
26 Feb 2025
Enhancing Vehicle Make and Model Recognition with 3D Attention Modules
Narges Semiromizadeh
Omid Nejati Manzari
S. B. Shokouhi
S. Mirzakuchaki
ViT
86
0
0
24 Feb 2025
QMaxViT-Unet+: A Query-Based MaxViT-Unet with Edge Enhancement for Scribble-Supervised Segmentation of Medical Images
Thien B. Nguyen-Tat
Hoang-An Vo
Phuoc-Sang Dang
65
0
0
17 Feb 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
67
0
0
26 Jan 2025
Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation
Zhengwen Shen
Yulian Li
Han Zhang
Yuchen Weng
Jun Wang
35
0
0
19 Jan 2025
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu
Meng Lou
Yizhou Yu
112
1
0
16 Dec 2024
Frequency-Adaptive Low-Latency Object Detection Using Events and Frames
Haitian Zhang
Xiangyuan Wang
Chang Xu
Xinya Wang
Fang Xu
Huai Yu
Lei Yu
Wen Yang
ObjD
92
0
0
05 Dec 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
33
0
0
12 Nov 2024
S
4
^4
4
ST: A Strong, Self-transferable, faSt, and Simple Scale Transformation for Transferable Targeted Attack
Yongxiang Liu
Bowen Peng
Li Liu
X. Li
72
0
0
13 Oct 2024
Prithvi WxC: Foundation Model for Weather and Climate
J. Schmude
Sujit Roy
Will Trojak
Johannes Jakubik
Daniel Salles Civitarese
...
Campbell Watson
M. Maskey
Tsengdar J Lee
Juan Bernabé-Moreno
Rahul Ramachandran
VLM
AI4Cl
29
10
0
20 Sep 2024
SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance
Shun Zou
Mingya Zhang
Bingjian Fan
Zhengyi Zhou
Xiuguo Zou
Mamba
24
3
0
17 Sep 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
Wasila M Dahdul
Anuj Karpatne
23
4
0
10 Jul 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
33
56
0
10 Jul 2024
Particle Multi-Axis Transformer for Jet Tagging
Muhammad Usman
M. Shahid
Maheen Ejaz
Ummay Hani
Nayab Fatima
Abdul Rehman Khan
Asifullah Khan
Nasir Majid Mirza
21
3
0
09 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
41
6
0
06 Jun 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
36
5
0
22 May 2024
Dynamic Line Rating using Hyper-local Weather Predictions: A Machine Learning Approach
Henri Manninen
Markus Lippus
Georg Rute
22
0
0
20 May 2024
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Jialong Guo
Xinghao Chen
Yehui Tang
Yunhe Wang
ViT
47
9
0
19 May 2024
Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Guoping Xu
Xiaxia Wang
Xinglong Wu
Xuesong Leng
Yongchao Xu
3DPC
32
8
0
02 May 2024
Efficient Modulation for Vision Networks
Xu Ma
Xiyang Dai
Jianwei Yang
Bin Xiao
Yinpeng Chen
Yun Fu
Lu Yuan
38
17
0
29 Mar 2024
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang
B. Cardiff
Antoine Frappé
Benoît Larras
Deepu John
29
2
0
26 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
23
15
0
18 Mar 2024
Multi-Human Mesh Recovery with Transformers
Zeyu Wang
Zhenzhen Weng
Serena Yeung-Levy
3DH
27
1
0
26 Feb 2024
CAManim: Animating end-to-end network activation maps
Emily Kaczmarek
Olivier X. Miguel
Alexa C. Bowie
R. Ducharme
Alysha L. J. Dingwall-Harvey
S. Hawken
Christine M. Armour
Mark C. Walker
Kevin Dick
HAI
19
1
0
19 Dec 2023
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
19
1
0
18 Dec 2023
A Novel Image Classification Framework Based on Variational Quantum Algorithms
Yixiong Chen
22
3
0
13 Dec 2023
Kandinsky 3.0 Technical Report
V.Ya. Arkhipkin
Andrei Filatov
Viacheslav Vasilev
Anastasia Maltseva
Said Azizov
Igor Pavlov
Julia Agafonova
Andrey Kuznetsov
Denis Dimitrov
DiffM
25
10
0
06 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
18
0
0
01 Dec 2023
PViT-6D: Overclocking Vision Transformers for 6D Pose Estimation with Confidence-Level Prediction and Pose Tokens
Sebastian Stapf
Tobias Bauernfeind
Marco Riboldi
ViT
20
1
0
29 Nov 2023
LEOD: Label-Efficient Object Detection for Event Cameras
Ziyi Wu
Mathias Gehrig
Qing Lyu
Xudong Liu
Igor Gilitschenski
25
12
0
29 Nov 2023
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
V.Ya. Arkhipkin
Zein Shaheen
Viacheslav Vasilev
E. Dakhova
Andrey Kuznetsov
Denis Dimitrov
DiffM
VGen
16
5
0
22 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
31
35
0
30 Oct 2023
SSG2: A new modelling paradigm for semantic segmentation
F. Diakogiannis
S. Furby
P. Caccetta
Xiaoliang Wu
Rodrigo Ibata
O. Hlinka
John Taylor
VLM
22
0
0
12 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
29
4
0
10 Oct 2023
Enhancing Adversarial Attacks: The Similar Target Method
Shuo Zhang
Ziruo Wang
Zikai Zhou
Huanran Chen
AAML
44
1
0
21 Aug 2023
Dual Aggregation Transformer for Image Super-Resolution
Zheng Chen
Yulun Zhang
Jinjin Gu
L. Kong
Xiaokang Yang
F. I. F. Richard Yu
ViT
11
166
0
07 Aug 2023
M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition
Ji-Hee Moon
Junseok K. Lee
Yu-Ling Lee
Seongsik Park
22
4
0
04 Aug 2023
Performance-optimized deep neural networks are evolving into worse models of inferotemporal visual cortex
Drew Linsley
I. F. Rodriguez
Thomas Fel
Michael Arcaro
Saloni Sharma
Margaret Livingstone
Thomas Serre
22
18
0
06 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
31
27
0
01 Jun 2023
MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation
Abdul Rehman Khan
Asifullah Khan
ViT
MedIm
34
14
0
15 May 2023
CAD-RADS scoring of coronary CT angiography with Multi-Axis Vision Transformer: a clinically-inspired deep learning pipeline
Alessia Gerbasi
A. Dagliati
Giuseppe Albi
M. Chiesa
D. Andreini
A. Baggiano
S. Mushtaq
G. Pontone
Riccardo Bellazzi
G. Colombo
MedIm
19
4
0
14 Apr 2023
APPT : Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding
Hengjia Li
Tu Zheng
Zhihao Chi
Zheng Yang
Wenxiao Wang
Boxi Wu
Binbin Lin
Deng Cai
3DPC
30
1
0
31 Mar 2023
Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image Segmentation
Md Mostafijur Rahman
R. Marculescu
MedIm
ViT
19
41
0
29 Mar 2023
1
2
Next