
MaxViT: Multi-Axis Vision Transformer

European Conference on Computer Vision (ECCV), 2022

4 April 2022

Feng Yang


Papers citing "MaxViT: Multi-Axis Vision Transformer"

50 / 370 papers shown

DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision

166

03 Dec 2025

Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling

175

24 Nov 2025

EVCC: Enhanced Vision Transformer-ConvNeXt-CoAtNet Fusion for Classification

Muhammad Abdullah Adnan

ViT

121

24 Nov 2025

A Spatial Semantics and Continuity Perception Attention for Remote Sensing Water Body Change Detection

145

20 Nov 2025

AdaptViG: Adaptive Vision GNN with Exponential Decay Gating

Mustafa Munir

Md Mostafijur Rahman

R. Marculescu

13 Nov 2025

Hilbert-Guided Sparse Local Attention

Yunge Li

Lanyu Xu

164

08 Nov 2025

MACMD: Multi-dilated Contextual Attention and Channel Mixer Decoding for Medical Image Segmentation

188

08 Nov 2025

Precipitation nowcasting of satellite data using physically-aligned neural networks

192

07 Nov 2025

Attentive Convolution: Unifying the Expressivity of Self-Attention with Convolutional Efficiency

191

23 Oct 2025

Counting Hallucinations in Diffusion Models

341

15 Oct 2025

Q-Router: Agentic Video Quality Assessment with Expert Model Routing and Artifact Localization

265

09 Oct 2025

Universal Neural Architecture Space: Covering ConvNets, Transformers and Everything in Between

Ondřej Týbl

Lukáš Neumann

AI4CE

279

07 Oct 2025

A Comprehensive Review on Artificial Intelligence Empowered Solutions for Enhancing Pedestrian and Cyclist Safety

Muhammad Monjurul Karim

228

30 Sep 2025

Introducing Multimodal Paradigm for Learning Sleep Staging PSG via General-Purpose Model

Jianheng Zhou

128

26 Sep 2025

Frequency-Aware Model Parameter Explorer: A new attribution method for improving explainability

147

25 Sep 2025

Align Where the Words Look: Cross-Attention-Guided Patch Alignment with Contrastive and Transport Regularization for Bengali Captioning

Riad Ahmed Anonto

Sardar Md. Saffat Zabin

M. Saifur Rahman

VLM

188

22 Sep 2025

Multi-Modal Sensing Aided mmWave Beamforming for V2V Communications with Transformers

Muhammad Baqer Mollah

Honggang Wang

Hua Fang

153

14 Sep 2025

CoAtNeXt: An Attention-Enhanced ConvNeXtV2-Transformer Hybrid Model for Gastric Tissue Classification

Mustafa Yurdakul

Şakir Tasdemir

155

11 Sep 2025

Sparse Transformer for Ultra-sparse Sampled Video Compressive Sensing

221

10 Sep 2025

Focus Through Motion: RGB-Event Collaborative Token Sparsification for Efficient Object Detection

155

04 Sep 2025

A Lightweight Convolution and Vision Transformer integrated model with Multi-scale Self-attention Mechanism

196

23 Aug 2025

NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

Krishna Kanth Nakka

Alexandre Alahi

AAML

190

23 Aug 2025

A Fully Transformer Based Multimodal Framework for Explainable Cancer Image Segmentation Using Radiology Reports

117

19 Aug 2025

Subjective and Objective Quality Assessment of Banding Artifacts on Compressed Videos

IEEE Transactions on Image Processing (IEEE TIP), 2025

263

12 Aug 2025

CMAMRNet: A Contextual Mask-Aware Network Enhancing Mural Restoration Through Comprehensive Mask Guidance

264

10 Aug 2025

Prototype-Driven Structure Synergy Network for Remote Sensing Images Segmentation

IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2025

206

06 Aug 2025

Representation Shift: Unifying Token Compression with FlashAttention

248

01 Aug 2025

SwinECAT: A Transformer-based fundus disease classification model with Shifted Window Attention and Efficient Channel Attention

262

29 Jul 2025

EA-ViT: Efficient Adaptation for Elastic Vision Transformer

...

229

25 Jul 2025

GVCCS: A Dataset for Contrail Identification and Tracking on Visible Whole Sky Camera Sequences

Stephania-Denisa Bocu

308

24 Jul 2025

Perceptual Classifiers: Detecting Generative Images using Perceptual Features

Krishna Srikar Durbha

Asvin Kumar Venkataramanan

Rajesh Sureddi

Alan C. Bovik

218

23 Jul 2025

A2Mamba: Attention-augmented State Space Models for Visual Recognition

267

22 Jul 2025

Colorectal Cancer Tumor Grade Segmentation in Digital Histopathology Images: From Giga to Mini Challenge

...

273

07 Jul 2025

Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features

266

26 Jun 2025

Improving Black-Box Generative Attacks via Generator Semantic Consistency

503

23 Jun 2025

Polyline Path Masked Attention for Vision Transformer

389

19 Jun 2025

synth-dacl: Does Synthetic Defect Data Enhance Segmentation Accuracy and Robustness for Real-World Bridge Inspections?

206

17 Jun 2025

Vision Transformers for End-to-End Quark-Gluon Jet Classification from Calorimeter Images

232

17 Jun 2025

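Detecção da Psoríase Utilizando Visão Computacional: Uma Abordagem Comparativa Entre CNNs e Vision Transformers [Psoriasis Detection Using Computer Vision: A Comparative Approach Between CNNs and Vision Transformers]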

218

11 Jun 2025

CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray

...

226

09 Jun 2025

DermaCon-IN: A Multi-concept Annotated Dermatological Image Dataset of Indian Skin Disorders for Clinical AI Research

...

209

06 Jun 2025

Any-Class Presence Likelihood for Robust Multi-Label Classification with Abundant Negative Data

Dumindu Tissera

Omar Awadallah

Muhammad Umair Danish

Ayan Sadhu

Katarina Grolinger

216

06 Jun 2025

Perfecting Depth: Uncertainty-Aware Enhancement of Metric Depth

390

05 Jun 2025

Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Dan Oneaţă

Desmond Elliott

Stella Frank

232

04 Jun 2025

Learning from Noise: Enhancing DNNs for Event-Based Vision through Controlled Noise Injection

M. Kowalczyk

K. Jeziorek

T. Kryjak

288

04 Jun 2025

NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion

329

27 May 2025

GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis

Computer Vision and Pattern Recognition (CVPR), 2025

270

26 May 2025

Joint Depth and Reflectivity Estimation using Single-Photon LiDAR

Hashan K. Weerasooriya

435

19 May 2025

RainPro-8: An Efficient Deep Learning Model to Estimate Rainfall Probabilities Over 8 Hours

Rafael Pablos-Sarabia

Joachim Nyborg

Morten Birk

Jeppe Liborius Sjørup

Anders Lillevang Vesterholt

Ira Assent

BDL AI4Cl

442

15 May 2025

FoldNet: Learning Generalizable Closed-Loop Policy for Garment Folding via Keypoint-Driven Asset and Demonstration Synthesis

Yuxing Chen

Bowen Xiao

Hongan Wang

430

14 May 2025