2111.09883
Cited By
v1
v2 (latest)
Swin Transformer V2: Scaling Up Capacity and Resolution
18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
Github (14834★)
Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"
50 / 931 papers shown
Title
Not All Splits Are Equal: Rethinking Attribute Generalization Across Unrelated Categories
Liviu Nicolae Fircă
Antonio Bărbălău
Dan Oneata
Elena Burceanu
OOD
VLM
LRM
167
0
0
04 Sep 2025
TinyDrop: Tiny Model Guided Token Dropping for Vision Transformers
Guoxin Wang
Qingyuan Wang
Binhua Huang
Shaowu Chen
Deepu John
VLM
116
0
0
03 Sep 2025
Object Detection with Multimodal Large Vision-Language Models: An In-depth Review
Information Fusion (Inf. Fusion), 2025
Ranjan Sapkota
Manoj Karkee
ObjD
VLM
279
13
0
25 Aug 2025
Expandable Residual Approximation for Knowledge Distillation
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2025
Zhaoyi Yan
Binghui Chen
Yunfan Liu
Qixiang Ye
CLL
113
0
0
22 Aug 2025
Vision encoders should be image size agnostic and task driven
Nedyalko Prisadnikov
Danda Pani Paudel
Yuqian Fu
Luc Van Gool
88
1
0
22 Aug 2025
Automated Multi-label Classification of Eleven Retinal Diseases: A Benchmark of Modern Architectures and a Meta-Ensemble on a Large Synthetic Dataset
Jerry Cao-Xue
Tien Comlekoglu
Keyi Xue
Guanliang Wang
Jiang Li
Gordon Laurie
OOD
96
0
0
21 Aug 2025
Scalable Event-Based Video Streaming for Machines with MoQ
Mile-High Video Conference (MHV), 2025
Andrew C. Freeman
108
1
0
20 Aug 2025
On the notion of missingness for path attribution explainability methods in medical settings: Guiding the selection of medically meaningful baselines
Alexander Geiger
Lars Wagner
Daniel Rueckert
Dirk Wilhelm
A. Jell
OOD
BDL
MedIm
311
0
0
20 Aug 2025
MedFormer: a data-driven model for forecasting the Mediterranean Sea
Italo Epicoco
D. Donno
Gabriele Accarino
Simone Norberti
Alessandro Grandi
...
Silvio Gualdi
Giovanni Aloisio
Simona Masina
Giulio Boccaletti
Antonio Navarra
AI4Cl
MedIm
128
0
0
16 Aug 2025
Privacy-enhancing Sclera Segmentation Benchmarking Competition: SSBC 2025
Matej Vitek
Darian Tomašević
Abhijit Das
Sabari Nathan
Gökhan Özbulak
...
Raghavendra Ramachandra
Aditya Nigam
Umapada Pal
Peter Peer
Vitomir Štruc
144
0
0
14 Aug 2025
UniConvNet: Expanding Effective Receptive Field while Maintaining Asymptotically Gaussian Distribution for ConvNets of Any Scale
Yuhao Wang
Wei Xi
200
1
0
12 Aug 2025
Revisiting Efficient Semantic Segmentation: Learning Offsets for Better Spatial and Class Feature Alignment
Shi-Chen Zhang
Yunheng Li
Yu-Huan Wu
Qibin Hou
Ming-Ming Cheng
SSeg
188
1
0
12 Aug 2025
CoCAViT: Compact Vision Transformer with Robust Global Coordination
Xuyang Wang
Lingjuan Miao
Zhiqiang Zhou
ViT
VLM
104
0
0
07 Aug 2025
Prototype-Driven Structure Synergy Network for Remote Sensing Images Segmentation
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2025
Junyi Wang
Jinjiang Li
Guodong Fan
Yakun Ju
Xiang Fang
Alex C. Kot
135
1
0
06 Aug 2025
SolarSeer: Ultrafast and accurate 24-hour solar irradiance forecasts outperforming numerical weather prediction across the USA
Mingliang Bai
Zuliang Fang
Shengyu Tao
Siqi Xiang
Jiang Bian
...
Kit Thambiratnam
Qi Zhang
Hongbin Sun
Xuan Zhang
Qiuwei Wu
67
1
0
05 Aug 2025
TopoImages: Incorporating Local Topology Encoding into Deep Learning Models for Medical Image Classification
Pengfei Gu
Hongxiao Wang
Yejia Zhang
Huimin Li
Chaoli Wang
Danny Chen
96
2
0
03 Aug 2025
Evading Data Provenance in Deep Neural Networks
Hongyu Zhu
Sichu Liang
Wenwen Wang
Zhuomeng Zhang
Fangqi Li
Shi-Lin Wang
AAML
247
1
0
01 Aug 2025
Detection Transformers Under the Knife: A Neuroscience-Inspired Approach to Ablations
Nils Hütten
Florian Hölken
Hasan Tercan
Tobias Meisen
MedIm
156
0
0
29 Jul 2025
Can Foundation Models Predict Fitness for Duty?
Juan E. Tapia
Christoph Busch
58
0
0
27 Jul 2025
VAMPIRE: Uncovering Vessel Directional and Morphological Information from OCTA Images for Cardiovascular Disease Risk Factor Prediction
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Lehan Wang
Hualiang Wang
Chubin Ou
Lushi Chen
Yunyi Liang
Xiaomeng Li
126
0
0
26 Jul 2025
Iwin Transformer: Hierarchical Vision Transformer using Interleaved Windows
Simin Huo
Ning Li
ViT
228
0
0
24 Jul 2025
DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD
Xianbiao Qi
Marco Chen
Wenjie Xiao
Jiaquan Ye
Yelin He
Chun-Guang Li
Zhouchen Lin
OffRL
129
0
0
23 Jul 2025
CLARIFID: Improving Radiology Report Generation by Reinforcing Clinically Accurate Impressions and Enforcing Detailed Findings
Kyeongkyu Lee
Seonghwan Yoon
Hongki Lim
MedIm
270
0
0
23 Jul 2025
IONext: Unlocking the Next Era of Inertial Odometry
Shanshan Zhang
Qi Zhang
Siyue Wang
Tianshui Wen
Liqin Wu
Ziheng Zhou
Xuemin Hong
Ao Peng
Lingxiang Zheng
Yu Yang
150
0
0
23 Jul 2025
A High Magnifications Histopathology Image Dataset for Oral Squamous Cell Carcinoma Diagnosis and Prognosis
Jinquan Guan
Junhong Guo
Qi Chen
Jian Chen
Y. Cai
Yilin He
Z. Huang
Yan Wang
Yutong Xie
143
0
0
22 Jul 2025
Pixel-Resolved Long-Context Learning for Turbulence at Exascale: Resolving Small-scale Eddies Toward the Viscous Limit
Junqi Yin
Mijanur Palash
M. Paul Laiu
Muralikrishnan Gopalakrishnan Meena
Ravi Tandon
S. D. B. Kops
Feiyi Wang
Ramanan Sankaran
Pei Zhang
113
1
0
22 Jul 2025
DeSamba: Decoupled Spectral Adaptive Framework for 3D Multi-Sequence MRI Lesion Classification
Dezhen Wang
Sheng Miao
Rongxin Chai
Jiufa Cui
Mamba
214
0
0
21 Jul 2025
MedSR-Impact: Transformer-Based Super-Resolution for Lung CT Segmentation, Radiomics, Classification, and Prognosis
M. Martell
K. Linton-Reid
Mitchell Chen
Sumeet Hindocha
Benjamin Hunter
Marco A. Calzado
Richard Lee
J. Posma
E. Aboagye
MedIm
116
0
0
21 Jul 2025
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking
Yuan Yao
Jin Song
Jian Jin
AAML
176
1
0
15 Jul 2025
ThinkingViT: Matryoshka Thinking Vision Transformer for Elastic Inference
A. Hojjat
Janek Haberer
Soren Pirk
Olaf Landsiedel
ViT
LRM
208
2
0
14 Jul 2025
ViFusionTST: Deep Fusion of Time-Series Image Representations from Load Signals for Early Bed-Exit Prediction
Hao Liu
Yu Hu
Rakiba Rayhana
Ling Bai
Zheng Liu
164
0
0
25 Jun 2025
AeroGPT: Leveraging Large-Scale Audio Model for Aero-Engine Bearing Fault Diagnosis
Jiale Liu
Dandan Peng
Huan Wang
Chenyu Liu
Yan-Fu Li
Min Xie
151
0
0
19 Jun 2025
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects
Guohuan Xie
Syed Ariff Syed Hesham
Wenya Guo
Bing Li
Ming-Ming Cheng
Guolei Sun
Yun-Hai Liu
158
1
0
16 Jun 2025
LARGO: Low-Rank Regulated Gradient Projection for Robust Parameter Efficient Fine-Tuning
Haotian Zhang
Liu Liu
Baosheng Yu
Jiayan Qiu
Yanwei Ren
Xianglong Liu
186
0
0
14 Jun 2025
Generalist Models in Medical Image Segmentation: A Survey and Performance Comparison with Task-Specific Approaches
Information Fusion (Inf. Fusion), 2025
Andrea Moglia
Matteo Leccardi
Matteo Cavicchioli
Alice Maccarini
Marco Marcon
Luca Mainardi
Pietro Cerveri
MedIm
LM&MA
VLM
243
1
0
12 Jun 2025
DeepTraverse: A Depth-First Search Inspired Network for Algorithmic Visual Understanding
Bin Guo
John H.L. Hansen
222
1
0
11 Jun 2025
SemanticSplat: Feed-Forward 3D Scene Understanding with Language-Aware Gaussian Fields
Qijing Li
Jingxiang Sun
Liang An
Zhaoqi Su
Hongwen Zhang
Yebin Liu
191
0
0
11 Jun 2025
Canonical Latent Representations in Conditional Diffusion Models
Yitao Xu
Tong Zhang
Ehsan Pajouheshgar
Sabine Süsstrunk
DiffM
242
0
0
11 Jun 2025
MedChat: A Multi-Agent Framework for Multimodal Diagnosis with Large Language Models
Philip R. Liu
Sparsh Bansal
Jimmy Dinh
Aditya Pawar
Ramani Satishkumar
Shail Desai
Neeraj Gupta
X. Wang
S. Hu
LM&MA
176
3
0
09 Jun 2025
Can Foundation Models Generalise the Presentation Attack Detection Capabilities on ID Cards?
Juan E. Tapia
Christoph Busch
209
1
0
05 Jun 2025
Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Dan Oneaţă
Desmond Elliott
Stella Frank
183
2
0
04 Jun 2025
FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution
Qiusheng Huang
Yuan Niu
Xiaohui Zhong
Anboyu Guo
Lei Chen
Dianjun Zhang
Xuefeng Zhang
Hao Li
AI4Cl
208
0
0
03 Jun 2025
RoadFormer: Local-Global Feature Fusion for Road Surface Classification in Autonomous Driving
Tianze Wang
Zhang Zhang
Chao Sun
180
1
0
03 Jun 2025
Learning Sparsity for Effective and Efficient Music Performance Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xingjian Diao
Tianzhen Yang
Chunhui Zhang
Weiyi Wu
Ming Cheng
Jiang Gui
214
6
0
02 Jun 2025
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations
Benjamin Holzschuh
Qiang Liu
Georg Kohl
Nils Thuerey
AI4CE
244
7
0
30 May 2025
Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization
Jiahao Cui
Yan Chen
Mingwang Xu
Hanlin Shang
Yuxuan Chen
Yun Zhan
Zilong Dong
Yao Yao
Jingdong Wang
Siyu Zhu
DiffM
VGen
532
8
0
29 May 2025
FeatInv: Spatially resolved mapping from feature space to input space using conditional diffusion models
Nils Neukirch
Johanna Vielhaben
Nils Strodthoff
DiffM
236
1
0
27 May 2025
The Missing Point in Vision Transformers for Universal Image Segmentation
Sajjad Shahabodini
Mobina Mansoori
Farnoush Bayatmakou
J. Abouei
Konstantinos N. Plataniotis
Arash Mohammadi
ViT
ISeg
288
0
0
26 May 2025
Towards Fully FP8 GEMM LLM Training at Scale
Alejandro Hernández Cano
Dhia Garbaya
Imanol Schlag
Martin Jaggi
MQ
326
2
0
26 May 2025
Asymmetric Duos: Sidekicks Improve Uncertainty
Tim G. Zhou
Evan Shelhamer
Geoff Pleiss
UQCV
429
0
0
24 May 2025