Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
    ViT

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 932 papers shown
Mahalanobis++: Improving OOD Detection via Feature Normalization
Maximilian Mueller
Matthias Hein
OODD
342
7
0
23 May 2025
RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Wenjun Hou
Yi Cheng
Kaishuai Xu
Heng Li
Yan Hu
Wenjie Li
Jiang Liu
437
5
0
20 May 2025
EGFormer: Towards Efficient and Generalizable Multimodal Semantic Segmentation
Zelin Zhang
Tao Zhang
KediLI
Xu Zheng
192
0
0
20 May 2025
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
Computer Vision and Pattern Recognition (CVPR), 2025
Fei Xie
Jiahao Nie
Yujin Tang
W. Zhang
Hongshen Zhao
Mamba
362
0
0
19 May 2025
TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series
Xiaolei Qin
Haiyan Zhao
Jing Zhang
Fengxiang Wang
Xin Su
Bo Du
Liangpei Zhang
AI4TS
375
0
0
13 May 2025
SynID: Passport Synthetic Dataset for Presentation Attack Detection
Juan E. Tapia
Fabian Stockhardt
Lázaro J. González Soler
Christoph Busch
298
3
0
12 May 2025
Adapting a Segmentation Foundation Model for Medical Image Classification
Pengfei Gu
Haoteng Tang
Islam A. Ebeid
Jose Angel Nuñez
Fabian Vazquez
Diego Adame
Marcus Zhan
Huimin Li
Bin Fu
Benlin Liu
MedIm VLM
176
0
0
09 May 2025
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling
Xiao Wang
Jong Youl Choi
Takuya Kurihana
Isaac Lyngaas
Hong-Jun Yoon
...
Dali Wang
Peter Thornton
Prasanna Balaprakash
M. Ashfaq
Dan Lu
266
2
0
07 May 2025
Stow: Robotic Packing of Items into Fabric Pods
Nicolas Hudson
Josh Hooks
Rahul Warrier
Curt Salisbury
Ross Hartley
...
Christine Fuller
Alex Keklak
Alex Frenkel
Lillian J. Ratliff
Aaron Parness
224
1
0
07 May 2025
Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation
Yi Lin
Dong Zhang
X. B. Fang
Yufan Chen
K.-T. Cheng
Hao Chen
220
6
0
06 May 2025
SCOPE-MRI: Bankart Lesion Detection as a Case Study in Data Curation and Deep Learning for Challenging Diagnoses
Sahil Sethi
Sai Reddy
Mansi Sakarvadia
Jordan Serotte
Darlington Nwaudo
Nicholas Maassen
Lewis Shi
215
0
0
29 Apr 2025
Prompt Guiding Multi-Scale Adaptive Sparse Representation-driven Network for Low-Dose CT MAR
Baoshun Shi
Bing Chen
Shaolei Zhang
Huazhu Fu
Zhanli Hu
MedIm
273
0
0
28 Apr 2025
Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Patrick Müller
Alexander Braun
Margret Keuper
278
2
0
25 Apr 2025
High-Quality Cloud-Free Optical Image Synthesis Using Multi-Temporal SAR and Contaminated Optical Data
Chenxi Duan
237
0
0
23 Apr 2025
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
M. Chowdhury
Md Rifat Ur Rahman
Akil Ahmad Taki
225
0
0
19 Apr 2025
BeetleVerse: A Study on Taxonomic Classification of Ground Beetles
S M Rayeed
Alyson East
Samuel Stevens
Sydne Record
Charles V. Stewart
183
2
0
18 Apr 2025
Learning from Noisy Pseudo-labels for All-Weather Land Cover Mapping
Wang Liu
Zhiyu Wang
Xin Guo
Puhong Duan
Xudong Kang
Shutao Li
239
4
0
18 Apr 2025
Towards Scale-Aware Low-Light Enhancement via Structure-Guided Transformer Design
Wei Dong
Yan Min
Han Zhou
Jun Chen
ViT
231
3
0
18 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD VOS
666
107
0
17 Apr 2025
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results
Xin Li
Kun Yuan
B. Li
Fengbin Guan
Yizhen Shao
...
Guohua Zhang
Z. Huang
Y. Deng
Qingmiao Jiang
Lu Chen
318
22
0
17 Apr 2025
SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling
Yasin Almalioglu
Andrzej Kucik
Geoffrey French
Dafni Antotsiou
Alexander Adam
Cedric Archambeau
287
1
0
17 Apr 2025
Plain Transformers Can be Powerful Graph Learners
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
Juil Sock
Mark Coates
340
0
0
17 Apr 2025
Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image
Tao Wen
Jiadong Wang
Yuxiao Chen
Shugong Xu
Fangqiu Yi
Xuelong Li
MDE
336
0
0
16 Apr 2025
Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images
Computer Vision and Pattern Recognition (CVPR), 2025
Jiuchen Chen
Xinyu Yan
Qizhi Xu
Kaiqi Li
VLM
221
3
0
13 Apr 2025
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Silong Yong
Yuhao Wang
Bin Yan
Huchuan Lu
324
5
0
13 Apr 2025
Hyperlocal disaster damage assessment using bi-temporal street-view imagery and pre-trained vision models
Computers, Environment and Urban Systems (CEUS), 2025
Yifan Yang
Lei Zou
Bing Zhou
Daoyang Li
Binbin Lin
J. Abedin
Mingzheng Yang
162
4
0
12 Apr 2025
Mixture of Group Experts for Learning Invariant Representations
Lei Kang
Jia Li
Mi Tian
Hua Huang
MoE
355
0
0
12 Apr 2025
Heart Failure Prediction using Modal Decomposition and Masked Autoencoders for Scarce Echocardiography Databases
Andrés Bell-Navas
M. Villalba-Orero
Enrique Lara Pezzi
J. Garicano-Mena
S. L. Clainche
399
1
0
10 Apr 2025
Audio-visual Event Localization on Portrait Mode Short Videos
Wuyang Liu
Yi Chai
Yongpeng Yan
Yanzhen Ren
306
1
0
09 Apr 2025
A Robust Real-Time Lane Detection Method with Fog-Enhanced Feature Fusion for Foggy Conditions
Ronghui Zhang
Yuhang Ma
Tengfei Li
Ziyu Lin
Yueying Wu
Junzhou Chen
Lin Zhang
Jia Hu
Tony Z. Qiu
Konghui Guo
614
1
0
08 Apr 2025
EMF: Event Meta Formers for Event-based Real-time Traffic Object Detection
Muhammad Ahmed Ullah Khan
Abdul Hannan Khan
Andreas Dengel
236
0
0
05 Apr 2025
Spline-based Transformers
European Conference on Computer Vision (ECCV), 2025
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
362
0
0
03 Apr 2025
Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results
Andrei Dumitriu
Florin Tatui
Florin Miron
Radu Tudor Ionescu
Radu Timofte
386
34
0
03 Apr 2025
FLAMES: A Hybrid Spiking-State Space Model for Adaptive Memory Retention in Event-Based Learning
Biswadeep Chakraborty
Saibal Mukhopadhyay
453
0
0
02 Apr 2025
rPPG-SysDiaGAN: Systolic-Diastolic Feature Localization in rPPG Using Generative Adversarial Network with Multi-Domain Discriminator
Banafsheh Adami
Nima Karimian
231
3
0
01 Apr 2025
GRU-AUNet: A Domain Adaptation Framework for Contactless Fingerprint Presentation Attack Detection
Silicon Valley Cybersecurity Conference (SVCC), 2025
Banafsheh Adami
Nima Karimian
195
2
0
01 Apr 2025
LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification
Xiang Hu
Yuhao Wang
Silong Yong
Huchuan Lu
VLM
424
5
0
31 Mar 2025
Efficient Token Compression for Vision Transformer with Spatial Information Preserved
Junzhu Mao
Yang Shen
Jinyang Guo
Yazhou Yao
Xiansheng Hua
ViT
359
2
0
30 Mar 2025
FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling
Qiusheng Huang
Xiaohui Zhong
Xu Fan
Lei Chen
Hao Li
AI4TS AI4CE
303
0
0
25 Mar 2025
Data-driven Mesoscale Weather Forecasting Combining Swin-Unet and Diffusion Models
Yuta Hirabayashi
Daisuke Matsuoka
DiffM
188
0
0
25 Mar 2025
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
Computer Vision and Pattern Recognition (CVPR), 2025
Jungsoo Lee
Debasmit Das
Munawar Hayat
Sungha Choi
Kyuwoong Hwang
Fatih Porikli
VLM
273
3
0
23 Mar 2025
Fractal-IR: A Unified Framework for Efficient and Scalable Image Restoration
Yawei Li
Bin Ren
Christos Sakaridis
Rakesh Ranjan
Mengyuan Liu
Andrii Zadaianchuk
Ming-Hsuan Yang
Luca Benini
348
0
0
22 Mar 2025
Beyond Accuracy: What Matters in Designing Well-Behaved Image Classification Models?
Robin Hesse
Doğukan Bağcı
Bernt Schiele
Simone Schaub-Meyer
Stefan Roth
VLM
449
0
0
21 Mar 2025
From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning
Computer Vision and Pattern Recognition (CVPR), 2025
Ziang Li
Hongguang Zhang
Juan Wang
Meihui Chen
Hongxin Hu
Wenzhe Yi
Xiaoyang Xu
Mengda Yang
Chenjun Ma
391
3
0
20 Mar 2025
LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding
Amirhossein Kazerouni
Soroush Mehraban
Michael Brudno
Babak Taati
280
3
0
19 Mar 2025
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
Shristi Das Biswas
Efstathia Soufleri
Arani Roy
Kaushik Roy
331
1
0
17 Mar 2025
CLIP-Free, Label-Free, Zero-Shot Concept Bottleneck Models
Fawaz Sammani
Jonas Fischer
Nikos Deligiannis
VLM
221
0
0
14 Mar 2025
MEET: A Million-Scale Dataset for Fine-Grained Geospatial Scene Classification with Zoom-Free Remote Sensing Imagery
IEEE/CAA Journal of Automatica Sinica (IEEE/CAA J. Autom. Sin.), 2025
Yansheng Li
Yuning Wu
Gong Cheng
Chao Tao
Bo Dang
...
Chuxu Zhang
Wenshu Fan
Xianfeng Tang
Jiayi Ma
Yongjun Zhang
204
18
0
14 Mar 2025
HeightFormer: Learning Height Prediction in Voxel Features for Roadside Vision Centric 3D Object Detection via Transformer
Zhang Zhang
Chao Sun
Chao Yue
Da Wen
Yujie Chen
Tianze Wang
Jianghao Leng
ViT
402
2
0
13 Mar 2025
Rethinking Two-Stage Referring-by-Tracking in Referring Multi-Object Tracking: Make it Strong Again
Weize Li
Yunhao Du
Qixiang Yin
Zhicheng Zhao
Fei Su
381
0
0
10 Mar 2025