Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
    ViT
ArXiv (abs) | PDF | HTML | GitHub (14834★)

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 931 papers shown
Evaluating SAM2 for Video Semantic Segmentation
Syed Hesham Syed Ariff
Yun Liu
Guolei Sun
Jing Yang
Henghui Ding
Xue Geng
Xudong Jiang
VLM
155
0
0
01 Dec 2025
When Do Domain-Specific Foundation Models Justify Their Cost? A Systematic Evaluation Across Retinal Imaging Tasks
David Isztl
Tahm Spitznagel
Gabor M. Somfai
Rui Santos
MedIm, VLM
128
0
0
27 Nov 2025
MoLT: Mixture of Layer-Wise Tokens for Efficient Audio-Visual Learning
Kyeongha Rho
Hyeongkeun Lee
Jae-Won Cho
Joon Son Chung
25
0
0
27 Nov 2025
Cross-Contrastive Clustering for Multimodal Attributed Graphs with Dual Graph Filtering
Haoran Zheng
Renchi Yang
Hongtao Wang
Jianliang Xu
104
0
0
25 Nov 2025
ACIT: Attention-Guided Cross-Modal Interaction Transformer for Pedestrian Crossing Intention Prediction
Yuanzhe Li
Steffen Müller
ViT
144
0
0
25 Nov 2025
Glass Surface Detection: Leveraging Reflection Dynamics in Flash/No-flash Imagery
Tao Yan
Hao Huang
Yiwei Lu
Zeyu Wang
Ke Xu
Yinghui Wang
Xiaojun Chang
Rynson W. H. Lau
84
0
0
21 Nov 2025
A Dataset and Baseline for Deep Learning-Based Visual Quality Inspection in Remanufacturing
IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), 2025
Johannes C. Bauer
Paul Geng
Stephan Trattnig
Petr Dokládal
Rüdiger Daub
68
0
0
19 Nov 2025
AdamNX: An Adam improvement algorithm based on a novel exponential decay mechanism for the second-order moment estimate
Meng Zhu
Quan Xiao
Weidong Min
247
0
0
17 Nov 2025
MSLoRA: Multi-Scale Low-Rank Adaptation via Attention Reweighting
Xu Yang
Gady Agam
102
0
0
16 Nov 2025
From Street to Orbit: Training-Free Cross-View Retrieval via Location Semantics and LLM Guidance
Jeongho Min
Dongyoung Kim
J. Lee
186
0
0
12 Nov 2025
WEDepth: Efficient Adaptation of World Knowledge for Monocular Depth Estimation
Gongshu Wang
Zhirui Wang
Kan Yang
MDE, VLM
109
0
0
11 Nov 2025
Hilbert-Guided Block-Sparse Local Attention
Yunge Li
Lanyu Xu
80
0
0
08 Nov 2025
CoMA: Complementary Masking and Hierarchical Dynamic Multi-Window Self-Attention in a Unified Pre-training Framework
Jiaxuan Li
Qing Xu
Xiangjian He
Ziyu Liu
Chang Xing
Zhen Chen
Daokun Zhang
Rong Qu
Chang Wen Chen
88
0
0
08 Nov 2025
Differentiable Hierarchical Visual Tokenization
Marius Aasan
Martine Hjelkrem-Tan
Nico Catalano
Changkyu Choi
Adín Ramirez Rivera
184
0
0
04 Nov 2025
SAFE: A Novel Approach to AI Weather Evaluation through Stratified Assessments of Forecasts over Earth
Nick Masi
Randall Balestriero
97
0
0
30 Oct 2025
Leveraging an Atmospheric Foundational Model for Subregional Sea Surface Temperature Forecasting
Víctor Medina
Giovanny C-Londoño
Javier Sánchez
AI4Cl
428
0
0
29 Oct 2025
Attentive Convolution: Unifying the Expressivity of Self-Attention with Convolutional Efficiency
Hao Yu
H. G. Chen
Yan Jiang
Wei Peng
Zhaodong Sun
Samuel Kaski
Guoying Zhao
133
0
0
23 Oct 2025
Interactive Hypergraph Visual Analytics for Exploring Large and Complex Image Collections
Floris Gisolf
Z. Geradts
M. Worring
79
0
0
22 Oct 2025
CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training
Soroush Tabesh
M. Safaryan
Alexandra Volkova
Dan Alistarh
MQ
191
0
0
21 Oct 2025
Towards Generalist Intelligence in Dentistry: Vision Foundation Models for Oral and Maxillofacial Radiology
Xinrui Huang
Fan Xiao
Dongming He
Anqi Gao
Dandan Li
Xiaofan Zhang
Shaoting Zhang
Xudong Wang
MedIm, LM&MA
189
0
0
16 Oct 2025
MatchAttention: Matching the Relative Positions for High-Resolution Cross-View Matching
Tingman Yan
Tao Liu
Xilian Yang
Qunfei Zhao
Zeyang Xia
3DV
179
0
0
16 Oct 2025
SkyDreamer: Interpretable End-to-End Vision-Based Drone Racing with Model-Based Reinforcement Learning
Aderik Verraest
Stavrow A. Bahnam
Robin Ferede
Guido C. H. E. de Croon
Christophe De Wagter
118
1
0
16 Oct 2025
On the Use of Hierarchical Vision Foundation Models for Low-Cost Human Mesh Recovery and Pose Estimation
Shuhei Tarashima
Yushan Wang
Norio Tagawa
3DH
163
0
0
14 Oct 2025
CuMPerLay: Learning Cubical Multiparameter Persistence Vectorizations
Caner Korkmaz
Brighton Nuwagira
Barış Coşkunuzer
Tolga Birdal
96
3
0
14 Oct 2025
DREAM: A Benchmark Study for Deepfake REalism AssessMent
Bo Peng
Zichuan Wang
Sheng Yu
Xiaochuan Jin
Wei Wang
Jing Dong
EGVM
147
0
0
11 Oct 2025
AUREXA-SE: Audio-Visual Unified Representation Exchange Architecture with Cross-Attention and Squeezeformer for Speech Enhancement
M. Sajid
Deepanshu Gupta
Yash Modi
Sanskriti Jain
Harshith Jai Surya Ganji
A. Rahaman
Harshvardhan Choudhary
Nasir Saleem
Amir Hussain
M. Tanveer
72
0
0
06 Oct 2025
A Comprehensive Review on Artificial Intelligence Empowered Solutions for Enhancing Pedestrian and Cyclist Safety
Shucheng Zhang
Yan Shi
Bingzhang Wang
Yuang Zhang
Muhammad Monjurul Karim
Kehua Chen
Chenxi Liu
Mehrdad Nasri
Yinhai Wang
139
0
0
30 Sep 2025
Unsupervised Detection of Spatiotemporal Anomalies in PMU Data Using Transformer-Based BiGAN
Muhammad Imran Hossain
Jignesh Solanki
S. K. Solanki
52
0
0
30 Sep 2025
Swift: An Autoregressive Consistency Model for Efficient Weather Forecasting
Jason Stock
T. Arcomano
R. Kotamarthi
DiffM
145
3
0
30 Sep 2025
BRIDGE -- Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation
Dingning Liu
Haoyu Guo
Jingyi Zhou
Tong He
OffRL, MDE
272
0
0
29 Sep 2025
Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy
Haijier Chen
Bo Xu
Shoujian Zhang
Haoze Liu
Jiaxuan Lin
Jingrong Wang
LRM
130
1
0
29 Sep 2025
DRIFT-Net: A Spectral--Coupled Neural Operator for PDEs Learning
Jiayi Li
Flora D. Salim
88
0
0
29 Sep 2025
Variable Rate Image Compression via N-Gram Context based Swin-transformer
Priyanka Mudgal
ViT
140
0
0
28 Sep 2025
Beyond Outliers: A Study of Optimizers Under Quantization
Georgios Vlassis
Saleh Ashkboos
Alexandra Volkova
Torsten Hoefler
Dan Alistarh
MQ
148
0
0
27 Sep 2025
HyPSAM: Hybrid Prompt-driven Segment Anything Model for RGB-Thermal Salient Object Detection
Ruichao Hou
Xingyuan Li
Tongwei Ren
Dongming Zhou
Gangshan Wu
Jinde Cao
48
0
0
23 Sep 2025
A Validation Strategy for Deep Learning Models: Evaluating and Enhancing Robustness
Abdul-Rauf Nuhu
Parham Kebria
Vahid Hemmati
Benjamin Lartey
M. N. Mahmoud
A. Homaifar
E. Tunstel
164
0
0
23 Sep 2025
PMRT: A Training Recipe for Fast, 3D High-Resolution Aerodynamic Prediction
Sam Jacob Jacob
Markus Mrosek
C. Othmer
Harald Köstler
DiffM, AI4CE
116
0
0
21 Sep 2025
Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few
Qishuai Wen
Zhiyuan Huang
Chun-Guang Li
MQ
323
0
0
21 Sep 2025
Random Direct Preference Optimization for Radiography Report Generation
Valentin Samokhin
B. Shirokikh
M. Goncharov
Dmitriy Umerenkov
Maksim Bobrin
Ivan Oseledets
Dmitry V. Dylov
Mikhail Belyaev
72
0
0
19 Sep 2025
Sequential Token Merging: Revisiting Hidden States
Yan Wen
Peng Ye
Lin Zhang
Baopu Li
Jiakang Yuan
Yaoxin Yang
Tao Chen
Mamba
124
0
0
19 Sep 2025
CAGE: Continuity-Aware edGE Network Unlocks Robust Floorplan Reconstruction
Yiyi Liu
Chunyang Liu
Bohan Wang
Weiqin Jiao
Bojian Wu
Lubin Fan
Yuwei Chen
Fashuai Li
Biao Xiong
3DV
149
0
0
18 Sep 2025
Region-Aware Deformable Convolutions
Abolfazl Saheban Maleki
Maryam Imani
130
0
0
18 Sep 2025
AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions
Väinö Hatanpää
Eugene Ku
Jason Stock
M. Emani
Sam Foreman
...
Sam Wheeler
Huihuo Zheng
T. Arcomano
V. Vishwanath
R. Kotamarthi
136
1
0
16 Sep 2025
MFAF: An EVA02-Based Multi-scale Frequency Attention Fusion Method for Cross-View Geo-Localization
YiTong Liu
Tianzhu Liu
Yanfeng Gu
108
0
0
16 Sep 2025
LoRA-fine-tuned Large Vision Models for Automated Assessment of Post-SBRT Lung Injury
M. Bolhassani
B. Veasey
E. Daugherty
S. Keltner
N. Kumar
N. Dunlap
A. Amini
36
0
0
15 Sep 2025
CoAtNeXt: An Attention-Enhanced ConvNeXtV2-Transformer Hybrid Model for Gastric Tissue Classification
Mustafa Yurdakul
Şakir Tasdemir
60
0
0
11 Sep 2025
Value bounds and Convergence Analysis for Averages of LRP attributions
Alexander Binder
Nastaran Takmil-Homayouni
Ürün Dogan
FAtt
200
0
0
10 Sep 2025
Learning spatially structured open quantum dynamics with regional-attention transformers
Dounan Du
Eden Figueroa
AI4CE
56
0
0
08 Sep 2025
IGAff: Benchmarking Adversarial Iterative and Genetic Affine Algorithms on Deep Neural Networks
Sebastian-Vasile Echim
Andrei Preda
Dumitru-Clementin Cercel
Florin-Catalin Pop
AAML
104
0
0
08 Sep 2025
Dynamic Group Detection using VLM-augmented Temporal Groupness Graph
Kaname Yokoyama
Chihiro Nakatani
Norimichi Ukita
74
0
0
05 Sep 2025