Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
    ViT

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 933 papers shown
LieRE: Lie Rotational Positional Encodings
Sophie Ostmeier
Brian Axelrod
Michael E. Moseley
Akshay S. Chaudhari
Akshay Chaudhari
C. Langlotz
358
1
0
14 Jun 2024
Depth Anything V2
Lihe Yang
Bingyi Kang
Zilong Huang
Zhen Zhao
Xiaohan Li
Jiashi Feng
Hengshuang Zhao
DiffM, VLM, MDE
368
1,111
0
13 Jun 2024
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
Miaosen Zhang
Yixuan Wei
Zhen Xing
Yifei Ma
Zuxuan Wu
...
Zheng Zhang
Jingdong Sun
Chong Luo
Xin Geng
Baining Guo
VLM
291
2
0
13 Jun 2024
Unveiling Incomplete Modality Brain Tumor Segmentation: Leveraging Masked Predicted Auto-Encoder and Divergence Learning
Zhongao Sun
Jiameng Li
Yuhan Wang
Jiarong Cheng
Qing Zhou
Chun Li
MedIm
262
1
0
12 Jun 2024
ProTrain: Efficient LLM Training via Memory-Aware Techniques
Hanmei Yang
Jin Zhou
Yao Fu
Xiaoqun Wang
Ramine Roane
Hui Guan
Tongping Liu
VLM
235
4
0
12 Jun 2024
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu
Tong Zhang
Sabine Süsstrunk
ViT
391
2
0
12 Jun 2024
A Robust Pipeline for Classification and Detection of Bleeding Frames in Wireless Capsule Endoscopy using Swin Transformer and RT-DETR
Sasidhar Alavala
Anil Kumar Vadde
Aparnamala Kancheti
Subrahmanyam Gorthi
ViT, MedIm
64
2
0
12 Jun 2024
Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
Wenxiao Wang
Weiming Zhuang
Lingjuan Lyu
284
0
0
11 Jun 2024
ReduceFormer: Attention with Tensor Reduction by Summation
John Yang
Le An
Su Inn Park
166
0
0
11 Jun 2024
A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion
Xiaoli Zhang
Liying Wang
Libo Zhao
Xiongfei Li
Siwei Ma
515
1
0
11 Jun 2024
Multiplane Prior Guided Few-Shot Aerial Scene Rendering
Computer Vision and Pattern Recognition (CVPR), 2024
Zihan Gao
Licheng Jiao
Lingling Li
Xu Liu
Fan Liu
Puhua Chen
Yuwei Guo
272
4
0
07 Jun 2024
Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
IEEE Access, 2024
Arvi Jonnarth
Ola Johansson
Jie Zhao
Michael Felsberg
OffRL
435
5
0
07 Jun 2024
PALM: A Efficient Performance Simulator for Tiled Accelerators with Large-scale Model Training
Jiahao Fang
Huizheng Wang
Qize Yang
Dehao Kong
Xu Dai
Jinyi Deng
Yang Hu
Shouyi Yin
201
3
0
06 Jun 2024
OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
Dujian Ding
Bicheng Xu
L. Lakshmanan
VLM
274
3
0
06 Jun 2024
LADI v2: Multi-label Dataset and Classifiers for Low-Altitude Disaster Imagery
Samuel Scheele
Katherine Picchione
Jeffrey Liu
139
1
0
04 Jun 2024
Generative Active Learning for Long-tailed Instance Segmentation
Huanyi Zheng
Chengxiang Fan
Hao Chen
Yongxu Liu
Weian Mao
Xiaogang Xu
Chunhua Shen
196
7
0
04 Jun 2024
GrootVL: Tree Topology is All You Need in State Space Model
Yicheng Xiao
Lin Song
Shaoli Huang
Jiangshan Wang
Siyu Song
Yixiao Ge
Xiu Li
Mingyu Ding
Mamba
237
16
0
04 Jun 2024
Prototypical Transformer as Unified Motion Learners
Cheng Han
Yawen Lu
Guohao Sun
James Liang
Zhiwen Cao
...
S. Dianat
Raghuveer M. Rao
Tong Geng
Zhiqiang Tao
Dongfang Liu
ViT
318
8
0
03 Jun 2024
On the Use of Anchoring for Training Vision Models
V. Narayanaswamy
Kowshik Thopalli
Rushil Anirudh
Yamen Mubarka
W. Sakla
Jayaraman J. Thiagarajan
337
1
0
01 Jun 2024
You Only Need Less Attention at Each Stage in Vision Transformers
Shuoxi Zhang
Hanpeng Liu
Stephen Lin
Kun He
293
16
0
01 Jun 2024
DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration
Nhi Ngoc-Yen Nguyen
Le-Huy Tu
Dieu-Phuong Nguyen
Nhat-Tan Do
Minh Triet Thai
Bao-Thien Nguyen-Tat
MedIm
218
3
0
01 Jun 2024
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
M. Rusanovsky
Or Hirschorn
S. Avidan
229
8
0
01 Jun 2024
YotoR-You Only Transform One Representation
José Ignacio Díaz Villa
P. Loncomilla
Javier Ruiz-del-Solar
ViT
226
1
0
30 May 2024
FocSAM: Delving Deeply into Focused Objects in Segmenting Anything
You Huang
Zongyu Lan
Liujuan Cao
Xianming Lin
Shengchuan Zhang
Guannan Jiang
Rongrong Ji
VLM
219
6
0
29 May 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
235
7
0
28 May 2024
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao
Xinggang Wang
Lianghui Zhu
Qian Zhang
Chang Huang
310
9
0
28 May 2024
On Fairness of Low-Rank Adaptation of Large Models
Zhoujie Ding
Katja Filippova
Pura Peetathawatchai
Berivan Isik
Sanmi Koyejo
210
7
0
27 May 2024
Building Vision Models upon Heat Conduction
Zhaozhi Wang
Yue Liu
Yunfan Liu
Hongtian Yu
Yaowei Wang
QiXiang Ye
ViT, VLM
277
4
0
26 May 2024
ModelLock: Locking Your Model With a Spell
Yifeng Gao
Yuhua Sun
Jiabo He
Zuxuan Wu
Yu-Gang Jiang
VLM
273
3
0
25 May 2024
Free Performance Gain from Mixing Multiple Partially Labeled Samples in Multi-label Image Classification
Chak Fong Chong
Jielong Guo
Xu Yang
Wei Ke
Yapeng Wang
VLM
243
0
0
24 May 2024
ArchesWeather: An efficient AI weather forecasting model at 1.5° resolution
Guillaume Couairon
Christian Lessig
A. Charantonis
C. Monteleoni
243
5
0
23 May 2024
Scalable Visual State Space Model with Fractal Scanning
Lv Tang
Haoke Xiao
Peng-Tao Jiang
Hao Zhang
Jinwei Chen
Yue Liu
Mamba
286
11
0
23 May 2024
Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model
Neural Information Processing Systems (NeurIPS), 2024
Yuheng Shi
Minjing Dong
Chang Xu
Mamba
292
78
0
23 May 2024
Configuring Data Augmentations to Reduce Variance Shift in Positional Embedding of Vision Transformers
AAAI Conference on Artificial Intelligence (AAAI), 2024
Bum Jun Kim
Sang Woo Kim
ViT
196
2
0
23 May 2024
Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference
Ting Liu
Xuyang Liu
Liangtao Shi
Zunnan Xu
Yue Hu
Yi Xin
Quanjun Yin
Bineng Zhong
Donglin Wang
288
16
0
23 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
336
4
0
22 May 2024
Counterfactual Gradients-based Quantification of Prediction Trust in Neural Networks
Mohit Prabhushankar
Ghassan AlRegib
UQCV
252
0
0
22 May 2024
OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models
Zhaojian Yu
Yinghao Wu
Zhuotao Deng
Yansong Tang
Jinqiang Cui
217
6
0
21 May 2024
Feature-based Federated Transfer Learning: Communication Efficiency, Robustness and Privacy
IEEE Transactions on Machine Learning in Communications and Networking (IEEE TMLCN), 2024
Feng Wang
M. C. Gursoy
Senem Velipasalar
252
3
0
15 May 2024
Resource Efficient Perception for Vision Systems
M. I. A V Subramanyam
Niyati Singal
Vinay Kumar Verma
305
0
0
12 May 2024
Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Hongwei Ren
Yue Zhou
Jiadong Zhu
Haotian Fu
Yulong Huang
Xiaopeng Lin
Yuetong Fang
Fei Ma
Hao Yu
Bo-Xun Cheng
Mamba
593
18
0
09 May 2024
Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement
International Conference on Neural Information Processing (ICONIP), 2024
Jiesong Bai
Yuhao Yin
Qiyuan He
Yuanxian Li
Xiaofeng Zhang
Mamba
240
84
0
06 May 2024
Multimodal Sense-Informed Prediction of 3D Human Motions
Computer Vision and Pattern Recognition (CVPR), 2024
Zhenyu Lou
Qiongjie Cui
Haofan Wang
Xu Tang
Hong Zhou
228
12
0
05 May 2024
U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers
Neural Information Processing Systems (NeurIPS), 2024
Yuchuan Tian
Zhijun Tu
Hanting Chen
Jie Hu
Chao Xu
Yunhe Wang
259
37
0
04 May 2024
Guided Conditional Diffusion Classifier (ConDiff) for Enhanced Prediction of Infection in Diabetic Foot Ulcers
Palawat Busaranuvong
Emmanuel O. Agu
Deepak Kumar
Shefalika Gautam
Reza Saadati Fard
B. Tulu
Diane Strong
MedIm
159
2
0
01 May 2024
Analyzing and Exploring Training Recipes for Large-Scale Transformer-Based Weather Prediction
Jared Willard
Peter Harrington
Shashank Subramanian
Ankur Mahesh
Travis A. O'Brien
William D. Collins
AI4TS
278
14
0
30 Apr 2024
Large Language Model Informed Patent Image Retrieval
Hao-Cheng Lo
Jung-Mei Chu
Jieh Hsiang
Chun-Chieh Cho
VLM
240
9
0
30 Apr 2024
Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing
Leonardo Rossi
Vittorio Bernuzzi
Tomaso Fontanini
Massimo Bertozzi
Andrea Prati
191
7
0
29 Apr 2024
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data
Yiyuan Yang
Ming Jin
Haomin Wen
Chaoli Zhang
Yuxuan Liang
...
Cheng-Ming Liu
Bin Yang
Zenglin Xu
Jiang Bian
Shirui Pan
AI4TS, DiffM, SyDa
588
88
0
29 Apr 2024
HIPer: A Human-Inspired Scene Perception Model for Multifunctional Mobile Robots
Florenz Graf
Jochen Lindermayr
Birgit Graf
Werner Kraus
Marco F. Huber
215
3
0
27 Apr 2024