ResearchTrend.AI
Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
    ViT

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 821 papers shown
FIPER: Generalizable Factorized Features for Robust Low-Level Vision Models
Yang-Che Sun
Cheng Yu Yeo
Ernie Chu
Jun-Cheng Chen
Yu-Lun Liu
SupR
28
0
0
23 Oct 2024
LoRA-C: Parameter-Efficient Fine-Tuning of Robust CNN for IoT Devices
Chuntao Ding
Xu Cao
Jianhang Xie
Linlin Fan
Shangguang Wang
Zhichao Lu
24
1
0
22 Oct 2024
Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost
Cheng-Han Yeh
Kuanchun Yu
Chun-Shien Lu
DiffM
AAML
33
0
0
22 Oct 2024
Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation?
Lingao Xiao
Yang He
DD
21
5
0
21 Oct 2024
D-SarcNet: A Dual-stream Deep Learning Framework for Automatic Analysis of Sarcomere Structures in Fluorescently Labeled hiPSC-CMs
Huyen Le
Khiet Dang
N. H. Nguyen
Mai Tran
Hieu Pham
16
0
0
19 Oct 2024
Towards Zero-Shot Camera Trap Image Categorization
Jiří Vyskočil
Lukas Picek
VLM
18
0
0
16 Oct 2024
Transformer based super-resolution downscaling for regional reanalysis: Full domain vs tiling approaches
Antonio Pérez
Mario Santa Cruz
Daniel San Martín
José Manuel Gutiérrez
18
0
0
16 Oct 2024
Hespi: A pipeline for automatically detecting information from hebarium specimen sheets
Robert Turnbull
Emily Fitzgerald
Karen Thompson
Joanne L. Birch
18
0
0
11 Oct 2024
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
Nguyen Huu Bao Long
Chenyu Zhang
Yuzhi Shi
Tsubasa Hirakawa
Takayoshi Yamashita
Tohgoroh Matsui
H. Fujiyoshi
21
2
0
11 Oct 2024
HorGait: A Hybrid Model for Accurate Gait Recognition in LiDAR Point Cloud Planar Projections
Jiaxing Hao
Yanxi Wang
Zhigang Chang
Hongmin Gao
Zihao Cheng
Chen Wu
Xin Zhao
Peiye Fang
Rachmat Muwardi
ViT
21
0
0
11 Oct 2024
When Graph meets Multimodal: Benchmarking and Meditating on Multimodal Attributed Graphs Learning
Hao Yan
C. Li
Zhigang Yu
Jun Yin
Ruochen Liu
Peiyan Zhang
Weihao Han
Mingzheng Li
Zhengxin Zeng
17
0
0
11 Oct 2024
IceDiff: High Resolution and High-Quality Sea Ice Forecasting with Generative Diffusion Prior
Jingyi Xu
Siwei Tu
Weidong Yang
Shuhao Li
Keyi Liu
Yeqi Luo
Lipeng Ma
Ben Fei
Lei Bai
DiffM
AI4Cl
34
1
0
10 Oct 2024
Iterative Optimization Annotation Pipeline and ALSS-YOLO-Seg for Efficient Banana Plantation Segmentation in UAV Imagery
Ang He
Ximei Wu
Xing Xu
Jing Chen
Xiaobin Guo
Sheng Xu
13
0
0
09 Oct 2024
CALoR: Towards Comprehensive Model Inversion Defense
Hongyao Yu
Yixiang Qiu
Hao Fang
Bin Chen
Sijin Yu
Bin Wang
Shu-Tao Xia
Ke Xu
19
1
0
08 Oct 2024
GLRT-Based Metric Learning for Remote Sensing Object Retrieval
Linping Zhang
Yu Liu
Xueqian Wang
Gang Li
You He
20
0
0
08 Oct 2024
Guided Self-attention: Find the Generalized Necessarily Distinct Vectors for Grain Size Grading
Fang Gao
XueTao Li
Jiabao Wang
Shengheng Ma
Jun Yu
15
0
0
08 Oct 2024
MetaDD: Boosting Dataset Distillation with Neural Network Architecture-Invariant Generalization
Yunlong Zhao
Xiaoheng Deng
Xiu Su
Hongyan Xu
Xiuxing Li
Yijing Liu
Shan You
FedML
DD
24
1
0
07 Oct 2024
Deep Nets with Subsampling Layers Unwittingly Discard Useful Activations at Test-Time
Chiao-An Yang
Ziwei Liu
Raymond A. Yeh
20
1
0
01 Oct 2024
CBAM-SwinT-BL: Small Rail Surface Defect Detection Method Based on Swin Transformer with Block Level CBAM Enhancement
Jiayi Zhao
Alison Wun-lam Yeung
Ali Muhammad
Songjiang Lai
Vincent To-Yee NG
14
2
0
30 Sep 2024
Universal Medical Image Representation Learning with Compositional Decoders
Kaini Wang
Ling Yang
Siping Zhou
Guangquan Zhou
Wentao Zhang
Bin Cui
Shuo Li
SSL
MedIm
28
0
0
30 Sep 2024
All-in-One Image Coding for Joint Human-Machine Vision with Multi-Path Aggregation
Xu Zhang
Peiyao Guo
Ming-Tse Lu
Zhan Ma
28
2
0
29 Sep 2024
Exploring Token Pruning in Vision State Space Models
Zheng Zhan
Zhenglun Kong
Yifan Gong
Yushu Wu
Zichong Meng
...
Xuan Shen
Stratis Ioannidis
Wei Niu
Pu Zhao
Yanzhi Wang
20
9
0
27 Sep 2024
Cottention: Linear Transformers With Cosine Attention
Gabriel Mongaras
Trevor Dohm
Eric C. Larson
21
0
0
27 Sep 2024
HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting
Nian Ran
Peng Xiao
Yue Wang
Wesley Shi
Jianxin Lin
Qi Meng
Richard Allmendinger
AI4Cl
37
0
0
27 Sep 2024
MALPOLON: A Framework for Deep Species Distribution Modeling
Théo Larcher
Lukás Picek
Benjamin Deneu
Titouan Lorieul
Maximilien Servajean
Alexis Joly
GP
9
0
0
26 Sep 2024
HydraViT: Stacking Heads for a Scalable ViT
Janek Haberer
A. Hojjat
Olaf Landsiedel
19
0
0
26 Sep 2024
TSCLIP: Robust CLIP Fine-Tuning for Worldwide Cross-Regional Traffic Sign Recognition
Guoyang Zhao
Fulong Ma
Weiqing Qi
Chenguang Zhang
Yuxuan Liu
Ming Liu
Jun Ma
VLM
CLIP
38
3
0
23 Sep 2024
Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection
Yuzhen Lin
Wentang Song
Bin Li
Yuezun Li
Jiangqun Ni
Han Chen
Qiushi Li
23
12
0
22 Sep 2024
Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification
Fatema Jannat
Sina Gholami
Jennifer I. Lim
Theodore Leng
Minhaj Nur Alam
Hamed Tabkhi
23
0
0
17 Sep 2024
InfoDisent: Explainability of Image Classification Models by Information Disentanglement
Łukasz Struski
Dawid Rymarczyk
Jacek Tabor
46
0
0
16 Sep 2024
GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion
Vitor Campagnolo Guizilini
P. Tokmakov
Achal Dave
Rares Ambrus
DiffM
23
2
0
15 Sep 2024
LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation
Qiyuan Wang
Shang Zhao
Zikang Xu
S Kevin Zhou
21
0
0
14 Sep 2024
PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage
Denis Zavadski
Damjan Kalšan
Carsten Rother
DiffM
MDE
20
5
0
13 Sep 2024
Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Ling Xing
Hongyu Qu
Rui Yan
Xiangbo Shu
Jinhui Tang
45
0
0
12 Sep 2024
Inf-MLLM: Efficient Streaming Inference of Multimodal Large Language Models on a Single GPU
Zhenyu Ning
Jieru Zhao
Qihao Jin
Wenchao Ding
Minyi Guo
14
5
0
11 Sep 2024
EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation
Nischal Khanal
Shivanand Venkanna Sheshappanavar
MDE
26
0
0
10 Sep 2024
Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery
Fan Zhang
Lingling Li
Licheng Jiao
Xu Liu
Fang Liu
Shuyuan Yang
B. Hou
ObjD
26
0
0
09 Sep 2024
UNIT: Unifying Image and Text Recognition in One Vision Encoder
Yi Zhu
Yanpeng Zhou
Chunwei Wang
Yang Cao
Jianhua Han
Lu Hou
Hang Xu
ViT
VLM
27
4
0
06 Sep 2024
SDformerFlow: Spatiotemporal swin spikeformer for event-based optical flow estimation
Yi Tian
Juan Andrade-Cetto
27
0
0
06 Sep 2024
iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation
Hayeon Jo
Hyesong Choi
Minhee Cho
Dongbo Min
29
1
0
04 Sep 2024
Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition
Yaozong Gan
Guang Li
Ren Togo
Keisuke Maeda
Takahiro Ogawa
Miki Haseyama
37
0
0
03 Sep 2024
SOOD-ImageNet: a Large-Scale Dataset for Semantic Out-Of-Distribution Image Classification and Semantic Segmentation
Alberto Bacchin
Davide Allegro
Stefano Ghidoni
Emanuele Menegatti
28
1
0
02 Sep 2024
A Simple and Generalist Approach for Panoptic Segmentation
Nedyalko Prisadnikov
Wouter Van Gansbeke
Danda Pani Paudel
Luc Van Gool
VLM
35
0
0
29 Aug 2024
A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships
Gracile Astlin Pereira
Muhammad Hussain
ViT
25
7
0
27 Aug 2024
Sapiens: Foundation for Human Vision Models
Rawal Khirodkar
Timur M. Bagautdinov
Julieta Martinez
Su Zhaoen
Austin James
Peter Selednik
Stuart Anderson
Shunsuke Saito
VLM
31
63
0
22 Aug 2024
HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation
Mingya Zhang
Zhihao Chen
Yiyuan Ge
Xianping Tao
Mamba
50
3
0
21 Aug 2024
MsMemoryGAN: A Multi-scale Memory GAN for Palm-vein Adversarial Purification
Huafeng Qin
Yuming Fu
Huiyan Zhang
M. El-Yacoubi
Xinbo Gao
Qun Song
Jun Wang
GAN
AAML
16
0
0
20 Aug 2024
Flatten: Video Action Recognition is an Image Classification task
Junlin Chen
Chengcheng Xu
Yangfan Xu
Jian Yang
Jun Yu Li
Zhiping Shi
16
1
0
17 Aug 2024
Focus on Focus: Focus-oriented Representation Learning and Multi-view Cross-modal Alignment for Glioma Grading
Li Pan
Yupei Zhang
Qiushi Yang
Tan Li
Xiaohan Xing
Maximus C. F. Yeung
Zhen Chen
25
1
0
16 Aug 2024
5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks
Dongshuo Yin
Leiyi Hu
Bin Li
Youqun Zhang
Xue Yang
24
6
0
15 Aug 2024