Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
    ViT

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 821 papers shown
Title
Heavy Labels Out! Dataset Distillation with Label Space Lightening
Ruonan Yu
Songhua Liu
Zigeng Chen
Jingwen Ye
Xinchao Wang
DD
24
1
0
15 Aug 2024
GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution
Yuzhen Li
Zehang Deng
Yuxin Cao
Lihua Liu
16
1
0
14 Aug 2024
Advanced Vision Transformers and Open-Set Learning for Robust Mosquito Classification: A Novel Approach to Entomological Studies
Ahmed Akib Jawad Karim
Muhammad Zawad Mahmud
Riasat Khan
16
0
0
12 Aug 2024
MetMamba: Regional Weather Forecasting with Spatial-Temporal Mamba Model
Haoyu Qin
Yungang Chen
Qianchuan Jiang
Pengchao Sun
Xiancai Ye
Chao Lin
Mamba
AI4CE
23
1
0
12 Aug 2024
Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes
Ke Zhou
Zhongwei Qiu
Dongmei Fu
VLM
27
1
0
12 Aug 2024
Enhancing 3D Transformer Segmentation Model for Medical Image with Token-level Representation Learning
Xinrong Hu
Dewen Zeng
Yawen Wu
Xueyang Li
Yiyu Shi
ViT
MedIm
31
0
0
12 Aug 2024
Beyond the Eye: A Relational Model for Early Dementia Detection Using Retinal OCTA Images
Shouyue Liu
Jinkui Hao
Yuanyuan Gu
Huazhu Fu
Xinyu Guo
Shuting Zhang
Yitian Zhao
Hong Song
14
0
0
09 Aug 2024
Efficient and Accurate Pneumonia Detection Using a Novel Multi-Scale Transformer Approach
Alireza Saber
Pouria Parhami
Alimohammad Siahkarzadeh
Amirreza Fateh
ViT
MedIm
40
9
0
08 Aug 2024
What Happens Without Background? Constructing Foreground-Only Data for Fine-Grained Tasks
Yuetian Wang
W. Hou
Qinmu Peng
Xinge You
14
0
0
04 Aug 2024
LAM3D: Leveraging Attention for Monocular 3D Object Detection
Diana-Alexandra Sas
Leandro Di Bella
Yangxintong Lyu
F. Oniga
Adrian Munteanu
20
0
0
03 Aug 2024
NVC-1B: A Large Neural Video Coding Model
Xihua Sheng
Chuanbo Tang
Li Li
Dong Liu
Feng Wu
3DV
VLM
33
2
0
28 Jul 2024
Sparse Refinement for Efficient High-Resolution Semantic Segmentation
Zhijian Liu
Zhuoyang Zhang
Samir Khaki
Shang Yang
Haotian Tang
Chenfeng Xu
Kurt Keutzer
Song Han
SSeg
29
1
0
26 Jul 2024
VSSD: Vision Mamba with Non-Causal State Space Duality
Yuheng Shi
Minjing Dong
Mingjia Li
Chang Xu
Mamba
28
3
0
26 Jul 2024
HybridDepth: Robust Depth Fusion for Mobile AR by Leveraging Depth from Focus and Single-Image Priors
Ashkan Ganj
Hang Su
Tian Guo
MDE
19
0
0
26 Jul 2024
Towards the Spectral bias Alleviation by Normalizations in Coordinate Networks
Zhicheng Cai
Hao Zhu
Qiu Shen
Xinran Wang
Xun Cao
14
0
0
25 Jul 2024
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu
Yubin Cho
Beoungwoo Kang
Seunghun Moon
Kyeongbo Kong
Suk-Ju Kang
21
2
0
24 Jul 2024
Qalam: A Multimodal LLM for Arabic Optical Character and Handwriting Recognition
Gagan Bhatia
El Moatez Billah Nagoudi
Fakhraddin Alwajih
Muhammad Abdul-Mageed
24
3
0
18 Jul 2024
UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt
Xin Li
Bingchen Li
Yeying Jin
Cuiling Lan
Hanxin Zhu
Yulin Ren
Zhibo Chen
39
1
0
18 Jul 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
51
0
0
18 Jul 2024
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich
Niv Nayman
Sharon Fogel
I. Lavi
Ron Litman
Shahar Tsiper
Royee Tichauer
Srikar Appalaraju
Shai Mazor
R. Manmatha
VLM
25
3
0
17 Jul 2024
MapDistill: Boosting Efficient Camera-based HD Map Construction via Camera-LiDAR Fusion Model Distillation
Xiaoshuai Hao
Ruikai Li
Hui Zhang
Dingzhe Li
Rong Yin
Sangil Jung
Seungsang Park
ByungIn Yoo
Haimei Zhao
Jing Zhang
33
7
0
16 Jul 2024
Centering the Value of Every Modality: Towards Efficient and Resilient Modality-agnostic Semantic Segmentation
Xueye Zheng
Yuanhuiyi Lyu
Jiazhou Zhou
Lin Wang
27
7
0
16 Jul 2024
SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
Hao Ding
Tuxun Lu
Yuqian Zhang
Ruixing Liang
Hongchao Shu
...
Bo Wang
Marcos Fernández-Rodríguez
Estevao Lima
João L. Vilaça
Mathias Unberath
53
4
0
16 Jul 2024
Backdoor Attacks against Image-to-Image Networks
Wenbo Jiang
Hongwei Li
Jiaming He
Rui Zhang
Guowen Xu
Tianwei Zhang
Rongxing Lu
AAML
33
2
0
15 Jul 2024
Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba
Haoye Dong
Aviral Chharia
Wenbo Gou
Francisco Vicente Carrasco
Fernando De la Torre
Mamba
40
1
0
12 Jul 2024
Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification
Wenshuo Peng
Kaipeng Zhang
Yue Yang
Hao Zhang
Yu Qiao
VLM
17
2
0
11 Jul 2024
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa
Qin Shiyin
Abdulrahman Hamman Adama Chukkol
Isah Bello
A. Lawan
39
3
0
10 Jul 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
33
54
0
10 Jul 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
27
1
0
10 Jul 2024
CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion
Hosam S. El-Assiouti
Hadeer El-Saadawy
M. Al-Berry
M. Tolba
ViT
47
0
0
09 Jul 2024
HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution
Xiang Zhang
Yulun Zhang
Fisher Yu
27
15
0
08 Jul 2024
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han
Qifan Wang
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Yi Fang
Qiang Guan
Lifu Huang
Dongfang Liu
VLM
27
4
0
05 Jul 2024
Semantically Guided Representation Learning For Action Anticipation
Anxhelo Diko
D. Avola
Bardh Prenkaj
Federico Fontana
Luigi Cinque
AI4TS
41
2
0
02 Jul 2024
Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces
Zhaohui Chen
Elyas Asadi Shamsabadi
Sheng Jiang
Luming Shen
Daniel Dias-da-Costa
Mamba
35
3
0
24 Jun 2024
Rethinking Remote Sensing Change Detection With A Mask View
Xiaowen Ma
Zhenkai Wu
Rongrong Lian
Wei Zhang
Siyang Song
19
3
0
21 Jun 2024
Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor
Veedant Jain
Felipe dos Santos Alves Feitosa
Gabriel Kreiman
VLM
33
2
0
19 Jun 2024
ChangeViT: Unleashing Plain Vision Transformers for Change Detection
Duowang Zhu
Xiaohu Huang
Haiyan Huang
Zhenfeng Shao
Q. Cheng
18
7
0
18 Jun 2024
Demonstrating Agile Flight from Pixels without State Estimation
Ismail Geles
L. Bauersfeld
Angel Romero
Jiaxu Xing
Davide Scaramuzza
32
22
0
18 Jun 2024
Is Your HD Map Constructor Reliable under Sensor Corruptions?
Xiaoshuai Hao
Mengchuan Wei
Yifan Yang
Haimei Zhao
Hui Zhang
Yi Zhou
Qiang Wang
Weiming Li
Lingdong Kong
Jing Zhang
3DV
42
8
0
18 Jun 2024
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
Di Wang
Meiqi Hu
Yao Jin
Yuchun Miao
Jiaqi Yang
...
Lefei Zhang
Chen Wu
Bo Du
Dacheng Tao
Liangpei Zhang
59
19
0
17 Jun 2024
ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers
Narges Norouzi
Svetlana Orlova
Daan de Geus
Gijs Dubbelman
ViT
FedML
36
3
0
14 Jun 2024
LieRE: Generalizing Rotary Position Encodings
Sophie Ostmeier
Brian Axelrod
Michael E. Moseley
Akshay S. Chaudhari
C. Langlotz
16
1
0
14 Jun 2024
Depth Anything V2
Lihe Yang
Bingyi Kang
Zilong Huang
Zhen Zhao
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
DiffM
VLM
MDE
59
314
0
13 Jun 2024
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
Miaosen Zhang
Yixuan Wei
Zhen Xing
Yifei Ma
Zuxuan Wu
...
Zheng-Wei Zhang
Qi Dai
Chong Luo
Xin Geng
Baining Guo
VLM
33
1
0
13 Jun 2024
Unveiling Incomplete Modality Brain Tumor Segmentation: Leveraging Masked Predicted Auto-Encoder and Divergence Learning
Zhongao Sun
Jiameng Li
Yuhan Wang
Jiarong Cheng
Qing Zhou
Chun Li
MedIm
20
0
0
12 Jun 2024
ProTrain: Efficient LLM Training via Memory-Aware Techniques
Hanmei Yang
Jin Zhou
Yao Fu
Xiaoqun Wang
Ramine Roane
Hui Guan
Tongping Liu
VLM
26
0
0
12 Jun 2024
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu
Tong Zhang
Sabine Süsstrunk
ViT
21
0
0
12 Jun 2024
A Robust Pipeline for Classification and Detection of Bleeding Frames in Wireless Capsule Endoscopy using Swin Transformer and RT-DETR
Sasidhar Alavala
Anil Kumar Vadde
Aparnamala Kancheti
Subrahmanyam Gorthi
ViT
MedIm
18
2
0
12 Jun 2024
Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
Wenxiao Wang
Weiming Zhuang
Lingjuan Lyu
27
0
0
11 Jun 2024
ReduceFormer: Attention with Tensor Reduction by Summation
John Yang
Le An
Su Inn Park
18
0
0
11 Jun 2024