ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2103.14030
  4. Cited By
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
    ViT
ArXivPDFHTML

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 1,659 papers shown
Title
PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement
PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement
Chengyou Jia
Minnan Luo
Zhuohang Dang
Guangwen Dai
Xiao Chang
J. Wang
DiffM
47
1
0
31 Dec 2024
Open-Set Object Detection By Aligning Known Class Representations
Open-Set Object Detection By Aligning Known Class Representations
Hiran Sarkar
Vishal M. Chudasama
N. Onoe
Pankaj Wasnik
Vineeth N. Balasubramanian
ObjD
44
5
0
31 Dec 2024
VMamba: Visual State Space Model
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
106
592
0
31 Dec 2024
MetricDepth: Enhancing Monocular Depth Estimation with Deep Metric Learning
MetricDepth: Enhancing Monocular Depth Estimation with Deep Metric Learning
Chunpu Liu
Guanglei Yang
Wangmeng Zuo
Tianyi Zan
MDE
41
0
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
34
2
0
29 Dec 2024
DRDM: A Disentangled Representations Diffusion Model for Synthesizing
  Realistic Person Images
DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person Images
Enbo Huang
Yuan Zhang
Faliang Huang
Guangyu Zhang
Y. Liu
DiffM
37
0
0
25 Dec 2024
IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks
IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks
Yaming Zhang
Chenqiang Gao
Fangcen Liu
Junjie Guo
Lan Wang
Xinggan Peng
Deyu Meng
87
0
0
21 Dec 2024
GG-SSMs: Graph-Generating State Space Models
GG-SSMs: Graph-Generating State Space Models
Nikola Zubić
Davide Scaramuzza
Mamba
88
1
0
17 Dec 2024
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu
Meng Lou
Yizhou Yu
112
1
0
16 Dec 2024
DSRC: Learning Density-insensitive and Semantic-aware Collaborative Representation against Corruptions
DSRC: Learning Density-insensitive and Semantic-aware Collaborative Representation against Corruptions
Jingyu Zhang
Yilei Wang
Lang Qian
Peng Sun
Zengwen Li
Sudong Jiang
Maolin Liu
Liang Song
93
1
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
130
2
0
14 Dec 2024
Frequency-Adaptive Low-Latency Object Detection Using Events and Frames
Frequency-Adaptive Low-Latency Object Detection Using Events and Frames
Haitian Zhang
Xiangyuan Wang
Chang Xu
Xinya Wang
Fang Xu
Huai Yu
Lei Yu
Wen Yang
ObjD
92
0
0
05 Dec 2024
CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning
CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning
Runjian Chen
H. Zhang
Avinash Ravichandran
Wenqi Shao
Alex Wong
Ping Luo
Ping Luo
3DPC
77
0
0
04 Dec 2024
HandOS: 3D Hand Reconstruction in One Stage
HandOS: 3D Hand Reconstruction in One Stage
Xingyu Chen
Zhuheng Song
Xiaoke Jiang
Yaoqing Hu
Junzhi Yu
Lei Zhang
3DH
HAI
69
0
0
02 Dec 2024
Auto-Encoded Supervision for Perceptual Image Super-Resolution
MinKyu Lee
Sangeek Hyun
Woojin Jun
Jae-Pil Heo
SupR
89
0
0
28 Nov 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
105
6
0
27 Nov 2024
RS-vHeat: Heat Conduction Guided Efficient Remote Sensing Foundation Model
RS-vHeat: Heat Conduction Guided Efficient Remote Sensing Foundation Model
Huiyang Hu
Peijin Wang
Hanbo Bi
Boyuan Tong
Z. Wang
...
Ziqi Zhang
QiXiang Ye
Kun Fu
Xian Sun
Xian Sun
98
0
0
27 Nov 2024
Deep End-to-end Adaptive k-Space Sampling, Reconstruction, and Registration for Dynamic MRI
Deep End-to-end Adaptive k-Space Sampling, Reconstruction, and Registration for Dynamic MRI
George Yiasemis
J. Sonke
Jonas Teuwen
63
0
0
27 Nov 2024
Complexity Experts are Task-Discriminative Learners for Any Image Restoration
Complexity Experts are Task-Discriminative Learners for Any Image Restoration
Eduard Zamfir
Zongwei Wu
Nancy Mehta
Yuedong Tan
Danda Pani Paudel
Yulun Zhang
Radu Timofte
MoE
103
1
0
27 Nov 2024
Generative Semantic Communication for Joint Image Transmission and Segmentation
Generative Semantic Communication for Joint Image Transmission and Segmentation
Weiwen Yuan
Jinke Ren
Chongjie Wang
Ruichen Zhang
Jun Wei
Dong In Kim
Shuguang Cui
DiffM
83
0
0
27 Nov 2024
Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search
Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search
Shuyu Yang
Yaxiong Wang
Li Zhu
Zhedong Zheng
91
2
0
26 Nov 2024
GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration
GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration
Sudarshan Rajagopalan
Nithin Gopalakrishnan Nair
Jay N. Paranjape
Vishal M. Patel
DiffM
90
0
0
26 Nov 2024
SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting
SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting
Gyeongjin Kang
Jisang Yoo
Jihyeon Park
Seungtae Nam
Hyeonsoo Im
Sangheon Shin
Sangpil Kim
Eunbyung Park
3DGS
117
3
0
26 Nov 2024
LAGUNA: LAnguage Guided UNsupervised Adaptation with structured spaces
LAGUNA: LAnguage Guided UNsupervised Adaptation with structured spaces
Anxhelo Diko
Antonino Furnari
Luigi Cinque
G. Farinella
85
0
0
23 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
110
3
0
22 Nov 2024
MambaIRv2: Attentive State Space Restoration
MambaIRv2: Attentive State Space Restoration
Hang Guo
Yong Guo
Yaohua Zha
Yulun Zhang
W. J. Li
Tao Dai
Shu-Tao Xia
Yawei Li
Mamba
118
12
0
22 Nov 2024
Exploring the Robustness and Transferability of Patch-Based Adversarial Attacks in Quantized Neural Networks
Exploring the Robustness and Transferability of Patch-Based Adversarial Attacks in Quantized Neural Networks
Amira Guesmi
B. Ouni
Muhammad Shafique
AAML
71
0
0
22 Nov 2024
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation
Seokil Ham
H. Kim
Sangmin Woo
Changick Kim
Mamba
115
0
0
21 Nov 2024
Principles of Visual Tokens for Efficient Video Understanding
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
83
0
0
20 Nov 2024
Towards Million-Scale Adversarial Robustness Evaluation With Stronger Individual Attacks
Towards Million-Scale Adversarial Robustness Evaluation With Stronger Individual Attacks
Yong Xie
Weijie Zheng
Hanxun Huang
Guangnan Ye
Xingjun Ma
AAML
69
1
0
20 Nov 2024
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation
Shipeng Liu
Liang Zhao
Dengfeng Chen
SSL
110
1
0
19 Nov 2024
LaVin-DiT: Large Vision Diffusion Transformer
Zhaoqing Wang
Xiaobo Xia
Runnan Chen
Dongdong Yu
Changhu Wang
M. Gong
Tongliang Liu
92
6
0
18 Nov 2024
MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild
Xi Fang
Jiankun Wang
X. Cai
Shangqian Chen
Shuwen Yang
Lin Yao
Linfeng Zhang
Guolin Ke
Linfeng Zhang
Guolin Ke
48
1
0
17 Nov 2024
C-DiffSET: Leveraging Latent Diffusion for SAR-to-EO Image Translation with Confidence-Guided Reliable Object Generation
Jeonghyeok Do
Jaehyup Lee
Munchurl Kim
DiffM
41
1
0
16 Nov 2024
Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery
Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery
Ashim Dahal
Saydul Akbar Murad
Nick Rahimi
ViT
29
1
0
14 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
28
0
0
12 Nov 2024
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data
Chika Maduabuchi
Ericmoore Jossou
Matteo Bucci
28
0
0
12 Nov 2024
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
Hao Phung
Quan Dao
T. Dao
Hoang Phan
Dimitris Metaxas
Anh Tran
Mamba
60
3
0
06 Nov 2024
Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection
Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection
Pengfei Lyu
Pak-Hei Yeung
Xiufei Cheng
Xiaosheng Yu
Chengdong Wu
Jagath C. Rajapakse
34
0
0
06 Nov 2024
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for
  Image-to-Video Generation
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Wenhao Wang
Y. Yang
VGen
45
3
0
05 Nov 2024
Confidence Calibration of Classifiers with Many Classes
Confidence Calibration of Classifiers with Many Classes
Adrien LeCoz
Stéphane Herbin
Faouzi Adjed
UQCV
33
1
0
05 Nov 2024
Specialized Foundation Models Struggle to Beat Supervised Baselines
Specialized Foundation Models Struggle to Beat Supervised Baselines
Zongzhe Xu
Ritvik Gupta
Wenduo Cheng
Alexander Shen
Junhong Shen
Ameet Talwalkar
M. Khodak
AI4CE
38
6
0
05 Nov 2024
ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal
ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal
Xiujin Zhu
Chee-Onn Chow
Joon Huang Chuah
Mamba
40
0
0
05 Nov 2024
Context Parallelism for Scalable Million-Token Inference
Context Parallelism for Scalable Million-Token Inference
Amy Yang
Jingyi Yang
Aya Ibrahim
Xinfeng Xie
Bangsheng Tang
Grigory Sizov
Jeremy Reizenstein
Jongsoo Park
Jianyu Huang
MoE
LRM
60
5
0
04 Nov 2024
FactorizePhys: Matrix Factorization for Multidimensional Attention in
  Remote Physiological Sensing
FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing
Jitesh Joshi
Sos S. Agaian
Youngjun Cho
AI4TS
28
1
0
03 Nov 2024
PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting
PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting
Yanlong Wang
J. Xu
Fei Ma
Shao-Lun Huang
Danny Dongning Sun
Xiao-Ping Zhang
AI4TS
24
1
0
03 Nov 2024
ViT-LCA: A Neuromorphic Approach for Vision Transformers
ViT-LCA: A Neuromorphic Approach for Vision Transformers
Sanaz Mahmoodi Takaghaj
ViT
38
0
0
31 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
41
2
0
30 Oct 2024
VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation
VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation
Chika Maduabuchi
Ericmoore Jossou
Matteo Bucci
36
1
0
22 Oct 2024
S-CFE: Simple Counterfactual Explanations
S-CFE: Simple Counterfactual Explanations
Shpresim Sadiku
Moritz Wagner
Sai Ganesh Nagarajan
S. Pokutta
21
0
0
21 Oct 2024
Previous
123...567...323334
Next