Swin Transformer: Hierarchical Vision Transformer using Shifted Windows


25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
    ViT

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 1,659 papers shown
Ground Awareness in Deep Learning for Large Outdoor Point Cloud Segmentation
Kevin Qiu
Dimitri Bulatov
Dorota Iwaszczuk
3DPC
52
0
0
30 Jan 2025
V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection
Sichao Wang
Chuang Zhang
Ming Yuan
Qing Xu
Lei He
Jianqiang Wang
47
1
0
28 Jan 2025
Prion-ViT: Prions-Inspired Vision Transformers for Temperature prediction with Specklegrams
Abhishek Sebastian
Pragna R
Sonaa Rajagopal
Muralikrishnan Mani
53
0
0
28 Jan 2025
State-space models are accurate and efficient neural operators for dynamical systems
Zheyuan Hu
Nazanin Ahmadi Daryakenari
Qianli Shen
Kenji Kawaguchi
George Karniadakis
Mamba
AI4CE
64
10
0
28 Jan 2025
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
Xiangyu Gao
Yu Dai
Benliu Qiu
Hongliang Li
Heqian Qiu
Hongliang Li
ObjD
VLM
76
0
0
28 Jan 2025
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis
Mai A. Shaaban
Adnan Khan
Mohammad Yaqub
LM&MA
78
2
0
28 Jan 2025
Collective Intelligence for 2D Push Manipulations with Mobile Robots
So Kuroki
T. Matsushima
Jumpei Arima
Hiroki Furuta
Yutaka Matsuo
S. Gu
Yujin Tang
61
5
0
28 Jan 2025
MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation
Qi Wang
Yuan Mi
H. Wang
Yi Zhang
Ruizhi Chengze
Hongsheng Liu
J. Wen
Hao Sun
AI4CE
35
0
0
28 Jan 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
67
0
0
26 Jan 2025
Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data
Jiajie Li
Brian R Quaranto
Chenhui Xu
Ishan Mishra
Ruiyang Qin
Dancheng Liu
Peter C W Kim
Jinjun Xiong
83
0
0
25 Jan 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
61
0
0
25 Jan 2025
Rethinking Encoder-Decoder Flow Through Shared Structures
Frederik Laboyrie
M. K. Yucel
Albert Saà-Garriga
AI4CE
40
0
0
24 Jan 2025
Unified CNNs and transformers underlying learning mechanism reveals multi-head attention modus vivendi
Ella Koresh
Ronit D. Gross
Yuval Meir
Yarden Tzach
Tal Halevi
Ido Kanter
ViT
41
0
0
22 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
69
0
0
21 Jan 2025
Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2
Md. Rakibul Islam
Md. Zahid Hossain
Mustofa Ahmed
Most. Sharmin Sultana Samu
LM&MA
MedIm
35
0
0
21 Jan 2025
Scalable Whole Slide Image Representation Using K-Mean Clustering and Fisher Vector Aggregation
R. Gupta
Shounak Das
Ardhendu Sekhar
Amit Sethi
24
0
0
21 Jan 2025
TFLOP: Table Structure Recognition Framework with Layout Pointer Mechanism
Minsoo Khang
Teakgyu Hong
LMTD
94
0
0
21 Jan 2025
UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model
Branislava Jankovic
Sabina Jangirova
Waseem Ullah
Latif U. Khan
Mohsen Guizani
29
0
0
21 Jan 2025
A generalizable 3D framework and model for self-supervised learning in medical imaging
Tony Xu
Sepehr Hosseini
Chris Anderson
Anthony Rinaldi
Rahul G. Krishnan
Anne L. Martel
Maged Goubran
MedIm
29
3
0
20 Jan 2025
NeurOp-Diff:Continuous Remote Sensing Image Super-Resolution via Neural Operator Diffusion
Zihao Xu
Yuzhi Tang
Bowen Xu
Qingquan Li
DiffM
55
0
0
20 Jan 2025
Elucidating the Design Space of Dataset Condensation
Shitong Shao
Zikai Zhou
Huanran Chen
Zhiqiang Shen
DD
54
7
0
20 Jan 2025
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
R. Yasarla
H. Cai
Jisoo Jeong
Y. Shi
Risheek Garrepalli
Fatih Porikli
MDE
63
16
0
17 Jan 2025
Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution
Cuixin Yang
Rongkang Dong
Jun Xiao
Cong Zhang
Kin-Man Lam
Fei Zhou
Guoping Qiu
81
1
0
17 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
95
17
0
17 Jan 2025
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation
R. Yasarla
Manish Kumar Singh
Hong Cai
Yunxiao Shi
Jisoo Jeong
Yinhao Zhu
Shizhong Han
Risheek Garrepalli
Fatih Porikli
MDE
80
6
0
17 Jan 2025
Self Pre-training with Adaptive Mask Autoencoders for Variable-Contrast 3D Medical Imaging
B. K. Das
Gengyan Zhao
Han Liu
Thomas J. Re
D. Comaniciu
Eli Gibson
Andreas K. Maier
ViT
MedIm
47
1
0
15 Jan 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
Yunzhi Zhuge
Hongyu Gu
Lu Zhang
Jinqing Qi
Huchuan Lu
VOS
62
2
0
14 Jan 2025
Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation
Xianping Ma
Ziyao Wang
Yin Hu
Xiaokang Zhang
Man-On Pun
46
0
0
13 Jan 2025
MathReader : Text-to-Speech for Mathematical Documents
Sieun Hyeon
Kyudan Jung
N. Kim
Hyun Gon Ryu
Jaeyoung Do
36
1
0
13 Jan 2025
Protego: Detecting Adversarial Examples for Vision Transformers via Intrinsic Capabilities
Jialin Wu
Kaikai Pan
Yanjiao Chen
Jiangyi Deng
Shengyuan Pang
Wenyuan Xu
ViT
AAML
41
0
0
13 Jan 2025
Rethinking Knowledge in Distillation: An In-context Sample Retrieval Perspective
Jinjing Zhu
Songze Li
Lin Wang
42
0
0
13 Jan 2025
CoreNet: Conflict Resolution Network for Point-Pixel Misalignment and Sub-Task Suppression of 3D LiDAR-Camera Object Detection
Y. Li
Yang Yang
Zhen Lei
3DPC
46
2
0
11 Jan 2025
BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response
Hongruixuan Chen
Jian Song
Olivier Dietrich
Clifford Broni-Bediako
Weihao Xuan
...
Yimin Wei
J. Xia
Cuiling Lan
Konrad Schindler
Naoto Yokoya
70
5
0
10 Jan 2025
MS-Temba : Multi-Scale Temporal Mamba for Efficient Temporal Action Detection
Arkaprava Sinha
Monish Soundar Raj
Pu Wang
Ahmed Helmy
Srijan Das
Mamba
48
3
0
10 Jan 2025
Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph
Donglin Di
Jiahui Yang
Chaofan Luo
Zhou Xue
Wei Chen
Xun Yang
Yue Gao
3DGS
52
11
0
10 Jan 2025
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs
Sheng Zhang
Yanbo Xu
Naoto Usuyama
Hanwen Xu
J. Bagga
...
Carlo Bifulco
M. Lungren
Tristan Naumann
Sheng Wang
Hoifung Poon
LM&MA
MedIm
151
198
0
10 Jan 2025
MObI: Multimodal Object Inpainting Using Diffusion Models
Alexandru Buburuzan
Anuj Sharma
John Redford
P. Dokania
Romain Mueller
DiffM
83
1
0
06 Jan 2025
Facial Attractiveness Prediction in Live Streaming: A New Benchmark and Multi-modal Method
H. Li
Xiaoyu Ren
Hongjiu Yu
Huiyu Duan
Kai Li
Ying Chen
Libo Wang
Xiongkuo Min
Guangtao Zhai
Xu Liu
CVBM
35
0
0
05 Jan 2025
SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network
Zhaoxu Li
Wei An
Gaowei Guo
Longguang Wang
Yingqian Wang
Zaiping Lin
ViT
73
0
0
03 Jan 2025
Dual Degradation-Inspired Deep Unfolding Network for Low-Light Image Enhancement
Huake Wang
Xingsong Hou
Xiaoyang Yan
Kaibing Zhang
Xiangyong Cao
Xueming Qian
90
0
0
03 Jan 2025
Keypoint Aware Masked Image Modelling
Madhava Krishna
Convin.AI
65
0
0
03 Jan 2025
Measuring Error Alignment for Decision-Making Systems
Binxia Xu
Antonis Bikakis
Daniel Onah
A. Vlachidis
Luke Dickens
34
0
0
03 Jan 2025
Double-Flow GAN model for the reconstruction of perceived faces from brain activities
Zihao Wang
Jing Zhao
Xuetong Ding
Hui Zhang
CVBM
AI4CE
21
0
0
03 Jan 2025
Causal Deep Learning
M. Alex O. Vasilescu
CML
49
2
1
03 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
91
45
0
03 Jan 2025
Advancements in Visual Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques
Lijie Tao
H. Zhang
Haizhao Jing
Yu Liu
Kelu Yao
Guoting Wei
Xizhe Xue
33
0
0
03 Jan 2025
Open-Set Object Detection By Aligning Known Class Representations
Hiran Sarkar
Vishal M. Chudasama
N. Onoe
Pankaj Wasnik
Vineeth N. Balasubramanian
ObjD
44
5
0
31 Dec 2024
MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping
Amirreza Fateh
Mohammad Reza Mohammadi
Mohammad Reza Jahed Motlagh
ViT
72
5
0
31 Dec 2024
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
89
0
0
31 Dec 2024
Unlocking adaptive digital pathology through dynamic feature learning
Jiawen Li
Tian Guan
Qingxin Xia
Y. Wang
Xitong Ling
...
Xiu-Wu Bian
Z. Wang
Lingchuan Guo
Chao He
Yonghong He
AI4CE
32
0
0
31 Dec 2024