2103.14030
Cited By
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
Papers citing
"Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
50 / 1,659 papers shown
Title
Ground Awareness in Deep Learning for Large Outdoor Point Cloud Segmentation
Kevin Qiu
Dimitri Bulatov
Dorota Iwaszczuk
3DPC
52
0
0
30 Jan 2025
V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection
Sichao Wang
Chuang Zhang
Ming Yuan
Qing Xu
Lei He
Jianqiang Wang
47
1
0
28 Jan 2025
Prion-ViT: Prions-Inspired Vision Transformers for Temperature prediction with Specklegrams
Abhishek Sebastian
Pragna R
Sonaa Rajagopal
Muralikrishnan Mani
53
0
0
28 Jan 2025
State-space models are accurate and efficient neural operators for dynamical systems
Zheyuan Hu
Nazanin Ahmadi Daryakenari
Qianli Shen
Kenji Kawaguchi
George Karniadakis
Mamba
AI4CE
64
10
0
28 Jan 2025
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
Xiangyu Gao
Yu Dai
Benliu Qiu
Heqian Qiu
Hongliang Li
ObjD
VLM
76
0
0
28 Jan 2025
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis
Mai A. Shaaban
Adnan Khan
Mohammad Yaqub
LM&MA
78
2
0
28 Jan 2025
Collective Intelligence for 2D Push Manipulations with Mobile Robots
So Kuroki
T. Matsushima
Jumpei Arima
Hiroki Furuta
Yutaka Matsuo
S. Gu
Yujin Tang
61
5
0
28 Jan 2025
MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation
Qi Wang
Yuan Mi
H. Wang
Yi Zhang
Ruizhi Chengze
Hongsheng Liu
J. Wen
Hao Sun
AI4CE
35
0
0
28 Jan 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
67
0
0
26 Jan 2025
Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data
Jiajie Li
Brian R Quaranto
Chenhui Xu
Ishan Mishra
Ruiyang Qin
Dancheng Liu
Peter C W Kim
Jinjun Xiong
83
0
0
25 Jan 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
61
0
0
25 Jan 2025
Rethinking Encoder-Decoder Flow Through Shared Structures
Frederik Laboyrie
M. K. Yucel
Albert Saà-Garriga
AI4CE
40
0
0
24 Jan 2025
Unified CNNs and transformers underlying learning mechanism reveals multi-head attention modus vivendi
Ella Koresh
Ronit D. Gross
Yuval Meir
Yarden Tzach
Tal Halevi
Ido Kanter
ViT
41
0
0
22 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
69
0
0
21 Jan 2025
Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2
Md. Rakibul Islam
Md. Zahid Hossain
Mustofa Ahmed
Most. Sharmin Sultana Samu
LM&MA
MedIm
35
0
0
21 Jan 2025
Scalable Whole Slide Image Representation Using K-Mean Clustering and Fisher Vector Aggregation
R. Gupta
Shounak Das
Ardhendu Sekhar
Amit Sethi
24
0
0
21 Jan 2025
TFLOP: Table Structure Recognition Framework with Layout Pointer Mechanism
Minsoo Khang
Teakgyu Hong
LMTD
94
0
0
21 Jan 2025
UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model
Branislava Jankovic
Sabina Jangirova
Waseem Ullah
Latif U. Khan
Mohsen Guizani
29
0
0
21 Jan 2025
A generalizable 3D framework and model for self-supervised learning in medical imaging
Tony Xu
Sepehr Hosseini
Chris Anderson
Anthony Rinaldi
Rahul G. Krishnan
Anne L. Martel
Maged Goubran
MedIm
29
3
0
20 Jan 2025
NeurOp-Diff: Continuous Remote Sensing Image Super-Resolution via Neural Operator Diffusion
Zihao Xu
Yuzhi Tang
Bowen Xu
Qingquan Li
DiffM
55
0
0
20 Jan 2025
Elucidating the Design Space of Dataset Condensation
Shitong Shao
Zikai Zhou
Huanran Chen
Zhiqiang Shen
DD
54
7
0
20 Jan 2025
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
R. Yasarla
H. Cai
Jisoo Jeong
Y. Shi
Risheek Garrepalli
Fatih Porikli
MDE
63
16
0
17 Jan 2025
Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution
Cuixin Yang
Rongkang Dong
Jun Xiao
Cong Zhang
Kin-Man Lam
Fei Zhou
Guoping Qiu
81
1
0
17 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
95
17
0
17 Jan 2025
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation
R. Yasarla
Manish Kumar Singh
Hong Cai
Yunxiao Shi
Jisoo Jeong
Yinhao Zhu
Shizhong Han
Risheek Garrepalli
Fatih Porikli
MDE
80
6
0
17 Jan 2025
Self Pre-training with Adaptive Mask Autoencoders for Variable-Contrast 3D Medical Imaging
B. K. Das
Gengyan Zhao
Han Liu
Thomas J. Re
D. Comaniciu
Eli Gibson
Andreas K. Maier
ViT
MedIm
47
1
0
15 Jan 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
Yunzhi Zhuge
Hongyu Gu
Lu Zhang
Jinqing Qi
Huchuan Lu
VOS
62
2
0
14 Jan 2025
Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation
Xianping Ma
Ziyao Wang
Yin Hu
Xiaokang Zhang
Man-On Pun
46
0
0
13 Jan 2025
MathReader: Text-to-Speech for Mathematical Documents
Sieun Hyeon
Kyudan Jung
N. Kim
Hyun Gon Ryu
Jaeyoung Do
36
1
0
13 Jan 2025
Protego: Detecting Adversarial Examples for Vision Transformers via Intrinsic Capabilities
Jialin Wu
Kaikai Pan
Yanjiao Chen
Jiangyi Deng
Shengyuan Pang
Wenyuan Xu
ViT
AAML
41
0
0
13 Jan 2025
Rethinking Knowledge in Distillation: An In-context Sample Retrieval Perspective
Jinjing Zhu
Songze Li
Lin Wang
42
0
0
13 Jan 2025
CoreNet: Conflict Resolution Network for Point-Pixel Misalignment and Sub-Task Suppression of 3D LiDAR-Camera Object Detection
Y. Li
Yang Yang
Zhen Lei
3DPC
46
2
0
11 Jan 2025
BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response
Hongruixuan Chen
Jian Song
Olivier Dietrich
Clifford Broni-Bediako
Weihao Xuan
...
Yimin Wei
J. Xia
Cuiling Lan
Konrad Schindler
Naoto Yokoya
70
5
0
10 Jan 2025
MS-Temba: Multi-Scale Temporal Mamba for Efficient Temporal Action Detection
Arkaprava Sinha
Monish Soundar Raj
Pu Wang
Ahmed Helmy
Srijan Das
Mamba
48
3
0
10 Jan 2025
Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph
Donglin Di
Jiahui Yang
Chaofan Luo
Zhou Xue
Wei Chen
Xun Yang
Yue Gao
3DGS
52
11
0
10 Jan 2025
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs
Sheng Zhang
Yanbo Xu
Naoto Usuyama
Hanwen Xu
J. Bagga
...
Carlo Bifulco
M. Lungren
Tristan Naumann
Sheng Wang
Hoifung Poon
LM&MA
MedIm
151
198
0
10 Jan 2025
MObI: Multimodal Object Inpainting Using Diffusion Models
Alexandru Buburuzan
Anuj Sharma
John Redford
P. Dokania
Romain Mueller
DiffM
83
1
0
06 Jan 2025
Facial Attractiveness Prediction in Live Streaming: A New Benchmark and Multi-modal Method
H. Li
Xiaoyu Ren
Hongjiu Yu
Huiyu Duan
Kai Li
Ying Chen
Libo Wang
Xiongkuo Min
Guangtao Zhai
Xu Liu
CVBM
35
0
0
05 Jan 2025
SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network
Zhaoxu Li
Wei An
Gaowei Guo
Longguang Wang
Yingqian Wang
Zaiping Lin
ViT
73
0
0
03 Jan 2025
Dual Degradation-Inspired Deep Unfolding Network for Low-Light Image Enhancement
Huake Wang
Xingsong Hou
Xiaoyang Yan
Kaibing Zhang
Xiangyong Cao
Xueming Qian
90
0
0
03 Jan 2025
Keypoint Aware Masked Image Modelling
Madhava Krishna
Convin.AI
65
0
0
03 Jan 2025
Measuring Error Alignment for Decision-Making Systems
Binxia Xu
Antonis Bikakis
Daniel Onah
A. Vlachidis
Luke Dickens
34
0
0
03 Jan 2025
Double-Flow GAN model for the reconstruction of perceived faces from brain activities
Zihao Wang
Jing Zhao
Xuetong Ding
Hui Zhang
CVBM
AI4CE
21
0
0
03 Jan 2025
Causal Deep Learning
M. Alex O. Vasilescu
CML
49
2
1
03 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
91
45
0
03 Jan 2025
Advancements in Visual Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques
Lijie Tao
H. Zhang
Haizhao Jing
Yu Liu
Kelu Yao
Guoting Wei
Xizhe Xue
33
0
0
03 Jan 2025
Open-Set Object Detection By Aligning Known Class Representations
Hiran Sarkar
Vishal M. Chudasama
N. Onoe
Pankaj Wasnik
Vineeth N. Balasubramanian
ObjD
44
5
0
31 Dec 2024
MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping
Amirreza Fateh
Mohammad Reza Mohammadi
Mohammad Reza Jahed Motlagh
ViT
72
5
0
31 Dec 2024
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
89
0
0
31 Dec 2024
Unlocking adaptive digital pathology through dynamic feature learning
Jiawen Li
Tian Guan
Qingxin Xia
Y. Wang
Xitong Ling
...
Xiu-Wu Bian
Z. Wang
Lingchuan Guo
Chao He
Yonghong He
AI4CE
32
0
0
31 Dec 2024