2103.14030
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
ViT
Papers citing
"Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
50 / 1,659 papers shown
PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement
Chengyou Jia
Minnan Luo
Zhuohang Dang
Guangwen Dai
Xiao Chang
J. Wang
DiffM
47
1
0
31 Dec 2024
Open-Set Object Detection By Aligning Known Class Representations
Hiran Sarkar
Vishal M. Chudasama
N. Onoe
Pankaj Wasnik
Vineeth N. Balasubramanian
ObjD
44
5
0
31 Dec 2024
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
106
592
0
31 Dec 2024
MetricDepth: Enhancing Monocular Depth Estimation with Deep Metric Learning
Chunpu Liu
Guanglei Yang
Wangmeng Zuo
Tianyi Zan
MDE
41
0
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
34
2
0
29 Dec 2024
DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person Images
Enbo Huang
Yuan Zhang
Faliang Huang
Guangyu Zhang
Y. Liu
DiffM
37
0
0
25 Dec 2024
IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks
Yaming Zhang
Chenqiang Gao
Fangcen Liu
Junjie Guo
Lan Wang
Xinggan Peng
Deyu Meng
87
0
0
21 Dec 2024
GG-SSMs: Graph-Generating State Space Models
Nikola Zubić
Davide Scaramuzza
Mamba
88
1
0
17 Dec 2024
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu
Meng Lou
Yizhou Yu
112
1
0
16 Dec 2024
DSRC: Learning Density-insensitive and Semantic-aware Collaborative Representation against Corruptions
Jingyu Zhang
Yilei Wang
Lang Qian
Peng Sun
Zengwen Li
Sudong Jiang
Maolin Liu
Liang Song
93
1
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
130
2
0
14 Dec 2024
Frequency-Adaptive Low-Latency Object Detection Using Events and Frames
Haitian Zhang
Xiangyuan Wang
Chang Xu
Xinya Wang
Fang Xu
Huai Yu
Lei Yu
Wen Yang
ObjD
92
0
0
05 Dec 2024
CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning
Runjian Chen
H. Zhang
Avinash Ravichandran
Wenqi Shao
Alex Wong
Ping Luo
3DPC
77
0
0
04 Dec 2024
HandOS: 3D Hand Reconstruction in One Stage
Xingyu Chen
Zhuheng Song
Xiaoke Jiang
Yaoqing Hu
Junzhi Yu
Lei Zhang
3DH
HAI
69
0
0
02 Dec 2024
Auto-Encoded Supervision for Perceptual Image Super-Resolution
MinKyu Lee
Sangeek Hyun
Woojin Jun
Jae-Pil Heo
SupR
89
0
0
28 Nov 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
105
6
0
27 Nov 2024
RS-vHeat: Heat Conduction Guided Efficient Remote Sensing Foundation Model
Huiyang Hu
Peijin Wang
Hanbo Bi
Boyuan Tong
Z. Wang
...
Ziqi Zhang
QiXiang Ye
Kun Fu
Xian Sun
98
0
0
27 Nov 2024
Deep End-to-end Adaptive k-Space Sampling, Reconstruction, and Registration for Dynamic MRI
George Yiasemis
J. Sonke
Jonas Teuwen
63
0
0
27 Nov 2024
Complexity Experts are Task-Discriminative Learners for Any Image Restoration
Eduard Zamfir
Zongwei Wu
Nancy Mehta
Yuedong Tan
Danda Pani Paudel
Yulun Zhang
Radu Timofte
MoE
103
1
0
27 Nov 2024
Generative Semantic Communication for Joint Image Transmission and Segmentation
Weiwen Yuan
Jinke Ren
Chongjie Wang
Ruichen Zhang
Jun Wei
Dong In Kim
Shuguang Cui
DiffM
83
0
0
27 Nov 2024
Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search
Shuyu Yang
Yaxiong Wang
Li Zhu
Zhedong Zheng
91
2
0
26 Nov 2024
GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration
Sudarshan Rajagopalan
Nithin Gopalakrishnan Nair
Jay N. Paranjape
Vishal M. Patel
DiffM
90
0
0
26 Nov 2024
SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting
Gyeongjin Kang
Jisang Yoo
Jihyeon Park
Seungtae Nam
Hyeonsoo Im
Sangheon Shin
Sangpil Kim
Eunbyung Park
3DGS
117
3
0
26 Nov 2024
LAGUNA: LAnguage Guided UNsupervised Adaptation with structured spaces
Anxhelo Diko
Antonino Furnari
Luigi Cinque
G. Farinella
85
0
0
23 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
110
3
0
22 Nov 2024
MambaIRv2: Attentive State Space Restoration
Hang Guo
Yong Guo
Yaohua Zha
Yulun Zhang
W. J. Li
Tao Dai
Shu-Tao Xia
Yawei Li
Mamba
118
12
0
22 Nov 2024
Exploring the Robustness and Transferability of Patch-Based Adversarial Attacks in Quantized Neural Networks
Amira Guesmi
B. Ouni
Muhammad Shafique
AAML
71
0
0
22 Nov 2024
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation
Seokil Ham
H. Kim
Sangmin Woo
Changick Kim
Mamba
115
0
0
21 Nov 2024
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
83
0
0
20 Nov 2024
Towards Million-Scale Adversarial Robustness Evaluation With Stronger Individual Attacks
Yong Xie
Weijie Zheng
Hanxun Huang
Guangnan Ye
Xingjun Ma
AAML
69
1
0
20 Nov 2024
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation
Shipeng Liu
Liang Zhao
Dengfeng Chen
SSL
110
1
0
19 Nov 2024
LaVin-DiT: Large Vision Diffusion Transformer
Zhaoqing Wang
Xiaobo Xia
Runnan Chen
Dongdong Yu
Changhu Wang
M. Gong
Tongliang Liu
92
6
0
18 Nov 2024
MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild
Xi Fang
Jiankun Wang
X. Cai
Shangqian Chen
Shuwen Yang
Lin Yao
Linfeng Zhang
Guolin Ke
48
1
0
17 Nov 2024
C-DiffSET: Leveraging Latent Diffusion for SAR-to-EO Image Translation with Confidence-Guided Reliable Object Generation
Jeonghyeok Do
Jaehyup Lee
Munchurl Kim
DiffM
41
1
0
16 Nov 2024
Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery
Ashim Dahal
Saydul Akbar Murad
Nick Rahimi
ViT
29
1
0
14 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
28
0
0
12 Nov 2024
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data
Chika Maduabuchi
Ericmoore Jossou
Matteo Bucci
28
0
0
12 Nov 2024
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
Hao Phung
Quan Dao
T. Dao
Hoang Phan
Dimitris Metaxas
Anh Tran
Mamba
60
3
0
06 Nov 2024
Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection
Pengfei Lyu
Pak-Hei Yeung
Xiufei Cheng
Xiaosheng Yu
Chengdong Wu
Jagath C. Rajapakse
34
0
0
06 Nov 2024
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Wenhao Wang
Y. Yang
VGen
45
3
0
05 Nov 2024
Confidence Calibration of Classifiers with Many Classes
Adrien LeCoz
Stéphane Herbin
Faouzi Adjed
UQCV
33
1
0
05 Nov 2024
Specialized Foundation Models Struggle to Beat Supervised Baselines
Zongzhe Xu
Ritvik Gupta
Wenduo Cheng
Alexander Shen
Junhong Shen
Ameet Talwalkar
M. Khodak
AI4CE
38
6
0
05 Nov 2024
ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal
Xiujin Zhu
Chee-Onn Chow
Joon Huang Chuah
Mamba
40
0
0
05 Nov 2024
Context Parallelism for Scalable Million-Token Inference
Amy Yang
Jingyi Yang
Aya Ibrahim
Xinfeng Xie
Bangsheng Tang
Grigory Sizov
Jeremy Reizenstein
Jongsoo Park
Jianyu Huang
MoE
LRM
60
5
0
04 Nov 2024
FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing
Jitesh Joshi
Sos S. Agaian
Youngjun Cho
AI4TS
28
1
0
03 Nov 2024
PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting
Yanlong Wang
J. Xu
Fei Ma
Shao-Lun Huang
Danny Dongning Sun
Xiao-Ping Zhang
AI4TS
24
1
0
03 Nov 2024
ViT-LCA: A Neuromorphic Approach for Vision Transformers
Sanaz Mahmoodi Takaghaj
ViT
38
0
0
31 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
41
2
0
30 Oct 2024
VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation
Chika Maduabuchi
Ericmoore Jossou
Matteo Bucci
36
1
0
22 Oct 2024
S-CFE: Simple Counterfactual Explanations
Shpresim Sadiku
Moritz Wagner
Sai Ganesh Nagarajan
S. Pokutta
21
0
0
21 Oct 2024