2111.09883
Cited By
Swin Transformer V2: Scaling Up Capacity and Resolution
18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
Github (14834★)
Papers citing
"Swin Transformer V2: Scaling Up Capacity and Resolution"
50 / 932 papers shown
FEDS: Feature and Entropy-Based Distillation Strategy for Efficient Learned Image Compression
H. Fu
Jie Liang
Zhenman Fang
Jingning Han
390
0
0
09 Mar 2025
Dynamic Dictionary Learning for Remote Sensing Image Segmentation
Xuechao Zou
Yue Li
Shun Zhang
Kai Li
Shiying Wang
Pin Tao
Junliang Xing
Congyan Lang
312
3
0
09 Mar 2025
Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Flexible and Effective Paradigm
Jiebin Yan
Kangcheng Wu
Junjie Chen
Ziwen Tan
Yuming Fang
277
2
0
08 Mar 2025
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang
Guoguang Du
Xiaochuan Li
Qi Jia
Liang Jin
...
Zhenhua Guo
Yaqian Zhao
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
311
5
0
08 Mar 2025
EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images
Rohit Menon
Nils Dengler
Sicong Pan
Gokul Krishna Chenchani
Maren Bennewitz
EDL
464
0
0
06 Mar 2025
Computational Analysis of Degradation Modeling in Blind Panoramic Image Quality Assessment
Jiebin Yan
Ziwen Tan
Jiale Rao
Lei Wu
Yifan Zuo
Yuming Fang
288
1
0
05 Mar 2025
Task-Agnostic Attacks Against Vision Foundation Models
Brian Pulfer
Yury Belousov
Vitaliy Kinakh
Teddy Furon
S. Voloshynovskiy
AAML
230
0
0
05 Mar 2025
Adaptive Camera Sensor for Vision Models
International Conference on Learning Representations (ICLR), 2025
Eunsu Baek
Sunghwan Han
Taesik Gong
Hyung-Sin Kim
VLM
418
3
0
04 Mar 2025
Enhancing Retinal Vessel Segmentation Generalization via Layout-Aware Generative Modelling
Jonathan Fhima
Jan Van Eijgen
Lennert Beeckmans
Thomas Jacobs
Moti Freiman
Luis Filipe Nakayama
Ingeborg Stalmans
Chaim Baskin
Joachim A. Behar
MedIm
443
1
0
03 Mar 2025
SAR-W-MixMAE: SAR Foundation Model Training Using Backscatter Power Weighting
Ali Caglayan
Nevrez Imamoglu
T. Kouyama
432
0
0
03 Mar 2025
FLStore: Efficient Federated Learning Storage for non-training workloads
Ahmad Faraz Khan
Samuel Fountain
Ahmed M. Abdelmoniem
A. R. Butt
A. Anwar
FedML
318
0
0
01 Mar 2025
Investigating the use of terrain-following coordinates in AI-driven precipitation forecasts
Geophysical Research Letters (GRL), 2025
Yingkai Sha
John S. Schreck
William E. Chapman
David John Gagne II
324
1
0
01 Mar 2025
Robust and Efficient Writer-Independent IMU-Based Handwriting Recognition
Jindong Li
Tim Hamann
Jens Barth
Peter Kaempf
Dario Zanca
Bjoern M. Eskofier
160
0
0
28 Feb 2025
Explainable, Multi-modal Wound Infection Classification from Images Augmented with Generated Captions
Palawat Busaranuvong
Emmanuel O. Agu
Reza Saadati Fard
Deepak Kumar
Shefalika Gautam
B. Tulu
Diane Strong
MedIm
315
2
0
27 Feb 2025
GONet: A Generalizable Deep Learning Model for Glaucoma Detection
Or Abramovich
Hadas Pizem
Jonathan Fhima
Eran Berkowitz
Ben Gofrit
...
Meital Baskin
Jan Van Eijgen
Ingeborg Stalmans
E. Blumenthal
Joachim A. Behar
190
3
0
26 Feb 2025
MaxGlaViT: A novel lightweight vision transformer-based approach for early diagnosis of glaucoma stages from fundus images
Mustafa Yurdakul
Kubra Uyar
Şakir Tasdemir
306
5
0
24 Feb 2025
MVIP -- A Dataset and Methods for Application Oriented Multi-View and Multi-Modal Industrial Part Recognition
Paul Koch
Marian Schluter
Jörg Krüger
305
0
0
24 Feb 2025
MEX: Memory-efficient Approach to Referring Multi-Object Tracking
International Conference on Autonomic and Trusted Computing (ATC), 2024
Huu-Thien Tran
Phuoc-Sang Pham
Thai-Son Tran
Khoa Luu
VOT
382
1
0
20 Feb 2025
Precise GPS-Denied UAV Self-Positioning via Context-Enhanced Cross-View Geo-Localization
Yuanze Xu
Ming Dai
Wenxiao Cai
Wankou Yang
253
3
0
17 Feb 2025
Without Paired Labeled Data: End-to-End Self-Supervised Learning for Drone-view Geo-Localization
Zhongwei Chen
Zhao-Xu Yang
Hai-Jun Rong
Guoqi Li
SSL
557
1
0
17 Feb 2025
Learning Musical Representations for Music Performance Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Xingjian Diao
Chunhui Zhang
Tingxuan Wu
Ming Cheng
Z. Ouyang
Weiyi Wu
Jiang Gui
282
25
0
10 Feb 2025
Integrating Sequence and Image Modeling in Irregular Medical Time Series Through Self-Supervised Learning
AAAI Conference on Artificial Intelligence (AAAI), 2025
Liuqing Chen
Shuhong Xiao
Shixian Ding
Shanhai Hu
Lingyun Sun
308
0
0
10 Feb 2025
Amnesia as a Catalyst for Enhancing Black Box Pixel Attacks in Image Classification and Object Detection
Neural Information Processing Systems (NeurIPS), 2025
Dongsu Song
Daehwa Ko
Jay Hoon Jung
AAML
480
0
0
10 Feb 2025
Invizo: Arabic Handwritten Document Optical Character Recognition Solution
Alhossien Waly
Bassant Tarek
Ali Feteha
Rewan Yehia
Gasser Amr
Walid Gomaa
Ahmed M. Fares
363
1
0
07 Feb 2025
Addressing Out-of-Label Hazard Detection in Dashcam Videos: Insights from the COOOL Challenge
Anh-Kiet Duong
Petra Gomez-Krämer
387
2
0
27 Jan 2025
A margin-based replacement for cross-entropy loss
Michael W. Spratling
Heiko H. Schütt
318
0
0
21 Jan 2025
A Survey on Memory-Efficient Transformer-Based Model Training in AI for Science
Kaiyuan Tian
Linbo Qiao
Baihui Liu
Gongqingjian Jiang
Shanshan Li
Dongsheng Li
374
0
0
21 Jan 2025
DLEN: Dual Branch of Transformer for Low-Light Image Enhancement in Dual Domains
Junyu Xia
Jiesong Bai
Yihang Dong
ViT
618
4
0
21 Jan 2025
A Remote Sensing Image Change Detection Method Integrating Layer Exchange and Channel-Spatial Differences
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS), 2025
Sijun Dong
Fangcheng Zuo
Geng Chen
Siming Fu
Xiaoliang Meng
344
2
0
19 Jan 2025
Towards Iris Presentation Attack Detection with Foundation Models
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2025
Juan E. Tapia
Lázaro J. González Soler
Christoph Busch
AAML
VLM
148
4
0
10 Jan 2025
Keypoint Aware Masked Image Modelling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Madhava Krishna
Convin.AI
454
1
0
03 Jan 2025
VMamba: Visual State Space Model
Neural Information Processing Systems (NeurIPS), 2024
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
1.1K
1,522
0
31 Dec 2024
Adaptive Dataset Quantization
AAAI Conference on Artificial Intelligence (AAAI), 2024
Muquan Li
Dongyang Zhang
Qiang Dong
Xiurui Xie
Ke Qin
DD
MQ
382
3
0
22 Dec 2024
MAGIC++: Efficient and Resilient Modality-Agnostic Semantic Segmentation via Hierarchical Modality Selection
Xu Zheng
Yuanhuiyi Lyu
Lutao Jiang
Jiazhou Zhou
Lin Wang
Xuming Hu
332
12
0
22 Dec 2024
V"Mean"ba: Visual State Space Models only need 1 hidden dimension
Tien-Yu Chi
Hung-Yueh Chiang
Chi-Chih Chang
N. Huang
Kai-Chiang Wu
253
1
0
21 Dec 2024
Safety Monitoring of Machine Learning Perception Functions: a Survey
International Conference on Climate Informatics (ICCI), 2024
Raul Sena Ferreira
Joris Guérin
Kevin Delmas
Jérémie Guiochet
H. Waeselynck
338
5
0
09 Dec 2024
Gesture Classification in Artworks Using Contextual Image Features
Azhar Hussian
Mathias Zinnen
Thi My Hang Tran
Andreas Maier
Vincent Christlein
298
1
0
04 Dec 2024
Noisy Ostracods: A Fine-Grained, Imbalanced Real-World Dataset for Benchmarking Robust Machine Learning and Label Correction Methods
Neural Information Processing Systems (NeurIPS), 2024
Jiamian Hu
Yuanyuan Hong
Yihua Chen
He Wang
Moriaki Yasuhara
332
2
0
03 Dec 2024
GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing
Khawar Islam
M. Zaheer
Arif Mahmood
Karthik Nandakumar
Naveed Akhtar
DiffM
650
7
0
03 Dec 2024
MeasureNet: Measurement Based Celiac Disease Identification
Aayush Kumar Tyagi
Vaibhav Mishra
Ashok Tiwari
Lalita Mehra
Prasenjit Das
G. Makharia
Prathosh AP
Mausam
246
0
0
02 Dec 2024
STATIC : Surface Temporal Affine for TIme Consistency in Video Monocular Depth Estimation
Sunghun Yang
Minhyeok Lee
Suhwan Cho
Jungho Lee
Sangyoun Lee
MDE
546
1
0
02 Dec 2024
FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
Computer Vision and Pattern Recognition (CVPR), 2024
Alice Heiman
Xiaoman Zhang
E. Chen
Sung Eun Kim
Pranav Rajpurkar
HILM
MedIm
658
5
0
27 Nov 2024
Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning
British Machine Vision Conference (BMVC), 2024
Hoàng-Ân Lê
P. Berg
Minh Pham
299
3
0
26 Nov 2024
GeoFormer: A Multi-Polygon Segmentation Transformer
British Machine Vision Conference (BMVC), 2024
Maxim Khomiakov
Michael Riis Andersen
J. Frellsen
222
1
0
25 Nov 2024
Nd-BiMamba2: A Unified Bidirectional Architecture for Multi-Dimensional Data Processing
Hao Liu
Mamba
AI4CE
288
5
0
22 Nov 2024
ReXrank: A Public Leaderboard for AI-Powered Radiology Report Generation
Xiaoman Zhang
Hong-Yu Zhou
Xiaoli Yang
Oishi Banerjee
J. N. Acosta
Josh Miller
Ouwen Huang
Pranav Rajpurkar
LM&MA
394
15
0
22 Nov 2024
Can Reasons Help Improve Pedestrian Intent Estimation? A Cross-Modal Approach
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Vaishnavi Khindkar
V. Balasubramanian
Chetan Arora
A. Subramanian
C. V. Jawahar
304
0
0
20 Nov 2024
Emotional Images: Assessing Emotions in Images and Potential Biases in Generative Models
Maneet Mehta
Cody Buntain
EGVM
111
4
0
08 Nov 2024
Confidence Calibration of Classifiers with Many Classes
Neural Information Processing Systems (NeurIPS), 2024
Adrien LeCoz
Stéphane Herbin
Faouzi Adjed
UQCV
333
8
0
05 Nov 2024
AM Flow: Adapters for Temporal Processing in Action Recognition
Tanay Agrawal
Abid Ali
A. Dantcheva
François Brémond
246
0
0
04 Nov 2024