AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization
11 July 2024
Shixiong Xu
Chenghao Zhang
Lubin Fan
Gaofeng Meng
Shiming Xiang
Jieping Ye
VLM
Papers citing "AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization"
8 / 8 papers shown
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
Ziyue Huang
Hongxi Yan
Qiqi Zhan
Shuai Yang
Mingming Zhang
Chenkai Zhang
Yiming Lei
Zeming Liu
Qingjie Liu
Y. Wang
42
0
0
28 Mar 2025
Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition
Jianyi Peng
Fan Lu
Bin Li
Yuan Huang
Sanqing Qu
Guang-Sheng Chen
3DPC
74
0
0
17 Feb 2025
AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment
Umair Nawaz
Muhammad Awais
Hanan Gani
Muzammal Naseer
Fahad Khan
Salman Khan
Rao Muhammad Anwer
VLM
CLIP
28
2
0
02 Oct 2024
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
186
521
0
06 Oct 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,010
0
28 Jan 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
Stephen Hausler
Sourav Garg
Ming Xu
Michael Milford
Tobias Fischer
39
328
0
02 Mar 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021