ResearchTrend.AI
AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization

11 July 2024
Shixiong Xu
Chenghao Zhang
Lubin Fan
Gaofeng Meng
Shiming Xiang
Jieping Ye
    VLM

Papers citing "AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization"

8 / 8 papers shown
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
Ziyue Huang
Hongxi Yan
Qiqi Zhan
Shuai Yang
Mingming Zhang
Chenkai Zhang
Yiming Lei
Zeming Liu
Qingjie Liu
Y. Wang
42
0
0
28 Mar 2025
Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition
Jianyi Peng
Fan Lu
Bin Li
Yuan Huang
Sanqing Qu
Guang-Sheng Chen
3DPC
74
0
0
17 Feb 2025
AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment
Umair Nawaz
Muhammad Awais
Hanan Gani
Muzammal Naseer
Fahad Khan
Salman Khan
Rao Muhammad Anwer
VLM
CLIP
28
2
0
02 Oct 2024
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
186
521
0
06 Oct 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,010
0
28 Jan 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
Stephen Hausler
Sourav Garg
Ming Xu
Michael Milford
Tobias Fischer
39
328
0
02 Mar 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021