2311.15826
GeoChat: Grounded Large Vision-Language Model for Remote Sensing
24 November 2023
Kartik Kuckreja
M. S. Danish
Muzammal Naseer
Abhijit Das
Salman Khan
Fahad Shahbaz Khan
Papers citing
"GeoChat: Grounded Large Vision-Language Model for Remote Sensing"
50 / 88 papers shown
Title
MilChat: Introducing Chain of Thought Reasoning and GRPO to a Multimodal Small Language Model for Remote Sensing
Aybora Koksal
Aydin Alatan
LRM
9
0
0
12 May 2025
LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery
Jerome Quenum
Wen-Han Hsieh
Tsung-Han Wu
Ritwik Gupta
Trevor Darrell
David M. Chan
MLLM
VLM
49
0
0
05 May 2025
Exploring Generalizable Pre-training for Real-world Change Detection via Geometric Estimation
Yitao Zhao
Sen Lei
Nanqing Liu
Heng Li
Turgay Celik
Qing Zhu
24
0
0
19 Apr 2025
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery
Wei Zhang
Miaoxin Cai
Yaqian Ning
T. Zhang
Yin Zhuang
He Chen
Jun Li
Xuerui Mao
36
0
0
17 Apr 2025
SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling
Yasin Almalioglu
Andrzej Kucik
Geoffrey French
Dafni Antotsiou
Alexander Adam
Cedric Archambeau
16
0
0
17 Apr 2025
Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization
Darryl Hannan
John Cooper
Dylan White
Timothy Doster
Henry Kvinge
Y. Watkins
24
0
0
14 Apr 2025
AeroLite: Tag-Guided Lightweight Generation of Aerial Image Captions
Xing Zi
Tengjun Ni
Xianjing Fan
Xian Tao
Jun Li
Ali Braytee
Mukesh Prasad
21
0
0
13 Apr 2025
RS-RAG: Bridging Remote Sensing Imagery and Comprehensive Knowledge with a Multi-Modal Dataset and Retrieval-Augmented Generation Model
Congcong Wen
Yiting Lin
Xiaokang Qu
Nan Li
Yong Liao
Hui Lin
Xiang Li
20
0
0
07 Apr 2025
SARLANG-1M: A Benchmark for Vision-Language Modeling in SAR Image Understanding
Yimin Wei
Aoran Xiao
Yexian Ren
Yuting Zhu
Hongruixuan Chen
J. Xia
Naoto Yokoya
VLM
66
0
0
04 Apr 2025
NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving
Kexin Tian
Jingrui Mao
Y. Zhang
Jiwan Jiang
Yang Zhou
Zhengzhong Tu
CoGe
60
0
0
04 Apr 2025
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan
A. Ahmed
Mohamad Alansari
Neha Gour
Abderaouf Behouch
...
Muzammal Naseer
Juergen Gall
Mohammed Bennamoun
Ernesto Damiani
N. Werghi
39
0
0
03 Apr 2025
XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?
Fengxiang Wang
H. Wang
Mingshuo Chen
Di Wang
Yulin Wang
...
L. Lan
Wenjing Yang
J. Zhang
Zhiyuan Liu
Maosong Sun
52
2
0
31 Mar 2025
FlexiMo: A Flexible Remote Sensing Foundation Model
Xuyang Li
Chenyu Li
Pedram Ghamisi
Danfeng Hong
37
0
0
31 Mar 2025
EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing
Hongxiang Jiang
Jihao Yin
Qixiong Wang
Jiaqi Feng
Guo Chen
46
0
0
30 Mar 2025
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
Ziyue Huang
Hongxi Yan
Qiqi Zhan
Shuai Yang
Mingming Zhang
Chenkai Zhang
Yiming Lei
Zeming Liu
Qingjie Liu
Y. Wang
42
0
0
28 Mar 2025
FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs
Xiaoqin Wang
Xusen Ma
Xianxu Hou
Meidan Ding
Yudong Li
Junliang Chen
Wenting Chen
Xiaoyang Peng
LinLin Shen
CVBM
71
0
0
27 Mar 2025
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text
Weizhi Chen
Jingbo Chen
Yupeng Deng
Jiansheng Chen
Yuman Feng
Zhihao Xi
Diyou Liu
Kai Li
Yu Meng
VLM
51
0
0
25 Mar 2025
A Vision Centric Remote Sensing Benchmark
Abduljaleel Adejumo
Faegheh Yeganli
Clifford Broni-Bediako
Aoran Xiao
Naoto Yokoya
Mennatullah Siam
53
0
0
20 Mar 2025
OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence
Long Yuan
Fengran Mo
Kaiyu Huang
Wenjie Wang
Wangyuxuan Zhai
Xiaoyu Zhu
You Li
Jinan Xu
Jian-Yun Nie
SyDa
56
0
0
20 Mar 2025
GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing
Zilun Zhang
Haozhan Shen
Tiancheng Zhao
Bin Chen
Zian Guan
Yuhao Wang
Xu Jia
Yuxiang Cai
Yongheng Shang
Jianwei Yin
49
0
0
16 Mar 2025
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Junwei Luo
Yingying Zhang
X. J. Yang
Kang Wu
Qi Zhu
Lei Liang
Jingdong Chen
Yansheng Li
62
0
0
10 Mar 2025
A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning
Qing Zhou
Tao Yang
Junyu Gao
W. Ni
Junzheng Wu
Qi Wang
43
0
0
06 Mar 2025
Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models
Dilxat Muhtar
Enzhuo Zhang
Zhenshi Li
Feng-Xue Gu
Yanglangxing He
P. Xiao
Xueliang Zhang
36
2
0
02 Mar 2025
Remote Sensing Semantic Segmentation Quality Assessment based on Vision Language Model
Huiying Shi
Z. Tan
Zhihan Zhang
Hongchen Wei
Yaosi Hu
Yingxue Zhang
Zhenzhong Chen
72
0
0
21 Feb 2025
TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data
Jeremy Irvin
Emily Ruoyu Liu
Joyce Chuyi Chen
Ines Dormoy
Jinyoung Kim
Samar Khanna
Zhuo Zheng
Stefano Ermon
MLLM
VLM
50
4
0
28 Jan 2025
Multi-Agent Geospatial Copilots for Remote Sensing Workflows
Chaehong Lee
Varatheepan Paramanayakam
Andreas Karatzas
Yanan Jian
Michael Fore
Heming Liao
Fuxun Yu
Ruopu Li
Iraklis Anagnostopoulos
Dimitrios Stamoulis
31
2
0
28 Jan 2025
Meta-Feature Adapter: Integrating Environmental Metadata for Enhanced Animal Re-identification
Yuzhuo Li
Di Zhao
Yihao Wu
Yun Sing Koh
69
0
0
23 Jan 2025
AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning
Muhammad Awais
Ali Husain Salem Abdulla Alharthi
Amandeep Kumar
Hisham Cholakkal
Rao Muhammad Anwer
VLM
60
3
0
10 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
83
10
0
06 Jan 2025
FedRSClip: Federated Learning for Remote Sensing Scene Classification Using Vision-Language Models
Hui Lin
Chao Zhang
Danfeng Hong
Kexin Dong
Congcong Wen
FedML
VLM
21
3
0
05 Jan 2025
Advancements in Visual Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques
Lijie Tao
H. Zhang
Haizhao Jing
Yu Liu
Kelu Yao
Guoting Wei
Xizhe Xue
33
0
0
03 Jan 2025
REO-VLM: Transforming VLM to Meet Regression Challenges in Earth Observation
Xizhe Xue
Guoting Wei
Hao Chen
H. Zhang
Feng Lin
Chunhua Shen
Xiao Xiang Zhu
91
3
0
21 Dec 2024
SignEye: Traffic Sign Interpretation from Vehicle First-Person View
Chuang Yang
Xu Han
T. Han
Yuejiao Su
Junyu Gao
Hongyuan Zhang
Yi Wang
Lap-Pui Chau
77
0
0
18 Nov 2024
Large Vision-Language Models for Remote Sensing Visual Question Answering
Surasakdi Siripong
Apirak Chaiyapan
Thanakorn Phonchai
20
0
0
16 Nov 2024
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding
Y. Zhou
Mengcheng Lan
Xiang Li
Yiping Ke
Xue Jiang
Litong Feng
Qingyun Li
Xue Yang
Wayne Zhang
ObjD
VLM
109
4
0
16 Nov 2024
DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark
Haodong Li
Haicheng Qu
Xiaofeng Zhang
33
1
0
05 Nov 2024
SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey
Kien X. Nguyen
Fengchun Qiao
Arthur Trembanis
Xi Peng
21
0
0
31 Oct 2024
Multilingual Vision-Language Pre-training for the Remote Sensing Domain
João Daniel Silva
João Magalhães
D. Tuia
Bruno Martins
CLIP
VLM
30
1
0
30 Oct 2024
GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing
Hosam Elgendy
Ahmed Sharshar
Ahmed Aboeitta
Yasser Ashraf
Mohsen Guizani
19
2
0
25 Oct 2024
CAMEL-Bench: A Comprehensive Arabic LMM Benchmark
Sara Ghaboura
Ahmed Heakl
Omkar Thawakar
Ali Alharthi
Ines Riahi
Abduljalil Saif
Jorma T. Laaksonen
F. Khan
Salman Khan
Rao Muhammad Anwer
40
0
0
24 Oct 2024
An LLM Agent for Automatic Geospatial Data Analysis
Yuxing Chen
Weijie Wang
Sylvain Lobry
Camille Kurtz
LLMAG
32
3
0
24 Oct 2024
Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance
Zhangwei Gao
Zhe Chen
Erfei Cui
Yiming Ren
Weiyun Wang
...
Lewei Lu
Tong Lu
Yu Qiao
Jifeng Dai
Wenhai Wang
VLM
62
22
0
21 Oct 2024
MMDS: A Multimodal Medical Diagnosis System Integrating Image Analysis and Knowledge-based Departmental Consultation
Yi Ren
H. Zhang
Weibin Li
Diandong Liu
Tianyi Zhang
Jie He
Licheng Jiao
CVBM
24
0
0
20 Oct 2024
RescueADI: Adaptive Disaster Interpretation in Remote Sensing Images with Autonomous Agents
Zhuoran Liu
Danpei Zhao
Bo Yuan
18
1
0
17 Oct 2024
FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation
Yuki Imajuku
Yoko Yamakata
Kiyoharu Aizawa
24
1
0
27 Sep 2024
CDChat: A Large Multimodal Model for Remote Sensing Change Description
Mubashir Noman
Noor Ahsan
Muzammal Naseer
Hisham Cholakkal
Rao Muhammad Anwer
Salman Khan
F. Khan
28
2
0
24 Sep 2024
Exploring Fine-Grained Image-Text Alignment for Referring Remote Sensing Image Segmentation
Sen Lei
Xinyu Xiao
Heng-Chao Li
Z. Shi
Qing Zhu
18
12
0
20 Sep 2024
Generalized Few-Shot Semantic Segmentation in Remote Sensing: Challenge and Benchmark
Clifford Broni-Bediako
Junshi Xia
Jian Song
Hongruixuan Chen
Mennatullah Siam
Naoto Yokoya
16
4
0
17 Sep 2024
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
Baichuan Zhou
Haote Yang
Dairong Chen
Junyan Ye
Tianyi Bai
Jinhua Yu
Songyang Zhang
Dahua Lin
Conghui He
Weijia Li
VLM
53
3
0
30 Aug 2024
RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models
Junyao Ge
Yang Zheng
Kaitai Guo
Jimin Liang
27
1
0
27 Aug 2024