TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data

International Conference on Learning Representations (ICLR), 2024

28 January 2025

Emily Ruoyu Liu

Joyce Chuyi Chen

arXiv:2410.06234

Papers citing "TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data"

50 / 65 papers shown

Multilingual Training-Free Remote Sensing Image Captioning

João Daniel Silva

100

0

0

30 Nov 2025

GeoZero: Incentivizing Reasoning from Scratch on Geospatial Scenes

...

112

1

0

27 Nov 2025

Co-Training Vision Language Models for Remote Sensing Multi-task Learning

...

179

1

0

26 Nov 2025

Think First, Assign Next (ThiFAN-VQA): A Two-stage Chain-of-Thought Framework for Post-Disaster Damage Assessment

Maryam Rahnemoonfar

96

0

0

24 Nov 2025

REMSA: An LLM Agent for Foundation Model Selection in Remote Sensing

Tacettin Emre Bök

104

0

0

21 Nov 2025

The Potential of Copernicus Satellites for Disaster Response: Retrieving Building Damage from Sentinel-1 and Sentinel-2

Olivier Dietrich

Merlin Alfredsson

L. Scheibenreif

Jan Dirk Wegner

Konrad Schindler

105

0

0

07 Nov 2025

DescribeEarth: Describe Anything for Remote Sensing Images

129

1

0

30 Sep 2025

Geo-R1: Unlocking VLM Geospatial Reasoning with Cross-View Reinforcement Learning

Michael J. Bianco

Jacob Kovarskiy

...

Rupanjali Kukal

Mikael Figueroa

Nikolaos Karianakis

151

0

0

29 Sep 2025

GeoVLM-R1: Reinforcement Fine-Tuning for Improved Remote Sensing Reasoning

Fahad Shahbaz Khan

ObjD OffRL VLM LRM

322

2

0

29 Sep 2025

BTCChat: Advancing Remote Sensing Bi-temporal Change Captioning with Multimodal Large Language Model

121

0

0

07 Sep 2025

RSCC: A Large-Scale Remote Sensing Change Caption Dataset for Disaster Events

242

2

0

02 Sep 2025

ChatENV: An Interactive Vision-Language Model for Sensor-Guided Environmental Monitoring and Scenario Simulation

148

0

0

14 Aug 2025

Remote Sensing Image Intelligent Interpretation with the Language-Centered Perspective: Principles, Methods and Challenges

139

1

0

09 Aug 2025

MONITRS: Multimodal Observations of Natural Incidents Through Remote Sensing

Shreelekha Revankar

Cheng Perng Phoo

Bharath Hariharan

125

0

0

22 Jul 2025

TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting

282

0

0

23 Jun 2025

Domain Specific Benchmarks for Evaluating Multimodal Large Language Models

Muhammad Arbab Arshad

Efstathios Polyzos

...

Nishith Reddy Mannuru

Ravi Varma Kumar Bevara

Muhammad Zeeshan Akram

397

2

0

15 Jun 2025

ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks

Akashah Shabbir

Muhammad Akhtar Munir

Muhammad Umer Sheikh

Juan Bernabé-Moreno

Fahad Shahbaz Khan

227

4

0

29 May 2025

DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response

...

Hongruixuan Chen

434

10

0

27 May 2025

Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives

IEEE Geoscience and Remote Sensing Magazine (GRSM), 2025

350

12

0

20 May 2025

LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery

280

4

0

05 May 2025

Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization

244

1

0

14 Apr 2025

Operational Change Detection for Geographical Information: Overview and Challenges

Nicolas Gonthier

352

0

0

18 Mar 2025

Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models

Yanglangxing He

294

8

0

02 Mar 2025

Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models

Bing Qin

507

19

0

30 Jun 2024

RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding

Haifeng Li

253

9

0

18 Jun 2024

SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding

Yongjun Zhang

...

Yansheng Li

377

66

0

14 Jun 2024

ST-LLM: Large Language Models Are Effective Temporal Learners

Ying Shan

193

123

0

30 Mar 2024

ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models

Zhitong Xiong

Xiao Xiang Zhu

239

19

0

17 Feb 2024

LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model

Xue-liang Zhang

493

124

0

04 Feb 2024

EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain

Tong Zhang

430

212

0

30 Jan 2024

SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model

Zhitong Xiong

247

113

0

18 Jan 2024

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

S.M. Towhidul Islam Tonmoy

Vinija Jain

461

343

0

02 Jan 2024

SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing

221

126

0

20 Dec 2023

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

European Conference on Computer Vision (ECCV), 2023

327

477

0

28 Nov 2023

GeoChat: Grounded Large Vision-Language Model for Remote Sensing

Computer Vision and Pattern Recognition (CVPR), 2023

Kartik Kuckreja

Muzammal Naseer

Salman Khan

Fahad Shahbaz Khan

321

295

0

24 Nov 2023

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Bin Lin

1.6K

1,168

0

16 Nov 2023

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Computer Vision and Pattern Recognition (CVPR), 2023

Ryuichi Takanobu

508

349

0

14 Nov 2023

NExT-Chat: An LMM for Chat, Detection and Segmentation

Ao Zhang

Wei Ji

Zhiyuan Liu

355

73

0

08 Nov 2023

Improved Baselines with Visual Instruction Tuning

Computer Vision and Pattern Recognition (CVPR), 2023

606

4,130

0

05 Oct 2023

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

International Conference on Learning Representations (ICLR), 2023

Bin Lin

...

Wei Liu

740

334

0

03 Oct 2023

RSGPT: A Remote Sensing Vision Language Model and Benchmark

ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023

269

208

0

28 Jul 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models

Louis Martin

Amjad Almahairi

...

Sharan Narang

Aurelien Rodriguez

Sergey Edunov

7.8K

15,207

0

18 Jul 2023

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Salman Khan

Fahad Shahbaz Khan

421

947

0

08 Jun 2023

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Xin Li

566

1,478

0

05 Jun 2023

A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation

Artificial Intelligence Review (AIR), 2023

...

Mustafa A. Mustafa

351

143

0

19 May 2023

Change Detection Methods for Remote Sensing in the Last Decade: A Comprehensive Review

Remote Sensing (RS), 2023

Guangliang Cheng

Xiangtai Li

216

159

0

09 May 2023

Visual Instruction Tuning

Neural Information Processing Systems (NeurIPS), 2023

1.1K

7,377

0

17 Apr 2023

GPT-4 Technical Report

OpenAI

Josh Achiam

Sandhini Agarwal

...

4.6K

20,717

0

15 Mar 2023

SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery

Neural Information Processing Systems (NeurIPS), 2022

David B. Lobell

470

402

0

17 Jul 2022

Change Detection Meets Visual Question Answering

Zhitong Xiong

Xiaoxiang Zhu

231

60

0

12 Dec 2021