RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models

ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2024
27 August 2024
Junyao Ge
Xu Zhang
Yang Zheng
Kaitai Guo
Jimin Liang

Papers citing "RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models"

47 / 47 papers shown
Geo-R1: Unlocking VLM Geospatial Reasoning with Cross-View Reinforcement Learning
Chenhui Xu
F. Yu
Michael J. Bianco
Jacob Kovarskiy
Raphael Tang
...
Rupanjali Kukal
Mikael Figueroa
Rishi Madhok
Nikolaos Karianakis
Jinjun Xiong
ObjDReLMLRM
223
2
0
29 Sep 2025
Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives
IEEE Geoscience and Remote Sensing Magazine (GRSM), 2025
Xingxing Weng
Chao Pang
Gui-Song Xia
VLM
493
30
0
20 May 2025
GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis
Angelos Zavras
Dimitrios Michail
Xiao Xiang Zhu
Tim Siebert
Ioannis Papoutsis
VLM
516
7
0
13 Feb 2025
Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models
IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2024
Haonan Guo
Xin Su
Chen Wu
Bo Du
Guang Dai
Deren Li
LLMAG
310
46
0
17 Jan 2024
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing
Zhecheng Wang
R. Prabha
Tianyuan Huang
Jiajun Wu
Ram Rajagopal
314
161
0
20 Dec 2023
GeoChat: Grounded Large Vision-Language Model for Remote Sensing
Computer Vision and Pattern Recognition (CVPR), 2023
Kartik Kuckreja
M. S. Danish
Muzammal Naseer
Abhijit Das
Salman Khan
Fahad Shahbaz Khan
432
397
0
24 Nov 2023
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Computer Vision and Pattern Recognition (CVPR), 2023
Zhang Li
Biao Yang
Qiang Liu
Zhiyin Ma
Shuo Zhang
Jingxu Yang
Yabo Sun
Yuliang Liu
Xiang Bai
MLLM
629
423
0
11 Nov 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
Symposium on Operating Systems Principles (SOSP), 2023
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
2.3K
5,317
0
12 Sep 2023
RSGPT: A Remote Sensing Vision Language Model and Benchmark
ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023
Yuan Hu
Jianlong Yuan
Congcong Wen
Xiaonan Lu
Xiang Li
VLM
373
265
0
28 Jul 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Zilun Zhang
Tiancheng Zhao
Yulong Guo
Yuxiang Cai
DiffMVLM
1.4K
205
0
20 Jun 2023
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Fan Liu
Delong Chen
Zhan-Rong Guan
Xiaocong Zhou
Jiale Zhu
Qiaolin Ye
Liyong Fu
Jun Zhou
VLM
676
579
0
19 Jun 2023
Improving CLIP Training with Language Rewrites
Neural Information Processing Systems (NeurIPS), 2023
Lijie Fan
Dilip Krishnan
Phillip Isola
Dina Katabi
Yonglong Tian
BDLVLMCLIP
537
276
0
31 May 2023
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ning Ding
Yulin Chen
Bokai Xu
Yujia Qin
Zhi Zheng
Shengding Hu
Zhiyuan Liu
Maosong Sun
Bowen Zhou
ALM
477
826
0
23 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLMVLM
1.9K
3,275
0
11 May 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang
Jinrui Zhang
Junjie Fei
Hao Zheng
Yunlong Tang
Zhe Li
Mingqi Gao
Shanshan Zhao
MLLM
589
131
0
04 May 2023
DataComp: In search of the next generation of multimodal datasets
Neural Information Processing Systems (NeurIPS), 2023
S. Gadre
Gabriel Ilharco
Alex Fang
J. Hayase
Georgios Smyrnis
...
A. Dimakis
J. Jitsev
Y. Carmon
Vaishaal Shankar
Ludwig Schmidt
VLM
822
659
0
27 Apr 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
International Conference on Learning Representations (ICLR), 2023
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLMMLLM
607
3,021
0
20 Apr 2023
Visual Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDaVLMMLLM
1.4K
9,060
0
17 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLMMLLM
1.6K
7,784
0
30 Jan 2023
Reproducible scaling laws for contrastive language-image learning
Computer Vision and Pattern Recognition (CVPR), 2022
Mehdi Cherti
Romain Beaumont
Ross Wightman
Mitchell Wortsman
Gabriel Ilharco
Cade Gordon
Christoph Schuhmann
Ludwig Schmidt
J. Jitsev
VLMCLIP
733
1,326
0
14 Dec 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
308
215
0
23 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Neural Information Processing Systems (NeurIPS), 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLMMLLMCLIP
1.5K
4,964
0
16 Oct 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
International Conference on Learning Representations (ICLR), 2022
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLMVLM
998
963
0
14 Sep 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLMCLIPOffRL
913
1,699
0
04 May 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2021
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
362
198
0
21 Apr 2022
Training language models to follow instructions with human feedback
Neural Information Processing Systems (NeurIPS), 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
2.4K
19,843
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
International Conference on Machine Learning (ICML), 2022
Junnan Li
Dongxu Li
Caiming Xiong
Steven C. H. Hoi
MLLMBDLVLMCLIP
1.5K
6,390
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&RoLRMAI4CEReLM
2.8K
17,183
0
28 Jan 2022
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLMMLLMCLIP
995
1,808
0
03 Nov 2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
1.1K
149
0
22 Sep 2021
LoRA: Low-Rank Adaptation of Large Language Models
International Conference on Learning Representations (ICLR), 2021
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRLAI4TSAI4CEALMAIMat
1.9K
17,979
0
17 Jun 2021
Scaling Vision with Sparse Mixture of Experts
Neural Information Processing Systems (NeurIPS), 2021
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
492
976
0
10 Jun 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
IEEE International Conference on Computer Vision (ICCV), 2021
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
1.1K
1,535
0
01 Apr 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDLAI4CE
566
1,900
0
18 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
International Conference on Machine Learning (ICML), 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIPVLM
2.2K
47,325
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
International Conference on Machine Learning (ICML), 2021
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLMCLIP
1.6K
5,306
0
11 Feb 2021
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
2.4K
57,120
0
28 May 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
2.3K
7,549
0
23 Jan 2020
Exploring Models and Data for Remote Sensing Image Caption Generation
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2017
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
307
665
0
21 Dec 2017
Functional Map of the World
Gordon A. Christie
Neil Fendley
James Wilson
R. Mukherjee
VGen
578
505
0
21 Nov 2017
EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification
P. Helber
B. Bischke
Andreas Dengel
Damian Borth
741
2,563
0
31 Aug 2017
PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval
ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2017
Weixun Zhou
Shawn D. Newsam
Congmin Li
Z. Shao
504
528
0
11 Jun 2017
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
569
2,692
0
01 Mar 2017
AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2016
Gui-Song Xia
Jingwen Hu
Fan Hu
Baoguang Shi
X. Bai
Yanfei Zhong
Liangpei Zhang
399
2,130
0
18 Aug 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
3.5K
6,424
0
23 Feb 2016
ImageNet Large Scale Visual Recognition Challenge
International Journal of Computer Vision (IJCV), 2014
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLMObjD
3.7K
42,317
0
01 Sep 2014
Microsoft COCO: Common Objects in Context
European Conference on Computer Vision (ECCV), 2014
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
27.3K
51,996
0
01 May 2014