Communities
Connect sessions
AI calendar
Organizations
Join Slack
Contact Sales
Search
Open menu
Home
Papers
2408.14744
Cited By
v1
v2
v3
v4 (latest)
RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models
Isprs Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2024
27 August 2024
Junyao Ge
Xu Zhang
Yang Zheng
Kaitai Guo
Jimin Liang
Re-assign community
ArXiv (abs)
PDF
HTML
HuggingFace (1 upvotes)
Github (41★)
Papers citing
"RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models"
47 / 47 papers shown
Geo-R1: Unlocking VLM Geospatial Reasoning with Cross-View Reinforcement Learning
Chenhui Xu
F. Yu
Michael J. Bianco
Jacob Kovarskiy
Raphael Tang
...
Rupanjali Kukal
Mikael Figueroa
Rishi Madhok
Nikolaos Karianakis
Jinjun Xiong
ObjD
ReLM
LRM
223
2
0
29 Sep 2025
Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives
IEEE Geoscience and Remote Sensing Magazine (GRSM), 2025
Xingxing Weng
Chao Pang
Gui-Song Xia
VLM
493
30
0
20 May 2025
GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis
Angelos Zavras
Dimitrios Michail
Xiao Xiang Zhu
Tim Siebert
Ioannis Papoutsis
VLM
516
7
0
13 Feb 2025
Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models
IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2024
Haonan Guo
Xin Su
Chen Wu
Bo Du
Guang Dai
Deren Li
LLMAG
310
46
0
17 Jan 2024
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing
Zhecheng Wang
R. Prabha
Tianyuan Huang
Jiajun Wu
Ram Rajagopal
314
161
0
20 Dec 2023
GeoChat: Grounded Large Vision-Language Model for Remote Sensing
Computer Vision and Pattern Recognition (CVPR), 2023
Kartik Kuckreja
M. S. Danish
Muzammal Naseer
Abhijit Das
Salman Khan
Fahad Shahbaz Khan
432
397
0
24 Nov 2023
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Computer Vision and Pattern Recognition (CVPR), 2023
Zhang Li
Biao Yang
Qiang Liu
Zhiyin Ma
Shuo Zhang
Jingxu Yang
Yabo Sun
Yuliang Liu
Xiang Bai
MLLM
629
423
0
11 Nov 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
Symposium on Operating Systems Principles (SOSP), 2023
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
2.3K
5,317
0
12 Sep 2023
RSGPT: A Remote Sensing Vision Language Model and Benchmark
Isprs Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023
Yuan Hu
Jianlong Yuan
Congcong Wen
Xiaonan Lu
Xiang Li
VLM
373
265
0
28 Jul 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Zilun Zhang
Tiancheng Zhao
Yulong Guo
Yuxiang Cai
DiffM
VLM
1.4K
205
0
20 Jun 2023
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Fan Liu
Delong Chen
Zhan-Rong Guan
Xiaocong Zhou
Jiale Zhu
Qiaolin Ye
Liyong Fu
Jun Zhou
VLM
676
579
0
19 Jun 2023
Improving CLIP Training with Language Rewrites
Neural Information Processing Systems (NeurIPS), 2023
Lijie Fan
Dilip Krishnan
Phillip Isola
Dina Katabi
Yonglong Tian
BDL
VLM
CLIP
537
276
0
31 May 2023
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ning Ding
Yulin Chen
Bokai Xu
Yujia Qin
Zhi Zheng
Shengding Hu
Zhiyuan Liu
Maosong Sun
Bowen Zhou
ALM
477
826
0
23 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM
VLM
1.9K
3,275
0
11 May 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang
Jinrui Zhang
Junjie Fei
Hao Zheng
Yunlong Tang
Zhe Li
Mingqi Gao
Shanshan Zhao
MLLM
589
131
0
04 May 2023
DataComp: In search of the next generation of multimodal datasets
Neural Information Processing Systems (NeurIPS), 2023
S. Gadre
Gabriel Ilharco
Alex Fang
J. Hayase
Georgios Smyrnis
...
A. Dimakis
J. Jitsev
Y. Carmon
Vaishaal Shankar
Ludwig Schmidt
VLM
822
659
0
27 Apr 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
International Conference on Learning Representations (ICLR), 2023
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM
MLLM
607
3,021
0
20 Apr 2023
Visual Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
1.4K
9,060
0
17 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
1.6K
7,784
0
30 Jan 2023
Reproducible scaling laws for contrastive language-image learning
Computer Vision and Pattern Recognition (CVPR), 2022
Mehdi Cherti
Romain Beaumont
Ross Wightman
Mitchell Wortsman
Gabriel Ilharco
Cade Gordon
Christoph Schuhmann
Ludwig Schmidt
J. Jitsev
VLM
CLIP
733
1,326
0
14 Dec 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022
Yangfan Zhan
Zhitong Xiong
Yuan. Yuan
308
215
0
23 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Neural Information Processing Systems (NeurIPS), 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
1.5K
4,964
0
16 Oct 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
International Conference on Learning Representations (ICLR), 2022
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM
VLM
998
963
0
14 Sep 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
913
1,699
0
04 May 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2021
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
362
198
0
21 Apr 2022
Training language models to follow instructions with human feedback
Neural Information Processing Systems (NeurIPS), 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
2.4K
19,843
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
International Conference on Machine Learning (ICML), 2022
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
1.5K
6,390
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
2.8K
17,183
0
28 Jan 2022
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLM
MLLM
CLIP
995
1,808
0
03 Nov 2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
1.1K
149
0
22 Sep 2021
LoRA: Low-Rank Adaptation of Large Language Models
International Conference on Learning Representations (ICLR), 2021
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
1.9K
17,979
0
17 Jun 2021
Scaling Vision with Sparse Mixture of Experts
Neural Information Processing Systems (NeurIPS), 2021
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
492
976
0
10 Jun 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
IEEE International Conference on Computer Vision (ICCV), 2021
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
1.1K
1,535
0
01 Apr 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL
AI4CE
566
1,900
0
18 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
International Conference on Machine Learning (ICML), 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
2.2K
47,325
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
International Conference on Machine Learning (ICML), 2021
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
1.6K
5,306
0
11 Feb 2021
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
2.4K
57,120
0
28 May 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
2.3K
7,549
0
23 Jan 2020
Exploring Models and Data for Remote Sensing Image Caption Generation
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2017
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
307
665
0
21 Dec 2017
Functional Map of the World
Gordon A. Christie
Neil Fendley
James Wilson
R. Mukherjee
VGen
578
505
0
21 Nov 2017
EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification
P. Helber
B. Bischke
Andreas Dengel
Damian Borth
741
2,563
0
31 Aug 2017
PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval
Isprs Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2017
Weixun Zhou
Shawn D. Newsam
Congmin Li
Z. Shao
504
528
0
11 Jun 2017
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
569
2,692
0
01 Mar 2017
AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2016
Gui-Song Xia
Jingwen Hu
Fan Hu
Baoguang Shi
X. Bai
Yanfei Zhong
Liangpei Zhang
399
2,130
0
18 Aug 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
3.5K
6,424
0
23 Feb 2016
ImageNet Large Scale Visual Recognition Challenge
International Journal of Computer Vision (IJCV), 2014
Olga Russakovsky
Gaowen Liu
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
3.7K
42,317
0
01 Sep 2014
Microsoft COCO: Common Objects in Context
European Conference on Computer Vision (ECCV), 2014
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
27.3K
51,996
0
01 May 2014
1
Page 1 of 1