RSGPT: A Remote Sensing Vision Language Model and Benchmark

ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023

Yuan Hu

Jianlong Yuan

Congcong Wen

Xiaonan Lu

Xiang Li

VLM

373

265

28 Jul 2023

RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023

1.4K

205

20 Jun 2023

RemoteCLIP: A Vision Language Foundation Model for Remote Sensing

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023

676

579

19 Jun 2023

Improving CLIP Training with Language Rewrites

Neural Information Processing Systems (NeurIPS), 2023

537

276

31 May 2023

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Zhiyuan Liu

Maosong Sun

Bowen Zhou

ALM

477

826

23 May 2023

InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

Neural Information Processing Systems (NeurIPS), 2023

1.9K

3,275

11 May 2023

Caption Anything: Interactive Image Description with Diverse Multimodal Controls

589

131

04 May 2023

DataComp: In search of the next generation of multimodal datasets

Neural Information Processing Systems (NeurIPS), 2023

...

822

659

27 Apr 2023

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

International Conference on Learning Representations (ICLR), 2023

607

3,021

20 Apr 2023

Visual Instruction Tuning

Neural Information Processing Systems (NeurIPS), 2023

1.4K

9,060

17 Apr 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

International Conference on Machine Learning (ICML), 2023

Silvio Savarese

1.6K

7,784

30 Jan 2023

Reproducible scaling laws for contrastive language-image learning

Computer Vision and Pattern Recognition (CVPR), 2022

733

1,326

14 Dec 2022

RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data

IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022

Yangfan Zhan

Zhitong Xiong

Yuan Yuan

308

215

23 Oct 2022

LAION-5B: An open large-scale dataset for training next generation image-text models

Neural Information Processing Systems (NeurIPS), 2022

...

1.5K

4,964

16 Oct 2022

PaLI: A Jointly-Scaled Multilingual Language-Image Model

International Conference on Learning Representations (ICLR), 2022

...

998

963

14 Sep 2022

CoCa: Contrastive Captioners are Image-Text Foundation Models

Mojtaba Seyedhosseini

Yonghui Wu

VLM CLIP OffRL

913

1,699

04 May 2022

Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2021

Xian Sun

362

198

21 Apr 2022

Training language models to follow instructions with human feedback

Neural Information Processing Systems (NeurIPS), 2022

Carroll L. Wainwright

...

2.4K

19,843

04 Mar 2022

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

International Conference on Machine Learning (ICML), 2022

1.5K

6,390

28 Jan 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

2.8K

17,183

28 Jan 2022

LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

995

1,808

03 Nov 2021

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

Sharan Narang

1.1K

149

22 Sep 2021

LoRA: Low-Rank Adaptation of Large Language Models

International Conference on Learning Representations (ICLR), 2021

OffRL AI4TS AI4CE ALM AIMat

1.9K

17,979

17 Jun 2021

Scaling Vision with Sparse Mixture of Experts

Neural Information Processing Systems (NeurIPS), 2021

492

976

10 Jun 2021

Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval

IEEE International Conference on Computer Vision (ICCV), 2021

1.1K

1,535

01 Apr 2021

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Zhengxiao Du

Yujie Qian

Xiao Liu

Ming Ding

566

1,900

18 Mar 2021

Learning Transferable Visual Models From Natural Language Supervision

International Conference on Machine Learning (ICML), 2021

...

2.2K

47,325

26 Feb 2021

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

International Conference on Machine Learning (ICML), 2021

1.6K

5,306

11 Feb 2021

Language Models are Few-Shot Learners

Neural Information Processing Systems (NeurIPS), 2020

...

2.4K

57,120

28 May 2020

Scaling Laws for Neural Language Models

2.3K

7,549

23 Jan 2020

Exploring Models and Data for Remote Sensing Image Caption Generation

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2017

Xiaoqiang Lu

Binqiang Wang

Xiangtao Zheng

Xuelong Li

307

665

21 Dec 2017

Functional Map of the World

578

505

21 Nov 2017

EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification

741

2,563

31 Aug 2017

PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval

ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2017

504

528

11 Jun 2017

Remote Sensing Image Scene Classification: Benchmark and State of the Art

Gong Cheng

Junwei Han

Xiaoqiang Lu

569

2,692

01 Mar 2017

AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification

IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2016

Yanfei Zhong

399

2,130

18 Aug 2016

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

...

Fei-Fei Li

3.5K

6,424

23 Feb 2016

ImageNet Large Scale Visual Recognition Challenge

International Journal of Computer Vision (IJCV), 2014

...

Li Fei-Fei

3.7K

42,317

01 Sep 2014

Microsoft COCO: Common Objects in Context

European Conference on Computer Vision (ECCV), 2014

Piotr Dollár

27.3K

51,996

01 May 2014