Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

Computer Vision and Pattern Recognition (CVPR), 2020

3 April 2020

Papers citing "Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval"

50 / 246 papers shown

Beyond Generation: Multi-Hop Reasoning for Factual Accuracy in Vision-Language Models

Shamima Hossain

LRM

176

25 Nov 2025

LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight

121

25 Nov 2025

GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

408

19 Nov 2025

Real-World Adverse Weather Image Restoration via Dual-Level Reinforcement Learning with High-Quality Cold Start

132

07 Nov 2025

Instance-Level Composed Image Retrieval

163

29 Oct 2025

FineVision: Open Data Is All You Need

Aritra Roy Gosthipaty

Andrés Marafioti

VLM

195

20 Oct 2025

CaMiT: A Time-Aware Car Model Dataset for Classification and Generation

Frédéric LIN

Biruk Abere Ambaw

Adrian Daniel Popescu

288

20 Oct 2025

An Experimental Study of Real-Life LLM-Proposed Performance Improvements

Lirong Yi

Gregory Gay

Philipp Leitner

17 Oct 2025

EgMM-Corpus: A Multimodal Vision-Language Dataset for Egyptian Culture

17 Oct 2025

Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering

202

16 Oct 2025

Seeing and Knowing in the Wild: Open-domain Visual Entity Recognition with Large-scale Knowledge Graphs via Contrastive Learning

137

15 Oct 2025

Where on Earth? A Vision-Language Benchmark for Probing Model Geolocation Skills Across Scales

...

102

13 Oct 2025

Instance-Level Generation for Representation Learning

Yankun Wu

Zakaria Laskar

Giorgos Kordopatis-Zilos

Noa Garcia

Giorgos Tolias

141

10 Oct 2025

MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning

Tajamul Ashraf

Umair Nawaz

Abdelrahman M. Shaker

227

09 Oct 2025

The Overlooked Value of Test-time Reference Sets in Visual Place Recognition

113

04 Oct 2025

UniMapGen: A Generative Framework for Large-Scale Map Construction from Multi-modal Data

156

26 Sep 2025

EfficientDepth: A Fast and Detail-Preserving Monocular Depth Estimation Model

182

26 Sep 2025

MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks

144

18 Sep 2025

Improving Alignment in LVLMs with Debiased Self-Judgment

217

28 Aug 2025

Assessing the Geolocation Capabilities, Limitations and Societal Risks of Generative Vision-Language Models

27 Aug 2025

Can VLMs Recall Factual Associations From Visual References?

22 Aug 2025

PCHands: PCA-based Hand Pose Synergy Representation on Manipulators with N-DoF

110

11 Aug 2025

Large Language Models Facilitate Vision Reflection in Image Classification

02 Aug 2025

Meta CLIP 2: A Worldwide Scaling Recipe

...

369

29 Jul 2025

ReasonVQA: A Multi-hop Reasoning Benchmark with Structural Knowledge for Visual Question Answering

189

22 Jul 2025

UniLGL: Learning Uniform Place Recognition for FOV-limited/Panoramic LiDAR Global Localization

253

16 Jul 2025

Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

155

09 Jun 2025

Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization

332

04 Jun 2025

UAVPairs: A Challenging Benchmark for Match Pair Retrieval of Large-scale UAV Images

133

28 May 2025

VIBE: Vector Index Benchmark for Embeddings

343

23 May 2025

Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs

474

21 May 2025

Beginning with You: Perceptual-Initialization Improves Vision-Language Representation and Alignment

279

20 May 2025

Synthetic History: Evaluating Visual Representations of the Past in Diffusion Models

Maria-Teresa De Rosa Palmini

Eva Cetinic

262

18 May 2025

Artifacts of Idiosyncracy in Global Street View Data

Conference on Fairness, Accountability and Transparency (FAccT), 2025

184

16 May 2025

StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation

Daniel A. P. Oliveira

David Martins de Matos

VGen

241

15 May 2025

OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

351

10 May 2025

Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval

Computer Vision and Pattern Recognition (CVPR), 2025

263

16 Apr 2025

MIEB: Massive Image Embedding Benchmark

491

14 Apr 2025

Evolved Hierarchical Masking for Self-Supervised Learning

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

Zhanzhou Feng

Shiliang Zhang

370

12 Apr 2025

Boosting multi-demographic federated learning for chest radiograph analysis using general-purpose self-supervised representations

Soroosh Tayebi Arasteh

336

11 Apr 2025

Taxonomy-Aware Evaluation of Vision-Language Models

Computer Vision and Pattern Recognition (CVPR), 2025

Vésteinn Snæbjarnarson

283

07 Apr 2025

LOCORE: Image Re-ranking with Long-Context Sequence Modeling

Computer Vision and Pattern Recognition (CVPR), 2025

Giorgos Kordopatis-Zilos

Giorgos Tolias

Vicente Ordonez

264

27 Mar 2025

377

26 Mar 2025

Distilling Monocular Foundation Model for Fine-grained Depth Completion

Computer Vision and Pattern Recognition (CVPR), 2025

291

21 Mar 2025

Prototype Perturbation for Relaxing Alignment Constraints in Backward-Compatible Learning

334

19 Mar 2025

RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment

450

18 Mar 2025

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

296

17 Mar 2025

Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization

267

10 Mar 2025

Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search

Daniel de Souza Severo

294

16 Jan 2025

MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training

291

13 Jan 2025