Learning Semantic Concepts and Order for Image and Sentence Matching

6 December 2017

Yan Huang

Qi Wu

Liang Wang

VLM

Papers citing "Learning Semantic Concepts and Order for Image and Sentence Matching"

50 / 93 papers shown

Category-level Text-to-Image Retrieval Improved: Bridging the Domain Gap with Diffusion Models and Vision Encoders

100

29 Aug 2025

Aligning Information Capacity Between Vision and Language via Dense-to-Sparse Feature Distillation for Image-Text Matching

396

19 Mar 2025

Cross-Modal Pre-Aligned Method with Global and Local Information for Remote-Sensing Image and Text Retrieval

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024

254

22 Nov 2024

GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning

IEEE Transactions on Image Processing (TIP), 2024

Haiwen Diao

Ying Zhang

Shang Gao

Jiawen Zhu

Long Chen

Huchuan Lu

229

20 Oct 2024

GraspMamba: A Mamba-based Language-driven Grasp Detection Framework with Hierarchical Feature Learning

Huy Hoang Nguyen

An Vuong

Anh Nguyen

Ian Reid

Minh Nhat Vu

Mamba

253

22 Sep 2024

Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning

ACM Multimedia (MM), 2024

277

01 Aug 2024

Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching

Xuri Ge

Joemon M. Jose

234

05 Jun 2024

Transcending Fusion: A Multi-Scale Alignment Method for Remote Sensing Image-Text Retrieval

248

29 May 2024

Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching

Huchuan Lu

311

28 Apr 2024

3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting

299

26 Apr 2024

CFIR: Fast and Effective Long-Text To Image Retrieval for Large Corpora

336

23 Feb 2024

Active Mining Sample Pair Semantics for Image-text Matching

211

09 Nov 2023

Vision-Language Dataset Distillation

456

15 Aug 2023

Improved Probabilistic Image-Text Representations

International Conference on Learning Representations (ICLR), 2023

Sanghyuk Chun

VLM

604

29 May 2023

Vision-Language Models in Remote Sensing: Current Progress and Future Trends

IEEE Geoscience and Remote Sensing Magazine (GRSM), 2023

Xiao Xiang Zhu

358

165

09 May 2023

Learning Bottleneck Concepts in Image Classification

Computer Vision and Pattern Recognition (CVPR), 2023

260

20 Apr 2023

Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens

Computer Vision and Pattern Recognition (CVPR), 2023

Jianbo Yuan

Hongxia Yang

236

27 Mar 2023

Plug-and-Play Regulators for Image-Text Matching

IEEE Transactions on Image Processing (IEEE TIP), 2023

Huchuan Lu

222

23 Mar 2023

LIMITR: Leveraging Local Information for Medical Image-Text Representation

IEEE International Conference on Computer Vision (ICCV), 2023

Gefen Dawidowicz

Elad Hirsch

A. Tal

191

21 Mar 2023

Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening

Min Zhang

203

14 Mar 2023

Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis

251

11 Feb 2023

HierVL: Learning Hierarchical Video-Language Embeddings

Computer Vision and Pattern Recognition (CVPR), 2023

447

05 Jan 2023

Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing

197

12 Dec 2022

Improving Cross-Modal Retrieval with Set of Diverse Embeddings

Computer Vision and Pattern Recognition (CVPR), 2022

Dongwon Kim

Nam-Won Kim

Suha Kwak

531

30 Nov 2022

Dissecting Deep Metric Learning Losses for Image-Text Retrieval

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

Hong Xuan

Xi Chen

145

21 Oct 2022

Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

222

17 Oct 2022

Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning

Neural Information Processing Systems (NeurIPS), 2022

318

218

12 Oct 2022

Learning to embed semantic similarity for joint image-text retrieval

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

Noam Malali

Y. Keller

214

07 Oct 2022

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval

European Conference on Computer Vision (ECCV), 2022

Errui Ding

Jingdong Wang

199

21 Aug 2022

Boosting Video-Text Retrieval with Explicit High-Level Semantics

ACM Multimedia (ACM MM), 2022

Jungong Han

Errui Ding

227

08 Aug 2022

Intra-Modal Constraint Loss For Image-Text Retrieval

IEEE International Conference on Image Processing (ICIP), 2022

142

11 Jul 2022

HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval

227

24 May 2022

Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2021

Xian Sun

260

186

21 Apr 2022

Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information

IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022

Xian Sun

241

174

21 Apr 2022

Visual Attention Methods in Deep Learning: An In-Depth Survey

Information Fusion (Inf. Fusion), 2022

Saeed Anwar

346

249

16 Apr 2022

ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO

European Conference on Computer Vision (ECCV), 2022

1.5K

07 Apr 2022

Text2Pos: Text-to-Point-Cloud Cross-Modal Localization

Computer Vision and Pattern Recognition (CVPR), 2022

Manuel Kolmet

Qunjie Zhou

Aljosa Osep

Laura Leal-Taixe

297

28 Mar 2022

LILE: Look In-Depth before Looking Elsewhere -- A Dual Attention Network using Transformers for Cross-Modal Information Retrieval in Histopathology Archives

International Conference on Medical Imaging with Deep Learning (MIDL), 2022

Danial Maleki

H. R. Tizhoosh

MedIm

221

02 Mar 2022

Auxiliary Cross-Modal Representation Learning with Triplet Loss Functions for Online Handwriting Recognition

IEEE Access, 2022

Christopher Mutschler

445

16 Feb 2022

Multi-Modal Knowledge Graph Construction and Application: A Survey

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022

Zhixu Li

211

238

11 Feb 2022

Semantic Communications: Principles and Challenges

507

418

30 Dec 2021

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

459

06 Oct 2021

Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval

ACM Multimedia (ACM MM), 2021

173

05 Aug 2021

Semantically Self-Aligned Network for Text-to-Image Part-aware Person Re-identification

335

226

27 Jul 2021

A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval

New Trends in Software Methodologies, Tools and Techniques (TSMTT), 2021

Manh-Duy Nguyen

Binh T. Nguyen

C. Gurrin

04 Jun 2021

Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval

International Conference on Machine Learning and Applications (ICMLA), 2021

K. Ueki

260

16 May 2021

Discrete-continuous Action Space Policy Gradient-based Attention for Image-Text Matching

Computer Vision and Pattern Recognition (CVPR), 2021

Shiyang Yan

Li Yu

Yuan Xie

260

21 Apr 2021

Cross-Modal Retrieval Augmentation for Multi-Modal Classification

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Shir Gur

Natalia Neverova

C. Stauffer

Ser-Nam Lim

Douwe Kiela

A. Reiter

249

16 Apr 2021

Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval

140

29 Mar 2021

An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information

AAAI Conference on Artificial Intelligence (AAAI), 2021

Xuanjing Huang

173

21 Mar 2021