A Comprehensive Survey on Cross-modal Retrieval


21 July 2016
K. Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang

Papers citing "A Comprehensive Survey on Cross-modal Retrieval"

29 / 29 papers shown
MMMORRF: Multimodal Multilingual Modularized Reciprocal Rank Fusion
Saron Samuel
Dan DeGenaro
Jimena Guallar-Blasco
Kate Sanders
Oluwaseun Eisape
...
David Etter
Efsun Kayi
Matthew Wiesner
Kenton W. Murray
Reno Kriz
83
0
0
26 Mar 2025
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval
Yu Zhang
Shutong Qiao
Jiaqi Zhang
Tzu-Heng Lin
Chen Gao
Y. Li
LM&Ro
LM&MA
87
1
0
07 Mar 2025
Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval
Jiaxing Li
Lin Jiang
Zeqi Ma
Kaihang Jiang
Xiaozhao Fang
Jie Wen
33
0
0
27 Feb 2025
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs
Xin Zhang
Yanzhao Zhang
Wen Xie
Mingxin Li
Ziqi Dai
Dingkun Long
Pengjun Xie
Meishan Zhang
Wenjie Li
M. Zhang
116
7
0
22 Dec 2024
Adversarial Hubness in Multi-Modal Retrieval
Tingwei Zhang
Fnu Suya
Rishi Jha
Collin Zhang
Vitaly Shmatikov
AAML
83
1
0
18 Dec 2024
VaLiD: Mitigating the Hallucination of Large Vision Language Models by Visual Layer Fusion Contrastive Decoding
Jiaqi Wang
Yifei Gao
Jitao Sang
MLLM
121
2
0
24 Nov 2024
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
Hao Fang
Jiawei Kong
Wenbo Yu
Bin Chen
Jiawei Li
Hao Wu
Ke Xu
Ke Xu
AAML
VLM
40
13
0
08 Jun 2024
Synaptic Plasticity Models and Bio-Inspired Unsupervised Deep Learning: A Survey
Gabriele Lagani
Fabrizio Falchi
Claudio Gennaro
Giuseppe Amato
AAML
33
6
0
30 Jul 2023
CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Structure for Vision-Language Retrieval
Yang Yang
Zhongtian Fu
Xiangyu Wu
Wenjie Li
VLM
21
1
0
15 Apr 2023
Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study
Mariya Hendriksen
Svitlana Vakulenko
E. Kuiper
Maarten de Rijke
28
5
0
12 Jan 2023
What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes
Shivam Sharma
Siddhant Agarwal
Tharun Suresh
Preslav Nakov
Md. Shad Akhtar
Tanmoy Chakraborty
VLM
20
18
0
01 Dec 2022
Information-Theoretic Hashing for Zero-Shot Cross-Modal Retrieval
Yufeng Shi
Shujian Yu
Duanquan Xu
Xinge You
26
1
0
26 Sep 2022
Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides
Dong Won Lee
Chaitanya Ahuja
Paul Pu Liang
Sanika Natu
Louis-Philippe Morency
15
7
0
17 Aug 2022
Debiased Cross-modal Matching for Content-based Micro-video Background Music Recommendation
Jin Yi
Zhenzhong Chen
33
1
0
07 Aug 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
47
6
0
09 Jul 2022
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang
Bin Guo
Y. Zeng
Yasan Ding
Chen Qiu
Ying Zhang
Li Yao
Zhiwen Yu
27
2
0
02 Jul 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
212
0
18 Feb 2022
Achieving Human Parity on Visual Question Answering
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
...
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
24
12
0
17 Nov 2021
Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining
Xunlin Zhan
Yangxin Wu
Xiao Dong
Yunchao Wei
Minlong Lu
Yichi Zhang
Hang Xu
Xiaodan Liang
ViT
21
64
0
30 Jul 2021
Graph Pattern Loss based Diversified Attention Network for Cross-Modal Retrieval
Xueying Chen
Rong Zhang
Yibing Zhan
19
0
0
25 Jun 2021
Cross-Modal and Multimodal Data Analysis Based on Functional Mapping of Spectral Descriptors and Manifold Regularization
M. Behmanesh
Peyman Adibi
Jocelyn Chanussot
Sayyed Mohammad Saeed Ehsani
24
2
0
12 May 2021
Person Retrieval in Surveillance Using Textual Query: A Review
Hiren J. Galiyawala
M. Raval
23
11
0
06 May 2021
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei-Neng Chen
Weiping Wang
Li Liu
M. Lew
VLM
110
31
0
16 Oct 2020
From Text to Sound: A Preliminary Study on Retrieving Sound Effects to Radio Stories
Songwei Ge
Curtis Xuan
Ruihua Song
Chao Zou
Wei Liu
Jin Zhou
6
2
0
20 Aug 2019
Beyond Intra-modality: A Survey of Heterogeneous Person Re-identification
Zheng Wang
Zhixiang Wang
Yinqiang Zheng
Yang Wu
Wenjun Zeng
Shin'ichi Satoh
19
63
0
24 May 2019
Semi-supervised Deep Representation Learning for Multi-View Problems
Vahid Noroozi
S. Bahaadini
Lei Zheng
Sihong Xie
Weixiang Shao
Philip S. Yu
14
15
0
11 Nov 2018
Dense Multimodal Fusion for Hierarchically Joint Representation
Di Hu
Feiping Nie
Xuelong Li
16
43
0
08 Oct 2018
HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval
Xi Zhang
Siyu Zhou
Jiashi Feng
Hanjiang Lai
Bo Li
Yan Pan
Jian Yin
Shuicheng Yan
GAN
24
55
0
26 Nov 2017
A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics
Yunchao Gong
Qifa Ke
Michael Isard
Svetlana Lazebnik
3DV
60
584
0
18 Dec 2012