ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2208.12526
  4. Cited By
Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning

Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning

26 August 2022
Yabing Wang
Jianfeng Dong
Tianxiang Liang
Minsong Zhang
Rui Cai
Xun Wang
ArXivPDFHTML

Papers citing "Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning"

9 / 9 papers shown
Title
From Mapping to Composing: A Two-Stage Framework for Zero-shot Composed Image Retrieval
From Mapping to Composing: A Two-Stage Framework for Zero-shot Composed Image Retrieval
Yabing Wang
Zhuotao Tian
Qingpei Guo
Zheng Qin
Sanping Zhou
Ming Yang
Le Wang
64
0
0
25 Apr 2025
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual
  Captioning
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Bang-ju Yang
Fenglin Liu
X. Wu
Yaowei Wang
Xu Sun
Yuexian Zou
VLM
CLIP
22
13
0
25 Aug 2023
From Region to Patch: Attribute-Aware Foreground-Background Contrastive
  Learning for Fine-Grained Fashion Retrieval
From Region to Patch: Attribute-Aware Foreground-Background Contrastive Learning for Fine-Grained Fashion Retrieval
Jianfeng Dong
Xi Peng
Zhe Ma
Daizong Liu
Xiaoye Qu
Xun Yang
Jixiang Zhu
Baolong Liu
10
12
0
17 May 2023
Video and Text Matching with Conditioned Embeddings
Video and Text Matching with Conditioned Embeddings
Ameen Ali
Idan Schwartz
Tamir Hazan
Lior Wolf
65
13
0
21 Oct 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip
  Retrieval
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
309
778
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Multi-modal Transformer for Video Retrieval
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
410
594
0
21 Jul 2020
Word Translation Without Parallel Data
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
MarcÁurelio Ranzato
Ludovic Denoyer
Hervé Jégou
165
1,634
0
11 Oct 2017
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,196
0
16 Nov 2016
1