Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2107.14572
Cited By
Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining
30 July 2021
Xunlin Zhan
Yangxin Wu
Xiao Dong
Yunchao Wei
Minlong Lu
Yichi Zhang
Hang Xu
Xiaodan Liang
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining"
27 / 27 papers shown
Title
CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation
Yuxuan Wang
Yijun Liu
Fei Yu
Chen Huang
Kexin Li
Zhiguo Wan
Wanxiang Che
VLM
CoGe
27
5
0
01 Jul 2024
Cross-Modal Learning for Anomaly Detection in Fused Magnesium Smelting Process: Methodology and Benchmark
Gaochang Wu
Yapeng Zhang
Lan Deng
Jingxin Zhang
Tianyou Chai
31
1
0
13 Jun 2024
Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning
Shuvendu Roy
Yasaman Parhizkar
Franklin Ogidi
Vahid Reza Khazaie
Michael Colacci
Ali Etemad
Elham Dolatabadi
Arash Afkanpour
VLM
35
1
0
11 Jun 2024
Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval
Yongchao Du
Min Wang
Wen-gang Zhou
Shuping Hui
Houqiang Li
27
10
0
03 Mar 2024
FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion
Xing Han
Huy Nguyen
Carl Harris
Nhat Ho
S. Saria
MoE
69
16
0
05 Feb 2024
CBVS: A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search Scenarios
Xiangshuo Qiao
Xianxin Li
Xiaozhe Qu
Jie M. Zhang
Yang Liu
Yu Luo
Cihang Jin
Jin Ma
VLM
18
0
0
19 Jan 2024
Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report Generation
Wenting Chen
Linlin Shen
Jingyang Lin
Jiebo Luo
Xiang Li
Yixuan Yuan
MedIm
10
10
0
13 Dec 2023
Modality-aware Transformer for Financial Time series Forecasting
Hajar Emami
Xuan-Hong Dang
Yousaf Shah
Petros Zerfos
AI4TS
21
0
0
02 Oct 2023
Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features
Alberto Baldrati
Marco Bertini
Tiberio Uricchio
A. Bimbo
CLIP
CoGe
11
28
0
22 Aug 2023
Training with Product Digital Twins for AutoRetail Checkout
Yue Yao
Xinyu Tian
Zhenghang Tang
Sujit Biswas
Huan Lei
Tom Gedeon
Liang Zheng
11
2
0
18 Aug 2023
CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation
Hongguang Zhu
Yunchao Wei
Xiaodan Liang
Chunjie Zhang
Yao-Min Zhao
VLM
27
26
0
14 Aug 2023
Cross-Domain Product Representation Learning for Rich-Content E-Commerce
Xuehan Bai
Yan Li
Yong Cheng
Wenjie Yang
Quanming Chen
Han Li
11
2
0
10 Aug 2023
LRVS-Fashion: Extending Visual Search with Referring Instructions
Simon Lepage
Jérémie Mary
David Picard
18
1
0
05 Jun 2023
UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning
Xiao Dong
Runhu Huang
Xiaoyong Wei
Zequn Jie
Jianxing Yu
Jian Yin
Xiaodan Liang
VLM
DiffM
26
1
0
01 Jun 2023
Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce
Yang Jin
Yongzhi Li
Zehuan Yuan
Yadong Mu
11
13
0
06 Apr 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
35
46
0
21 Mar 2023
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Xiao Wang
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE
VLM
24
195
0
20 Feb 2023
Transformadores: Fundamentos teoricos y Aplicaciones
J. D. L. Torre
60
0
0
18 Feb 2023
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications
Muhammad Arslan Manzoor
S. Albarri
Ziting Xian
Zaiqiao Meng
Preslav Nakov
Shangsong Liang
AI4TS
21
26
0
01 Feb 2023
Self-Contained Entity Discovery from Captioned Videos
M. Ayoughi
P. Mettes
Paul T. Groth
18
2
0
13 Aug 2022
A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval
Alex Falcon
G. Serra
O. Lanz
VGen
26
25
0
03 Aug 2022
Adaptive Multi-view Rule Discovery for Weakly-Supervised Compatible Products Prediction
Rongzhi Zhang
Rebecca West
Xiquan Cui
Chao Zhang
19
6
0
28 Jun 2022
Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval
Xiao Dong
Xunlin Zhan
Yunchao Wei
Xiaoyong Wei
Yaowei Wang
Minlong Lu
Xiaochun Cao
Xiaodan Liang
19
11
0
17 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
41
518
0
13 Jun 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
24
86
0
14 Feb 2022
FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao
Runhu Huang
Lu Hou
Guansong Lu
Minzhe Niu
Hang Xu
Xiaodan Liang
Zhenguo Li
Xin Jiang
Chunjing Xu
VLM
CLIP
19
610
0
09 Nov 2021
M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining
Xiao Dong
Xunlin Zhan
Yangxin Wu
Yunchao Wei
Michael C. Kampffmeyer
Xiaoyong Wei
Minlong Lu
Yaowei Wang
Xiaodan Liang
25
36
0
09 Sep 2021
1