2109.04275
M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining
9 September 2021
Xiao Dong
Xunlin Zhan
Yangxin Wu
Yunchao Wei
Michael C. Kampffmeyer
Xiaoyong Wei
Minlong Lu
Yaowei Wang
Xiaodan Liang
Papers citing
"M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining"
18 / 18 papers shown
Code and Pixels: Multi-Modal Contrastive Pre-training for Enhanced Tabular Data Analysis
Kankana Roy
Lars Krämer
Sebastian Domaschke
Malik Haris
Roland Aydin
Fabian Isensee
Martin Held
38
0
0
13 Jan 2025
DiffCLIP: Few-shot Language-driven Multimodal Classifier
Jiaqing Zhang
Mingxiang Cao
Xue Yang
Kai Jiang
Yunsong Li
VLM
66
0
0
10 Dec 2024
ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval
Ruixiang Zhao
Jian Jia
Yan Li
Xuehan Bai
Quan Chen
Han Li
Peng Jiang
Xirong Li
28
0
0
06 Aug 2024
Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval
Yongchao Du
Min Wang
Wen-gang Zhou
Shuping Hui
Houqiang Li
21
10
0
03 Mar 2024
Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding
Yatong Bai
Utsav Garg
Apaar Shanker
Haoming Zhang
Samyak Parajuli
...
Eugenia D Fomitcheva
E. Branson
Aerin Kim
Somayeh Sojoudi
Kyunghyun Cho
11
2
0
09 Jan 2024
SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training
Kazem Meidani
Parshin Shojaee
Chandan K. Reddy
A. Farimani
18
18
0
03 Oct 2023
Training with Product Digital Twins for AutoRetail Checkout
Yue Yao
Xinyu Tian
Zhenghang Tang
Sujit Biswas
Huan Lei
Tom Gedeon
Liang Zheng
11
2
0
18 Aug 2023
Cross-Domain Product Representation Learning for Rich-Content E-Commerce
Xuehan Bai
Yan Li
Yong Cheng
Wenjie Yang
Quanming Chen
Han Li
11
2
0
10 Aug 2023
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
F. Liu
Delong Chen
Zhan-Rong Guan
Xiaocong Zhou
Jiale Zhu
Qiaolin Ye
Liyong Fu
Jun Zhou
VLM
66
188
0
19 Jun 2023
COURIER: Contrastive User Intention Reconstruction for Large-Scale Visual Recommendation
Jia-Qi Yang
Chen Dai
OU Dan
Dongshuai Li
Ju Huang
De-Chuan Zhan
Xiaoyi Zeng
Yang Yang
12
1
0
08 Jun 2023
UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning
Xiao Dong
Runhu Huang
Xiaoyong Wei
Zequn Jie
Jianxing Yu
Jian Yin
Xiaodan Liang
VLM
DiffM
26
1
0
01 Jun 2023
Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce
Yang Jin
Yongzhi Li
Zehuan Yuan
Yadong Mu
11
7
0
06 Apr 2023
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Xiao Wang
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE
VLM
24
195
0
20 Feb 2023
Construction and Applications of Billion-Scale Pre-Trained Multimodal Business Knowledge Graph
Shumin Deng
Chengming Wang
Zhoubo Li
Ningyu Zhang
Zelin Dai
...
Mosha Chen
Jiaoyan Chen
Jeff Z. Pan
Bryan Hooi
Huajun Chen
VLM
13
20
0
30 Sep 2022
Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval
Xiao Dong
Xunlin Zhan
Yunchao Wei
Xiaoyong Wei
Yaowei Wang
Minlong Lu
Xiaochun Cao
Xiaodan Liang
19
11
0
17 Jun 2022
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
1,077
0
17 Feb 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
401
594
0
21 Jul 2020