ResearchTrend.AI
e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce

1 July 2022
Wonyoung Shin
Jonghun Park
Taekang Woo
Yongwoo Cho
Kwangjin Oh
Hwanjun Song
    VLM

Papers citing "e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce"

9 / 9 papers shown
DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data
Junjie Wu
Jiangtao Xie
Zhaolin Zhang
Qilong Wang
Q. Hu
P. Li
Sen Xu
VLM
39
0
0
02 Apr 2025
Multi-Modality Transformer for E-Commerce: Inferring User Purchase Intention to Bridge the Query-Product Gap
Srivatsa Mallapragada
Ying Xie
Varsha Rani Chawan
Zeyad Hailat
Yuanbo Wang
36
0
0
28 Jan 2025
Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment
Alvi Md Ishmam
Christopher Thomas
AAML
114
3
0
23 Nov 2024
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
F. Liu
Delong Chen
Zhan-Rong Guan
Xiaocong Zhou
Jiale Zhu
Qiaolin Ye
Liyong Fu
Jun Zhou
VLM
68
190
0
19 Jun 2023
VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval
Yansong Gong
Georgina Cosma
Axel Finke
ViT
28
2
0
13 Feb 2023
ViDT: An Efficient and Effective Fully Transformer-based Object Detector
Hwanjun Song
Deqing Sun
Sanghyuk Chun
Varun Jampani
Dongyoon Han
Byeongho Heo
Wonjae Kim
Ming-Hsuan Yang
83
76
0
08 Oct 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,689
0
11 Feb 2021
Large-Scale Training System for 100-Million Classification at Alibaba
Liuyihan Song
Pan Pan
Kang Zhao
Hao Yang
Yiming Chen
Yingya Zhang
Yinghui Xu
R. L. Jin
32
23
0
09 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
525
0
04 Feb 2021