ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.04474
  4. Cited By
Vision Language Pre-training by Contrastive Learning with Cross-Modal
  Similarity Regulation

Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation

8 May 2023
Chaoya Jiang
Wei Ye
Haiyang Xu
Miang yan
Shikun Zhang
Jie Zhang
Fei Huang
    VLM
ArXivPDFHTML

Papers citing "Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation"

14 / 14 papers shown
Title
Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via
  Semantics Completion and Decomposition
Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
Daiqing Wu
Dongbao Yang
Huawen Shen
Can Ma
Yu Zhou
26
2
0
09 Jul 2024
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation
  Framework for Large Vision Language Models
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models
Chaoya Jiang
Wei Ye
Mengfan Dong
Hongrui Jia
Haiyang Xu
Mingshi Yan
Ji Zhang
Shikun Zhang
VLM
MLLM
29
15
0
24 Feb 2024
Improving the Robustness of Knowledge-Grounded Dialogue via Contrastive
  Learning
Improving the Robustness of Knowledge-Grounded Dialogue via Contrastive Learning
Jiaan Wang
Jianfeng Qu
Kexin Wang
Zhixu Li
Wen Hua
Ximing Li
An Liu
12
2
0
09 Jan 2024
TiMix: Text-aware Image Mixing for Effective Vision-Language
  Pre-training
TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training
Chaoya Jiang
Wei Ye
Haiyang Xu
Qinghao Ye
Mingshi Yan
Ji Zhang
Shikun Zhang
CLIP
VLM
11
4
0
14 Dec 2023
Hallucination Augmented Contrastive Learning for Multimodal Large
  Language Model
Hallucination Augmented Contrastive Learning for Multimodal Large Language Model
Chaoya Jiang
Haiyang Xu
Mengfan Dong
Jiaxing Chen
Wei Ye
Mingshi Yan
Qinghao Ye
Ji Zhang
Fei Huang
Shikun Zhang
VLM
11
51
0
12 Dec 2023
X-InstructBLIP: A Framework for aligning X-Modal instruction-aware
  representations to LLMs and Emergent Cross-modal Reasoning
X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning
Artemis Panagopoulou
Le Xue
Ning Yu
Junnan Li
Dongxu Li
Shafiq R. Joty
Ran Xu
Silvio Savarese
Caiming Xiong
Juan Carlos Niebles
VLM
MLLM
28
45
0
30 Nov 2023
BUS:Efficient and Effective Vision-language Pre-training with Bottom-Up
  Patch Summarization
BUS:Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization
Chaoya Jiang
Haiyang Xu
Wei Ye
Qinghao Ye
Chenliang Li
Mingshi Yan
Bin Bi
Shikun Zhang
Fei Huang
Songfang Huang
VLM
13
9
0
17 Jul 2023
Exploiting Pseudo Image Captions for Multimodal Summarization
Exploiting Pseudo Image Captions for Multimodal Summarization
Chaoya Jiang
Rui Xie
Wei Ye
Jinan Sun
Shikun Zhang
VLM
18
13
0
09 May 2023
Modeling Paragraph-Level Vision-Language Semantic Alignment for
  Multi-Modal Summarization
Modeling Paragraph-Level Vision-Language Semantic Alignment for Multi-Modal Summarization
Chenhao Cui
Xinnian Liang
Shuangzhi Wu
Zhoujun Li
23
3
0
24 Aug 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
How Much Can CLIP Benefit Vision-and-Language Tasks?
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
182
403
0
13 Jul 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
252
157
0
02 Jan 2021
Improved Baselines with Momentum Contrastive Learning
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
235
3,029
0
09 Mar 2020
1