Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2202.10401
Cited By
Vision-Language Pre-Training with Triple Contrastive Learning
21 February 2022
Jinyu Yang
Jiali Duan
Son N. Tran
Yi Xu
Sampath Chanda
Liqun Chen
Belinda Zeng
Trishul M. Chilimbi
Junzhou Huang
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Vision-Language Pre-Training with Triple Contrastive Learning"
33 / 33 papers shown
Title
Automated Data Curation Using GPS & NLP to Generate Instruction-Action Pairs for Autonomous Vehicle Vision-Language Navigation Datasets
Guillermo Roque
Erika Maquiling
Jose Giovanni Tapia Lopez
Ross Greer
68
0
0
06 May 2025
RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings
A. Dhakal
S. Sastry
Subash Khanal
Adeel Ahmad
Eric Xing
Nathan Jacobs
48
0
0
27 Feb 2025
Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks
Hunmin Yang
Jongoh Jeong
Kuk-Jin Yoon
AAML
VLM
58
4
0
30 Jul 2024
WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting
Jingjing Wu
Zhengyao Fang
Pengyuan Lyu
Chengquan Zhang
Fanglin Chen
Guangming Lu
Wenjie Pei
47
2
0
28 Jul 2024
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
Hao Fang
Jiawei Kong
Wenbo Yu
Bin Chen
Jiawei Li
Hao Wu
Ke Xu
Ke Xu
AAML
VLM
30
13
0
08 Jun 2024
Effectiveness Assessment of Recent Large Vision-Language Models
Yao Jiang
Xinyu Yan
Ge-Peng Ji
Keren Fu
Meijun Sun
Huan Xiong
Deng-Ping Fan
Fahad Shahbaz Khan
27
14
0
07 Mar 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM
CLIP
62
12
0
05 Mar 2024
Convincing Rationales for Visual Question Answering Reasoning
Kun Li
G. Vosselman
Michael Ying Yang
34
1
0
06 Feb 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe
VLM
49
4
0
28 Dec 2023
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Bang-ju Yang
Fenglin Liu
X. Wu
Yaowei Wang
Xu Sun
Yuexian Zou
VLM
CLIP
22
13
0
25 Aug 2023
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen
Longteng Guo
Jianxiang Sun
Shuai Shao
Zehuan Yuan
Liang Lin
Dongyu Zhang
MLLM
VLM
MoE
43
9
0
23 Aug 2023
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models
Dong Lu
Zhiqiang Wang
Teng Wang
Weili Guan
Hongchang Gao
Feng Zheng
AAML
46
64
0
26 Jul 2023
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
Chunhui Zhang
Xin Sun
Li Liu
Yiqian Yang
Qiong Liu
Xiaoping Zhou
Yanfeng Wang
33
15
0
07 Jul 2023
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
Chia-Wen Kuo
Z. Kira
25
21
0
25 May 2023
Detecting and Grounding Multi-Modal Media Manipulation
Rui Shao
Tianxing Wu
Ziwei Liu
24
57
0
05 Apr 2023
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation
Yash J. Patel
Yusheng Xie
Yi Zhu
Srikar Appalaraju
R. Manmatha
19
4
0
07 Feb 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Yucheng Zhou
Guodong Long
25
22
0
26 Jan 2023
Fine-Grained Distillation for Long Document Retrieval
Yucheng Zhou
Tao Shen
Xiubo Geng
Chongyang Tao
Guodong Long
Can Xu
Daxin Jiang
RALM
11
28
0
20 Dec 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
27
2
0
27 Nov 2022
MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning
Zijia Zhao
Longteng Guo
Xingjian He
Shuai Shao
Zehuan Yuan
Jing Liu
16
8
0
09 Oct 2022
VLMAE: Vision-Language Masked Autoencoder
Su He
Taian Guo
Tao Dai
Ruizhi Qiao
Chen Wu
Xiujun Shu
Bohan Ren
VLM
16
11
0
19 Aug 2022
Masked Vision and Language Modeling for Multi-modal Representation Learning
Gukyeong Kwon
Zhaowei Cai
Avinash Ravichandran
Erhan Bas
Rahul Bhotika
Stefano Soatto
22
66
0
03 Aug 2022
Augmenting Vision Language Pretraining by Learning Codebook with Visual Semantics
Xiaoyuan Guo
Jiali Duan
C.-C. Jay Kuo
J. Gichoya
Imon Banerjee
VLM
14
1
0
31 Jul 2022
Counterfactually Measuring and Eliminating Social Bias in Vision-Language Pre-training Models
Yi Zhang
Junyan Wang
Jitao Sang
14
27
0
03 Jul 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
41
522
0
13 Jun 2022
Prefix Conditioning Unifies Language and Label Supervision
Kuniaki Saito
Kihyuk Sohn
X. Zhang
Chun-Liang Li
Chen-Yu Lee
Kate Saenko
Tomas Pfister
VLM
CLIP
25
16
0
02 Jun 2022
Single-Stream Multi-Level Alignment for Vision-Language Pretraining
Zaid Khan
B. Vijaykumar
Xiang Yu
S. Schulter
Manmohan Chandraker
Y. Fu
CLIP
VLM
20
16
0
27 Mar 2022
Multi-modal Alignment using Representation Codebook
Jiali Duan
Liqun Chen
Son Tran
Jinyu Yang
Yi Xu
Belinda Zeng
Trishul M. Chilimbi
19
66
0
28 Feb 2022
A Survey of Vision-Language Pre-Trained Models
Yifan Du
Zikang Liu
Junyi Li
Wayne Xin Zhao
VLM
15
179
0
18 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
211
0
18 Feb 2022
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
1,077
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
238
3,359
0
09 Mar 2020
1