Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.08784
Cited By
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
16 March 2021
Siqi Sun
Yen-Chun Chen
Linjie Li
Shuohang Wang
Yuwei Fang
Jingjing Liu
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval"
12 / 12 papers shown
Title
Towards Fast and Accurate Image-Text Retrieval with Self-Supervised Fine-Grained Alignment
Jiamin Zhuang
Jing Yu
Yang Ding
Xiangyang Qu
Yue Hu
19
9
0
27 Aug 2023
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval
Ziyang Luo
Pu Zhao
Can Xu
Xiubo Geng
Tao Shen
Chongyang Tao
Jing Ma
Qingwen Lin
Daxin Jiang
VLM
CLIP
17
3
0
06 Feb 2023
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement
Zhi-Qi Cheng
Qianwen Dai
Siyao Li
Teruko Mitamura
Alexander G. Hauptmann
16
34
0
18 Aug 2022
HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval
Feilong Chen
Xiuyi Chen
Jiaxin Shi
Duzhen Zhang
Jianlong Chang
Qi Tian
VLM
CLIP
32
6
0
24 May 2022
COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval
Haoyu Lu
Nanyi Fei
Yuqi Huo
Yizhao Gao
Zhiwu Lu
Jiaxin Wen
CLIP
VLM
19
54
0
15 Apr 2022
Vision-and-Language Pretrained Models: A Survey
Siqu Long
Feiqi Cao
S. Han
Haiqing Yang
VLM
16
63
0
15 Apr 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
35
86
0
14 Feb 2022
VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling
Tsu-jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
W. Wang
Lijuan Wang
Zicheng Liu
VLM
34
216
0
24 Nov 2021
Text-Based Person Search with Limited Data
Xiaoping Han
Sen He
Li Zhang
Tao Xiang
8
88
0
20 Oct 2021
AliMe MKG: A Multi-modal Knowledge Graph for Live-streaming E-commerce
Guohai Xu
Hehong Chen
Feng-Lin Li
Fu Sun
Yunzhou Shi
Zhixiong Zeng
Wei Zhou
Zhongzhou Zhao
Ji Zhang
12
16
0
13 Sep 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong-jin Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
C. Miao
Houqiang Li
28
41
0
19 Apr 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
927
0
24 Sep 2019
1