Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2104.02096
Cited By
Compressing Visual-linguistic Model via Knowledge Distillation
5 April 2021
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lijuan Wang
Yezhou Yang
Zicheng Liu
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Compressing Visual-linguistic Model via Knowledge Distillation"
24 / 24 papers shown
Title
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via
D
\mathbf{\texttt{D}}
D
ual-
H
\mathbf{\texttt{H}}
H
ead
O
\mathbf{\texttt{O}}
O
ptimization
Seongjae Kang
Dong Bok Lee
Hyungjoon Jang
Sung Ju Hwang
VLM
38
0
0
12 May 2025
Platonic Grounding for Efficient Multimodal Language Models
Moulik Choraria
Xinbo Wu
Akhil Bhimaraju
Nitesh Sekhar
Yue Wu
Xu Zhang
Prateek Singhal
L. Varshney
54
0
0
27 Apr 2025
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
Haicheng Wang
Zhemeng Yu
Gabriele Spadaro
Chen Ju
Victor Quétu
Enzo Tartaglione
Enzo Tartaglione
VLM
97
3
0
05 Jan 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao
Shiqian Su
X. Zhu
Chenyu Zhang
Zhe Chen
...
Wenhai Wang
Lewei Lu
Gao Huang
Yu Qiao
Jifeng Dai
MLLM
VLM
102
2
0
20 Dec 2024
CLIP-PING: Boosting Lightweight Vision-Language Models with Proximus Intrinsic Neighbors Guidance
Chu Myaet Thwal
Ye Lin Tun
Minh N. H. Nguyen
Eui-nam Huh
Choong Seon Hong
VLM
74
0
0
05 Dec 2024
Revisiting Prompt Pretraining of Vision-Language Models
Zhenyuan Chen
Lingfeng Yang
Shuo Chen
Zhaowei Chen
Jiajun Liang
Xiang Li
MLLM
VPVLM
VLM
38
1
0
10 Sep 2024
TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training
Chaoya Jiang
Wei Ye
Haiyang Xu
Qinghao Ye
Mingshi Yan
Ji Zhang
Shikun Zhang
CLIP
VLM
14
4
0
14 Dec 2023
Linear Alignment of Vision-language Models for Image Captioning
Fabian Paischer
M. Hofmarcher
Sepp Hochreiter
Thomas Adler
CLIP
VLM
42
0
0
10 Jul 2023
DiffCap: Exploring Continuous Diffusion on Image Captioning
Yufeng He
Zefan Cai
Xu Gan
Baobao Chang
DiffM
21
5
0
20 May 2023
SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification
Fang Peng
Xiaoshan Yang
Linhui Xiao
Yaowei Wang
Changsheng Xu
VLM
22
42
0
28 Nov 2022
DL4SciVis: A State-of-the-Art Survey on Deep Learning for Scientific Visualization
Chaoli Wang
J. Han
28
36
0
13 Apr 2022
Geographic Adaptation of Pretrained Language Models
Valentin Hofmann
Goran Glavavs
Nikola Ljubevsić
J. Pierrehumbert
Hinrich Schütze
VLM
21
16
0
16 Mar 2022
A Survey of Vision-Language Pre-Trained Models
Yifan Du
Zikang Liu
Junyi Li
Wayne Xin Zhao
VLM
24
179
0
18 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
212
0
18 Feb 2022
Cross-modal Contrastive Distillation for Instructional Activity Anticipation
Zhengyuan Yang
Jingen Liu
Jing-ling Huang
Xiaodong He
Tao Mei
Chenliang Xu
Jiebo Luo
14
6
0
18 Jan 2022
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
19
86
0
09 Dec 2021
Playing Lottery Tickets with Vision and Language
Zhe Gan
Yen-Chun Chen
Linjie Li
Tianlong Chen
Yu Cheng
Shuohang Wang
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
101
53
0
23 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,764
0
24 Feb 2021
SEED: Self-supervised Distillation For Visual Representation
Zhiyuan Fang
Jianfeng Wang
Lijuan Wang
Lei Zhang
Yezhou Yang
Zicheng Liu
SSL
231
190
0
12 Jan 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
252
157
0
02 Jan 2021
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
221
197
0
07 Feb 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
926
0
24 Sep 2019
Weakly Supervised Attention Learning for Textual Phrases Grounding
Zhiyuan Fang
Shu Kong
Tianshu Yu
Yezhou Yang
17
12
0
01 May 2018
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
268
10,214
0
16 Nov 2016
1