2407.19651
Cited By
Bridging Compressed Image Latents and Multimodal Large Language Models
29 July 2024
Chia-Hao Kao
Cheng Chien
Yu-Jen Tseng
Yi-Hsin Chen
Alessandro Gnutti
Shao-Yuan Lo
Wen-Hsiao Peng
Riccardo Leonardi
Papers citing
"Bridging Compressed Image Latents and Multimodal Large Language Models"
6 / 6 papers shown
Title
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Qinghao Ye
Haiyang Xu
Jiabo Ye
Mingshi Yan
Anwen Hu
Haowei Liu
Qi Qian
Ji Zhang
Fei Huang
Jingren Zhou
MLLM
VLM
116
367
0
07 Nov 2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
154
280
0
14 Oct 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Transformer-based Image Compression
Ming-Tse Lu
Peiyao Guo
Huiqing Shi
Chuntong Cao
Zhan Ma
ViT
44
100
0
12 Nov 2021
End-to-end Learning of Compressible Features
Saurabh Singh
Sami Abu-El-Haija
Nick Johnston
Johannes Ballé
Abhinav Shrivastava
G. Toderici
SSL
83
71
0
23 Jul 2020
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
229
74,467
0
18 May 2015