Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.12902
Cited By
IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities
23 August 2024
Bin Wang
Chunyu Xie
Dawei Leng
Yuhui Yin
MLLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities"
4 / 4 papers shown
Title
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
36
0
0
08 May 2025
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
Hao Zhang
Hongyang Li
Feng Li
Tianhe Ren
Xueyan Zou
...
Shijia Huang
Jianfeng Gao
Lei Zhang
Chun-yue Li
Jianwei Yang
87
68
0
05 Dec 2023
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Qinghao Ye
Haiyang Xu
Jiabo Ye
Mingshi Yan
Anwen Hu
Haowei Liu
Qi Qian
Ji Zhang
Fei Huang
Jingren Zhou
MLLM
VLM
114
367
0
07 Nov 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
1