Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2308.13566
Cited By
MLLM-DataEngine: An Iterative Refinement Approach for MLLM
25 August 2023
Zhiyuan Zhao
Linke Ouyang
Bin Wang
Siyuan Huang
Pan Zhang
Xiao-wen Dong
Jiaqi Wang
Conghui He
MLLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MLLM-DataEngine: An Iterative Refinement Approach for MLLM"
8 / 8 papers shown
Title
Bridging Writing Manner Gap in Visual Instruction Tuning by Creating LLM-aligned Instructions
Dong Jing
Nanyi Fei
Zhiwu Lu
36
0
0
24 Mar 2025
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models
Yuhang Cao
Pan Zhang
Xiao-wen Dong
Dahua Lin
Jiaqi Wang
27
10
0
22 Feb 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
73
89
0
29 Jan 2024
Vision-Language Instruction Tuning: A Review and Analysis
Chen Li
Yixiao Ge
Dian Li
Ying Shan
VLM
22
12
0
14 Nov 2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition
Pan Zhang
Xiaoyi Wang
Bin Wang
Yuhang Cao
Chao Xu
...
Conghui He
Xingcheng Zhang
Yu Qiao
Da Lin
Jiaqi Wang
MLLM
27
222
0
26 Sep 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
845
0
17 Feb 2021
1