
A Survey of Multimodal Large Language Model from A Data-centric Perspective

26 May 2024
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
Shiyu Li
Ling Yang
Bozhou Li
Yifan Wang
Tengjiao Wang
Ping Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang

Papers citing "A Survey of Multimodal Large Language Model from A Data-centric Perspective"

41 / 41 papers shown
VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction
Shaobo Wang
Tianle Niu
Runkang Yang
Deshan Liu
Xu He
Zichen Wen
Conghui He
Xuming Hu
Linfeng Zhang
VGen
216
1
0
24 Nov 2025
From Perception to Reasoning: Deep Thinking Empowers Multimodal Large Language Models
Wenxin Zhu
Andong Chen
Yuchen Song
Kehai Chen
Conghui Zhu
Ziyan Chen
Tiejun Zhao
LRM
490
0
0
17 Nov 2025
An item is worth one token in Multimodal Large Language Models-based Sequential Recommendation
Qiyong Zhong
Jiajie Su
Ming Yang
Yunshan Ma
Xiaolin Zheng
Chaochao Chen
263
0
0
08 Nov 2025
Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges
Shrestha Datta
Shahriar Kabir Nahin
Anshuman Chhabra
P. Mohapatra
LLMAG, LM&Ro
392
7
0
27 Oct 2025
Res-Bench: Benchmarking the Robustness of Multimodal Large Language Models to Dynamic Resolution Input
Chenxu Li
Zhicai Wang
Yuan Sheng
Xingyu Zhu
Y. Hao
Xiang Wang
AAML
245
0
0
19 Oct 2025
Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data
Jiancheng Zhang
Yinglun Zhu
217
1
0
25 Sep 2025
TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training
Yifan Wang
Binbin Liu
Fengze Liu
Yuanfan Guo
Jiyao Deng
Xuecheng Wu
Weidong Zhou
Xiaohuan Zhou
Taifeng Wang
153
0
0
25 Aug 2025
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis
Zijian Wu
Jinjie Ni
Xiangyan Liu
Zichen Liu
Hang Yan
Michael Shieh
OffRL, ReLM, LRM
214
5
0
02 Jun 2025
ID-Align: RoPE-Conscious Position Remapping for Dynamic High-Resolution Adaptation in Vision-Language Models
Bozhou Li
Wentao Zhang
VLM
197
1
0
27 May 2025
Flex-Judge: Text-Only Reasoning Unleashes Zero-Shot Multimodal Evaluators
Jongwoo Ko
S. Kim
Sungwoo Cho
Se-Young Yun
ELM, LRM
595
0
0
24 May 2025
Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning
Cheng Peng
Kai Zhang
Mengxian Lyu
Hongfang Liu
Lichao Sun
Yonghui Wu
LM&MA, MedIm, VLM
500
3
0
23 May 2025
Multimodal Agricultural Agent Architecture (MA3): A New Paradigm for Intelligent Agricultural Decision-Making
Zhuoning Xu
Jian Xu
Hao Fei
Peijie Wang
Chao Deng
Cheng-Lin Liu
302
1
0
07 Apr 2025
Unicorn: Text-Only Data Synthesis for Vision Language Model Training
Xiaomin Yu
Pengxiang Ding
Donglin Wang
Siteng Huang
Songyang Gao
Chengwei Qin
Kejian Wu
Zhaoxin Fan
Ziyue Qiao
Donglin Wang
MLLM, SyDa
296
3
0
28 Mar 2025
Do Multimodal Large Language Models Understand Welding?
Information Fusion (Inf. Fusion), 2025
Grigorii Khvatskii
Yong Suk Lee
Corey Angst
Maria Gibbs
Robert Landers
Nitesh Chawla
AI4CE
263
3
0
18 Mar 2025
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search
Yiming Jia
Junlong Li
Xiang Yue
Bo Li
Ping Nie
Dayou Du
Lei Ma
LRM
510
20
0
13 Mar 2025
MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts
Computer Vision and Pattern Recognition (CVPR), 2025
Peijie Wang
Zhong-Zhi Li
Fei Yin
Xin Yang
Dekang Ran
Cheng-Lin Liu
LRM
608
33
0
28 Feb 2025
MathClean: A Benchmark for Synthetic Mathematical Data Cleaning
Hao Liang
Meiyi Qiang
Yongbin Li
Zefeng He
Yongzhen Guo
Z. Zhu
Wentao Zhang
Tengjiao Wang
250
4
0
26 Feb 2025
MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models
Zhen Zhang
Yue Yang
Kai Zhen
Nathan Susanj
Athanasios Mouchtaris
Siegfried Kunzmann
Zheng Zhang
458
3
0
17 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Daniel Schwalbe-Koda
B. Selman
Qingsong Wen
LRM
583
32
0
05 Feb 2025
A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun
Wenbin An
Feng Tian
Fang Nan
Qidong Liu
Jing Liu
N. Shah
Ping Chen
411
21
0
18 Dec 2024
MC-LLaVA: Multi-Concept Personalized Vision-Language Model
Ruichuan An
Sihan Yang
Ming Lu
Kai Zeng
Yulin Luo
...
Qi She
Shanghang Zhang
Feiyu Xiong
Shanghang Zhang
Wentao Zhang
690
42
0
18 Nov 2024
EVQAScore: A Fine-grained Metric for Video Question Answering Data Quality Evaluation
Hao Liang
Zirong Chen
Feiyu Xiong
Wentao Zhang
329
0
0
11 Nov 2024
Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models
International Conference on Learning Representations (ICLR), 2024
Qingni Wang
Tiantian Geng
Zhiyuan Wang
Teng Wang
Bo Fu
Feng Zheng
516
14
0
10 Oct 2024
Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models
Bozhou Li
Hao Liang
Yang Li
Fangcheng Fu
Hongzhi Yin
Conghui He
Wentao Zhang
KELM, CLL
228
2
0
08 Oct 2024
Recent Advances in Speech Language Models: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
594
68
0
01 Oct 2024
Data Proportion Detection for Optimized Data Management for Large Language Models
Hao Liang
Keshi Zhao
Yajie Yang
Bin Cui
Guosheng Dong
Wentao Zhang
188
0
0
26 Sep 2024
Surveying the MLLM Landscape: A Meta-Review of Current Surveys
Ming Li
Keyu Chen
Ziqian Bi
Ming Liu
Xinyuan Song
...
Jinlang Wang
Sen Zhang
Xuanhe Pan
Jiawei Xu
Pohsun Feng
OffRL
296
11
0
17 Sep 2024
Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models
De Computis (DC), 2024
Fatma Yasmine Loumachi
Mohamed Chahine Ghanem
AI4CE
478
1
0
04 Sep 2024
MathScape: Benchmarking Multimodal Large Language Models in Real-World Mathematical Contexts
Hao Liang
Linzhuang Sun
Tianpeng Li
Zhiyu Wu
Meiyi Qiang
Mingan Lin
Tianpeng Li
Chenzheng Zhu
Xiaoqin Huang
Yicong Chen
358
7
0
14 Aug 2024
Are Bigger Encoders Always Better in Vision Large Models?
Bozhou Li
Hao Liang
Zimo Meng
Wentao Zhang
VLM
230
5
0
01 Aug 2024
Synth-Empathy: Towards High-Quality Synthetic Empathy Data
Hao Liang
Linzhuang Sun
Jingxuan Wei
Xijie Huang
Linkun Sun
Bihui Yu
Conghui He
Wentao Zhang
SyDa
279
8
0
31 Jul 2024
SynthVLM: Towards High-Quality and Efficient Synthesis of Image-Caption Datasets for Vision-Language Models
Zheng Liu
Hao Liang
Xijie Huang
Wentao Xiong
Qinhan Yu
Linzhuang Sun
Chong Chen
Huang Leng
SyDa
518
1
0
30 Jul 2024
Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development
Daoyuan Chen
Haibin Wang
Yilun Huang
Ce Ge
Yaliang Li
Bolin Ding
Jingren Zhou
VLM, SyDa
291
1
0
16 Jul 2024
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
Zhen Qin
Daoyuan Chen
Wenhao Zhang
Liuyi Yao
Yilun Huang
Bolin Ding
Yaliang Li
Shuiguang Deng
364
15
0
11 Jul 2024
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Miao Zheng
H. Liang
Fan Yang
Haoze Sun
Tianpeng Li
...
Kun Fang
Weipeng Chen
Bin Cui
Wentao Zhang
Guosheng Dong
RALM
291
9
0
08 Jul 2024
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen
Yichao Du
Zichen Wen
Yiyang Zhou
Chenhang Cui
...
Jiawei Zhou
Zhuokai Zhao
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
EGVM, MLLM
360
60
0
05 Jul 2024
KeyVideoLLM: Towards Large-scale Video Keyframe Selection
Hao Liang
Jiapeng Li
Tianyi Bai
Xijie Huang
Linzhuang Sun
Zhengren Wang
Conghui He
Bin Cui
Chong Chen
Wentao Zhang
VGen
342
31
0
03 Jul 2024
Efficient-Empathy: Towards Efficient and Effective Selection of Empathy Data
Linzhuang Sun
Hao Liang
Jingxuan Wei
Linkun Sun
Bihui Yu
Bin Cui
Wentao Zhang
196
2
0
02 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min Lin
MoE
418
108
1
01 Jul 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
432
120
0
25 Mar 2024
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
596
257
0
12 Jun 2023