
Towards Artificial Intelligence Research Assistant for Expert-Involved Learning

Abstract

Large Language Models (LLMs) and Large Multi-Modal Models (LMMs) have emerged as transformative tools in scientific research, yet their reliability and specific contributions to biomedical applications remain insufficiently characterized. In this study, we present \textbf{AR}tificial \textbf{I}ntelligence research assistant for \textbf{E}xpert-involved \textbf{L}earning (ARIEL), a multimodal dataset designed to benchmark and enhance two critical capabilities of LLMs and LMMs in biomedical research: summarizing extensive scientific texts and interpreting complex biomedical figures. To facilitate rigorous assessment, we create two open-source datasets comprising biomedical articles and figures paired with curated questions. We systematically benchmark both open- and closed-source foundation models, incorporating human evaluations conducted by doctoral-level experts. Furthermore, we improve model performance through targeted prompt engineering and fine-tuning strategies for summarizing research papers, and apply test-time computational scaling to enhance the reasoning capabilities of LMMs, achieving accuracy superior to human-expert corrections. We also explore the potential of using LMM agents to generate scientific hypotheses from diverse multimodal inputs. Overall, our results delineate clear strengths and highlight significant limitations of current foundation models, providing actionable insights and guiding future advancements in deploying large-scale language and multi-modal models within biomedical research.

@article{liu2025_2505.04638,
  title={Towards Artificial Intelligence Research Assistant for Expert-Involved Learning},
  author={Tianyu Liu and Simeng Han and Xiao Luo and Hanchen Wang and Pan Lu and Biqing Zhu and Yuge Wang and Keyi Li and Jiapeng Chen and Rihao Qu and Yufeng Liu and Xinyue Cui and Aviv Yaish and Yuhang Chen and Minsheng Hao and Chuhan Li and Kexing Li and Arman Cohan and Hua Xu and Mark Gerstein and James Zou and Hongyu Zhao},
  journal={arXiv preprint arXiv:2505.04638},
  year={2025}
}