arXiv:2402.18679

Data Interpreter: An LLM Agent For Data Science

28 February 2024
Sirui Hong
Yizhang Lin
Bang Liu
Bangbang Liu
Binhao Wu
Danyang Li
Jiaqi Chen
Jiayi Zhang
Jinlin Wang
Li Zhang
Lingyao Zhang
Min Yang
Mingchen Zhuge
Taicheng Guo
Tuo Zhou
Wei Tao
Wenyi Wang
Xiangru Tang
Xiangtao Lu
Xiawu Zheng
Xinbing Liang
Yaying Fei
Yuheng Cheng
Zongze Xu
Chenglin Wu
    LLMAG
    AI4CE
Abstract

Large Language Model (LLM)-based agents have demonstrated remarkable effectiveness. However, their performance can be compromised in data science scenarios that require real-time data adjustment, expertise in optimization due to complex dependencies among tasks, and the ability to identify logical errors for precise reasoning. In this study, we introduce the Data Interpreter, a solution designed to solve data science problems with code. It emphasizes three pivotal techniques: 1) dynamic planning with hierarchical graph structures for real-time data adaptability; 2) dynamic tool integration to enhance code proficiency during execution, enriching the requisite expertise; 3) identification of logical inconsistencies in feedback, together with efficiency enhancement through experience recording. We evaluate the Data Interpreter on various data science and real-world tasks. Compared to open-source baselines, it demonstrated superior performance, with results on machine learning tasks increasing from 0.86 to 0.95. Additionally, it showed a 26% increase on the MATH dataset and a remarkable 112% improvement on open-ended tasks. The solution will be released at https://github.com/geekan/MetaGPT.
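
The first technique, planning over a task graph and adapting it when a step fails, can be illustrated with a minimal sketch. This is not the Data Interpreter implementation: the function names, the flat (non-hierarchical) graph, and the re-planning rule (mark the failed task and everything downstream of it for re-planning) are assumptions made purely for illustration, using only Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter


def execution_order(deps):
    """Return task names in an order that respects dependencies.

    `deps` maps each task name to the set of tasks it depends on.
    """
    return list(TopologicalSorter(deps).static_order())


def downstream(deps, task):
    """Return every task that depends on `task`, directly or transitively."""
    affected = set()
    frontier = [task]
    while frontier:
        current = frontier.pop()
        for name, parents in deps.items():
            if current in parents and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


def run(deps, actions, done=None):
    """Execute tasks in dependency order.

    Returns (completed, to_replan). On the first failure, execution stops and
    `to_replan` holds the failed task plus all tasks downstream of it
    (an assumed rule, not taken from the paper).
    """
    done = set() if done is None else set(done)
    for name in execution_order(deps):
        if name in done:
            continue  # already completed in an earlier pass
        try:
            actions[name]()
        except Exception:
            return done, {name} | downstream(deps, name)
        done.add(name)
    return done, set()
```

In this sketch, a caller would regenerate only the tasks returned in `to_replan` (for example by asking an LLM for new code), then call `run` again with the `done` set so that completed upstream work is not repeated.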
