Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

15 July 2024

Hongcheng Gao

Hongshen Xu

Kai Yu

Tao Yu

Papers citing "Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?"

22 / 22 papers shown

Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents

...

161

20 Oct 2025

LEGOMem: Modular Procedural Memory for Multi-agent LLM Systems for Workflow Automation

110

06 Oct 2025

GuirlVG: Incentivize GUI Visual Grounding via Empirical Exploration on Reinforcement Learning

152

06 Aug 2025

Large Language Model-based Data Science Agent: A Survey

789

02 Aug 2025

DSBC : Data Science task Benchmarking with Context engineering

Ram Mohan Rao Kadiyala

216

31 Jul 2025

OS-MAP: How Far Can Computer-Using Agents Go in Breadth and Depth?

...

180

25 Jul 2025

Augmented Vision-Language Models: A Systematic Review

196

24 Jul 2025

Pixels, Patterns, but No Poetry: To See The World like Humans

...

160

21 Jul 2025

CRABS: A syntactic-semantic pincer strategy for bounding LLM interpretation of Python notebooks

Meng Li

Timothy M. McPhillips

Dingmin Wang

Shin-Rong Tsai

Bertram Ludäscher

132

15 Jul 2025

Multi-level Value Alignment in Agentic AI Systems: Survey and Perspectives

...

440

11 Jun 2025

What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities

...

226

10 Jun 2025

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

...

511

26 May 2025

Visual Test-time Scaling for GUI Agent Grounding

379

01 May 2025

ELT-Bench: An End-to-End Benchmark for Evaluating AI Agents on ELT Pipelines

323

07 Apr 2025

LLM Agents for Education: Advances and Applications

...

336

14 Mar 2025

From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

...

509

03 Mar 2025

CoddLLM: Empowering Large Language Models for Data Analytics

Asterios Katsifodimos

900

01 Feb 2025

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

International Conference on Learning Representations (ICLR), 2024

...

388

128

12 Nov 2024

COMMA: A Communicative Multimodal Multi-Agent Benchmark

538

10 Oct 2024

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

International Conference on Learning Representations (ICLR), 2024

641

238

07 Oct 2024

Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts

International Conference on Computational Linguistics (COLING), 2024

456

17 Sep 2024

Text2SQL is Not Enough: Unifying AI and Databases with TAG

195

27 Aug 2024