
MAC-SLU: Multi-Intent Automotive Cabin Spoken Language Understanding Benchmark

Yuezhang Peng
Chonghao Cai
Ziang Liu
Shuai Fan
Sheng Jiang
Hua Xu
Yuxin Liu
Qiguang Chen
Kele Xu
Yao Li
Sheng Wang
Libo Qin
Xie Chen
Main: 4 pages · 1 figure · 6 tables · Bibliography: 1 page
Abstract

Spoken Language Understanding (SLU), which aims to extract user semantics for executing downstream tasks, is a crucial component of task-oriented dialog systems. Existing SLU datasets generally lack sufficient diversity and complexity, and there is no unified benchmark for the latest Large Language Models (LLMs) and Large Audio Language Models (LALMs). This work introduces MAC-SLU, a novel Multi-Intent Automotive Cabin Spoken Language Understanding Dataset, which increases the difficulty of the SLU task by incorporating authentic and complex multi-intent data. Based on MAC-SLU, we conducted a comprehensive benchmark of leading open-source LLMs and LALMs, covering in-context learning and supervised fine-tuning (SFT) under both end-to-end (E2E) and pipeline paradigms. Our experiments show that while LLMs and LALMs can complete SLU tasks through in-context learning, their performance still lags significantly behind SFT. Meanwhile, E2E LALMs achieve performance comparable to pipeline approaches while avoiding error propagation from speech recognition. Code (this https URL_SLU) and datasets (this http URL_SLU) are released publicly.
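
To make the benchmarked paradigms concrete, the sketch below shows what a pipeline-style, in-context-learning evaluation of multi-intent SLU could look like: an ASR transcript is inserted into a few-shot prompt, and the model's generation is parsed into intent/slot structures. The prompt template, the intent and slot names (adjust_ac, play_music), and the call_llm stub are illustrative assumptions for this sketch, not the paper's released code or label set.

```python
# Minimal sketch of a pipeline-paradigm multi-intent SLU evaluation with in-context learning.
# All label names, exemplars, and the call_llm() stub are hypothetical placeholders.
import json
from typing import Dict, List

FEW_SHOT_EXEMPLARS = [
    {
        "utterance": "turn up the AC and play some jazz",  # hypothetical in-cabin query
        "parse": [
            {"intent": "adjust_ac", "slots": {"action": "increase"}},
            {"intent": "play_music", "slots": {"genre": "jazz"}},
        ],
    },
]

def build_prompt(transcript: str) -> str:
    """Assemble a few-shot (in-context learning) prompt from exemplars plus the test transcript."""
    lines = ["Extract every intent and its slots from the utterance. Answer in JSON."]
    for ex in FEW_SHOT_EXEMPLARS:
        lines.append(f"Utterance: {ex['utterance']}")
        lines.append(f"Parse: {json.dumps(ex['parse'])}")
    lines.append(f"Utterance: {transcript}")
    lines.append("Parse:")
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    """Placeholder for any open-source LLM backend; replace with a real inference call.
    Returns a canned answer here so the sketch runs end to end."""
    return json.dumps([{"intent": "adjust_ac", "slots": {"action": "increase"}}])

def parse_multi_intent(transcript: str) -> List[Dict]:
    """Pipeline step 2: prompt the LLM with the ASR transcript and parse its JSON output."""
    raw = call_llm(build_prompt(transcript))
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return []  # malformed generations are scored as errors

if __name__ == "__main__":
    # Pipeline step 1 (ASR) is stubbed out: the transcript below stands in for ASR output.
    transcript = "make it warmer and open the sunroof"
    print(parse_multi_intent(transcript))
```

By contrast, an E2E LALM replaces both stages with a single model that consumes the audio directly and emits the multi-intent parse, which is why it sidesteps the ASR error propagation noted in the abstract.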
