2505.13032
Cited By
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
19 May 2025
Ziyang Ma
Yinghao Ma
Yanqiao Zhu
Chen Yang
Yi-Wen Chao
Ruiyang Xu
Wenxi Chen
Yuanzhe Chen
Zhuo Chen
Jian Cong
Kai Li
Keliang Li
Siyou Li
Guojian Pang
Xiquan Li
Zheng Lian
Yuzhe Liang
Minghao Liu
Zhikang Niu
Tianrui Wang
Yuping Wang
Yuping Wang
Y. Wu
Guanrou Yang
Jianwei Yu
Ruibin Yuan
Zhisheng Zheng
Ziya Zhou
Haina Zhu
Wei Xue
Emmanouil Benetos
Kai Yu
Xiaofeng Wang
Xie Chen
AuLLM
LRM
Papers citing "MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix"
45 / 45 papers shown
HPSU: A Benchmark for Human-Level Perception in Real-World Spoken Speech Understanding
Chen Li
Peiji Yang
Yicheng Zhong
Jianxing Yu
Zhisheng Wang
Zihao Gou
Wenqing Chen
Jian Yin
VLM
154
0
0
28 Nov 2025
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Priyanka Kargupta
Shuyue Stella Li
Haocheng Wang
Jinu Lee
Shan Chen
...
Thomas L. Griffiths
Max Kleiman-Weiner
Jiawei Han
Asli Celikyilmaz
Yulia Tsvetkov
LRM
207
2
0
20 Nov 2025
Revisiting Audio-language Pretraining for Learning General-purpose Audio Representation
Wei-Cheng Tseng
Xuanru Zhou
Mingyue Huo
Yiwen Shao
Hao Zhang
Dong Yu
CLIP
AI4TS
VLM
161
0
0
20 Nov 2025
SAR-LM: Symbolic Audio Reasoning with Large Language Models
Termeh Taheri
Yinghao Ma
Emmanouil Benetos
AuLLM
LRM
207
0
0
09 Nov 2025
Lost in Phonation: Voice Quality Variation as an Evaluation Dimension for Speech Foundation Models
Harm Lameris
Shree Harsha Bokkahalli Satish
Joakim Gustafson
Éva Székely
104
0
0
29 Oct 2025
STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence
Zihan Liu
Zhikang Niu
Qiuyang Xiao
Zhisheng Zheng
Ruoqi Yuan
...
Jianze Liang
Xie Chen
Leilei Sun
Dahua Lin
Jiaqi Wang
AuLLM
LRM
478
4
0
28 Oct 2025
Evaluating Multimodal Large Language Models on Core Music Perception Tasks
Brandon James Carone
Iran R. Roman
Pablo Ripollés
LRM
164
1
0
25 Oct 2025
Incentivizing Consistent, Effective and Scalable Reasoning Capability in Audio LLMs via Reasoning Process Rewards
Jiajun Fan
Roger Ren
Jingyuan Li
R. Pandey
Prashanth Gurunath Shivakumar
I. Bulyko
Ankur Gandhe
Ge Liu
Yile Gu
LRM
147
1
0
23 Oct 2025
The MUSE Benchmark: Probing Music Perception and Auditory Relational Reasoning in Audio LLMs
Brandon James Carone
Iran R. Roman
Pablo Ripollés
AuLLM
LRM
157
3
0
21 Oct 2025
VocalBench-DF: A Benchmark for Evaluating Speech LLM Robustness to Disfluency
Hongcheng Liu
Yixuan Hou
Heyang Liu
Yuhao Wang
Yanfeng Wang
Y Samuel Wang
AuLLM
200
1
0
17 Oct 2025
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Hanrong Ye
Chao-Han Huck Yang
Arushi Goel
Wei Huang
Ligeng Zhu
...
Andrew Tao
Song Han
Jan Kautz
Hongxu Yin
Pavlo Molchanov
183
3
0
17 Oct 2025
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception
Ziyang Ma
Ruiyang Xu
Zhenghao Xing
Yunfei Chu
Yuping Wang
...
Pheng-Ann Heng
Kai Yu
Junyang Lin
Eng Siong Chng
Xie Chen
VLM
90
2
0
14 Oct 2025
VCB Bench: An Evaluation Benchmark for Audio-Grounded Large Language Model Conversational Agents
Jiliang Hu
Wenfu Wang
Zuchao Li
Chenxing Li
Yiyang Zhao
Hanzhao Li
Liqiang Zhang
Meng Yu
Dong Yu
ELM
143
2
0
13 Oct 2025
UALM: Unified Audio Language Model for Understanding, Generation and Reasoning
Jinchuan Tian
Sang-gil Lee
Zhifeng Kong
Sreyan Ghosh
Arushi Goel
...
Shinji Watanabe
Mohammad Shoeybi
Bryan Catanzaro
Rafael Valle
Wei Ping
AuLLM
LRM
290
1
0
13 Oct 2025
AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs
Peize He
Zichen Wen
Yubo Wang
Y. Wang
Xiaoqian Liu
...
Zhifei Liu
Weijia Li
C. Wang
Conghui He
Linfeng Zhang
AuLLM
189
3
0
08 Oct 2025
AURA Score: A Metric For Holistic Audio Question Answering Evaluation
Satvik Dixit
Soham Deshmukh
Bhiksha Raj
112
0
0
06 Oct 2025
Robustness assessment of large audio language models in multiple-choice evaluation
F. López
Santosh Kesiraju
Jordi Luque
AuLLM
ELM
162
0
0
06 Oct 2025
AudioToolAgent: An Agentic Framework for Audio-Language Models
Gijs Wijngaard
Elia Formisano
M. Dumontier
LLMAG
AuLLM
129
0
0
03 Oct 2025
PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation
Yujia Xiao
Liumeng Xue
Lei He
Xinyi Chen
Aemon Yat Fei Chiu
...
Shaofei Zhang
Qiuqiang Kong
Xinfa Zhu
Wei Xue
Tan Lee
AuLLM
VGen
149
1
0
01 Oct 2025
Hearing the Order: Investigating Selection Bias in Large Audio-Language Models
Yu-Xiang Lin
Chen-An Li
Sheng-Lun Wei
Po-Chun Chen
Hsin-Hsi Chen
Hung-yi Lee
135
0
0
01 Oct 2025
When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models
Chen-An Li
Tzu-Han Lin
Hung-yi Lee
AuLLM
151
2
0
01 Oct 2025
When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs
Shree Harsha Bokkahalli Satish
G. Henter
Éva Székely
161
1
0
01 Oct 2025
Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap
Yueqian Lin
Zhengmian Hu
Qinsi Wang
Yudong Liu
H. Zhang
Jayakumar Subramanian
N. Vlassis
Hai Helen Li
Yiran Chen
LRM
113
1
0
30 Sep 2025
Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models
Zhichao Sheng
Shilin Zhou
Chen Gong
Zhenghua Li
AuLLM
LRM
315
0
0
26 Sep 2025
CMDAR: A Chinese Multi-scene Dynamic Audio Reasoning Benchmark with Diverse Challenges
Hui Li
Changhao Jiang
Hongyu Wang
Ming Zhang
Jiajun Sun
...
Baoyu Fan
Changzhi Sun
Tao Gui
Qi Zhang
Xuanjing Huang
AuLLM
ELM
129
0
0
26 Sep 2025
WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM
Changli Tang
Qinfan Xiao
Ke Mei
Tianyi Wang
Fengyun Rao
Chao Zhang
118
0
0
26 Sep 2025
Investigating Faithfulness in Large Audio Language Models
Lovenya Jain
Pooneh Mousavi
Mirco Ravanelli
Cem Subakan
165
0
0
26 Sep 2025
Pay More Attention To Audio: Mitigating Imbalance of Cross-Modal Attention in Large Audio Language Models
Junyu Wang
Ziyang Ma
Zhengding Luo
Tianrui Wang
Meng Ge
Xiaobao Wang
Longbiao Wang
AuLLM
94
0
0
23 Sep 2025
AudioGenie-Reasoner: A Training-Free Multi-Agent Framework for Coarse-to-Fine Audio Deep Reasoning
Yan Rong
Chenxing Li
Dong Yu
Li Liu
AuLLM
LRM
217
0
0
21 Sep 2025
Omni-CLST: Error-aware Curriculum Learning with guided Selective chain-of-Thought for audio question answering
Jinghua Zhao
Hang Su
Lichun Fan
Zhenbo Luo
Hui Wang
Haoqin Sun
Yong Qin
AuLLM
LRM
186
0
0
14 Sep 2025
WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning
Gagan Mundada
Yash Vishe
Amit Namburi
Xin Xu
Zachary Novack
Julian McAuley
Junda Wu
LRM
123
3
0
05 Sep 2025
Audio Flamingo Sound-CoT Technical Report: Improving Chain-of-Thought Reasoning in Sound Understanding
Zhifeng Kong
Arushi Goel
J. F. Santos
Sreyan Ghosh
Rafael Valle
Wei Ping
Bryan Catanzaro
ReLM
AuLLM
LRM
178
3
0
15 Aug 2025
Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning
Shu Wu
Chenxing Li
Wenfu Wang
Hao Zhang
H. Wang
Meng Yu
Dong Yu
AuLLM
KELM
LRM
250
8
0
11 Aug 2025
SpeechR: A Benchmark for Speech Reasoning in Large Audio-Language Models
Wanqi Yang
Yanda Li
Yunchao Wei
Meng Fang
Ling-Hao Chen
AuLLM
ReLM
LRM
149
6
0
04 Aug 2025
MECAT: A Multi-Experts Constructed Benchmark for Fine-Grained Audio Understanding Tasks
Yadong Niu
Tianzi Wang
Heinrich Dinkel
Xingwei Sun
Jiahao Zhou
Gang Li
Jizhong Liu
Xunying Liu
Junbo Zhang
Jian Luan
AuLLM
227
3
0
31 Jul 2025
From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data
Chun-Yi Kuan
Hung-yi Lee
AuLLM
302
0
0
26 May 2025
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
Andrew Rouditchenko
Saurabhchand Bhati
Edson Araujo
Samuel Thomas
Hilde Kuehne
Rogerio Feris
James R. Glass
AuLLM
VLM
334
23
0
14 May 2025
SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning
Cheng Wen
Tingwei Guo
Shuaijiang Zhao
Wei Zou
Xiangang Li
OffRL
AuLLM
LRM
364
18
0
22 Apr 2025
Qwen2.5-Omni Technical Report
Jin Xu
Zhifang Guo
Jinzheng He
Hangrui Hu
Ting He
...
K. Dang
Bin Zhang
Xinyu Wang
Yunfei Chu
Junyang Lin
VGen
AuLLM
1.2K
344
0
26 Mar 2025
Mellow: a small audio language model for reasoning
Soham Deshmukh
Satvik Dixit
Rita Singh
Bhiksha Raj
AuLLM
ReLM
LRM
290
17
0
11 Mar 2025
Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models
Zhifei Xie
Mingbao Lin
Ziqiang Liu
Pengcheng Wu
Shuicheng Yan
Chunyan Miao
AuLLM
OffRL
LRM
401
69
0
04 Mar 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
Qingbin Liu
Tao Zhang
Tao Zhang
Tian Jin
...
Jianhua Xu
Haoze Sun
Mingan Lin
Guosheng Dong
Xin Wu
AuLLM
328
66
0
28 Jan 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
OffRL
AI4TS
LRM
ReLM
VLM
1.2K
5,498
0
22 Jan 2025
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model
Tianhao Shen
Zhuo Chen
Longji Xu
Xiaofeng Wang
Xie Chen
AuLLM
LRM
276
43
0
13 Jan 2025
AudioBench: A Universal Benchmark for Audio Large Language Models
Bin Wang
Xunlong Zou
Geyu Lin
Siyang Song
Zhuohan Liu
Wenyu Zhang
Zhengyuan Liu
AiTi Aw
Nancy F. Chen
AuLLM
ELM
LM&MA
585
79
0
23 Jun 2024