MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix


19 May 2025
Ziyang Ma
Yinghao Ma
Yanqiao Zhu
Chen Yang
Yi-Wen Chao
Ruiyang Xu
Wenxi Chen
Yuanzhe Chen
Zhuo Chen
Jian Cong
Kai Li
Keliang Li
Siyou Li
Guojian Pang
Xiquan Li
Zheng Lian
Yuzhe Liang
Minghao Liu
Zhikang Niu
Tianrui Wang
Yuping Wang
Yuping Wang
Y. Wu
Guanrou Yang
Jianwei Yu
Ruibin Yuan
Zhisheng Zheng
Ziya Zhou
Haina Zhu
Wei Xue
Emmanouil Benetos
Kai Yu
Xiaofeng Wang
Xie Chen
    AuLLM, LRM
arXiv:2505.13032 · HuggingFace (2 upvotes)

Papers citing "MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix"

45 / 45 papers shown
HPSU: A Benchmark for Human-Level Perception in Real-World Spoken Speech Understanding
Chen Li
Peiji Yang
Yicheng Zhong
Jianxing Yu
Zhisheng Wang
Zihao Gou
Wenqing Chen
Jian Yin
VLM
154
0
0
28 Nov 2025
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Priyanka Kargupta
Shuyue Stella Li
Haocheng Wang
Jinu Lee
Shan Chen
...
Thomas L. Griffiths
Max Kleiman-Weiner
Jiawei Han
Asli Celikyilmaz
Yulia Tsvetkov
LRM
207
2
0
20 Nov 2025
Revisiting Audio-language Pretraining for Learning General-purpose Audio Representation
Wei-Cheng Tseng
Xuanru Zhou
Mingyue Huo
Yiwen Shao
Hao Zhang
Dong Yu
CLIP, AI4TS, VLM
161
0
0
20 Nov 2025
SAR-LM: Symbolic Audio Reasoning with Large Language Models
Termeh Taheri
Yinghao Ma
Emmanouil Benetos
AuLLM, LRM
207
0
0
09 Nov 2025
Lost in Phonation: Voice Quality Variation as an Evaluation Dimension for Speech Foundation Models
Harm Lameris
Shree Harsha Bokkahalli Satish
Joakim Gustafson
Éva Székely
104
0
0
29 Oct 2025
STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence
Zihan Liu
Zhikang Niu
Qiuyang Xiao
Zhisheng Zheng
Ruoqi Yuan
...
Jianze Liang
Xie Chen
Leilei Sun
Dahua Lin
Jiaqi Wang
AuLLM, LRM
478
4
0
28 Oct 2025
Evaluating Multimodal Large Language Models on Core Music Perception Tasks
Brandon James Carone
Iran R. Roman
Pablo Ripollés
LRM
164
1
0
25 Oct 2025
Incentivizing Consistent, Effective and Scalable Reasoning Capability in Audio LLMs via Reasoning Process Rewards
Jiajun Fan
Roger Ren
Jingyuan Li
R. Pandey
Prashanth Gurunath Shivakumar
I. Bulyko
Ankur Gandhe
Ge Liu
Yile Gu
LRM
147
1
0
23 Oct 2025
The MUSE Benchmark: Probing Music Perception and Auditory Relational Reasoning in Audio LLMS
Brandon James Carone
Iran R. Roman
Pablo Ripollés
AuLLM, LRM
157
3
0
21 Oct 2025
VocalBench-DF: A Benchmark for Evaluating Speech LLM Robustness to Disfluency
Hongcheng Liu
Yixuan Hou
Heyang Liu
Yuhao Wang
Yanfeng Wang
Y Samuel Wang
AuLLM
200
1
0
17 Oct 2025
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Hanrong Ye
Chao-Han Huck Yang
Arushi Goel
Wei Huang
Ligeng Zhu
...
Andrew Tao
Song Han
Jan Kautz
Hongxu Yin
Pavlo Molchanov
183
3
0
17 Oct 2025
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception
Ziyang Ma
Ruiyang Xu
Zhenghao Xing
Yunfei Chu
Yuping Wang
...
Pheng-Ann Heng
Kai Yu
Junyang Lin
Eng Siong Chng
Xie Chen
VLM
90
2
0
14 Oct 2025
VCB Bench: An Evaluation Benchmark for Audio-Grounded Large Language Model Conversational Agents
Jiliang Hu
Wenfu Wang
Zuchao Li
Chenxing Li
Yiyang Zhao
Hanzhao Li
Liqiang Zhang
Meng Yu
Dong Yu
ELM
143
2
0
13 Oct 2025
UALM: Unified Audio Language Model for Understanding, Generation and Reasoning
Jinchuan Tian
Sang-gil Lee
Zhifeng Kong
Sreyan Ghosh
Arushi Goel
...
Shinji Watanabe
Mohammad Shoeybi
Bryan Catanzaro
Rafael Valle
Wei Ping
AuLLM, LRM
290
1
0
13 Oct 2025
AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs
Peize He
Zichen Wen
Yubo Wang
Y. Wang
Xiaoqian Liu
...
Zhifei Liu
Weijia Li
C. Wang
Conghui He
Linfeng Zhang
AuLLM
189
3
0
08 Oct 2025
AURA Score: A Metric For Holistic Audio Question Answering Evaluation
Satvik Dixit
Soham Deshmukh
Bhiksha Raj
112
0
0
06 Oct 2025
Robustness assessment of large audio language models in multiple-choice evaluation
F. López
Santosh Kesiraju
Jordi Luque
AuLLM, ELM
162
0
0
06 Oct 2025
AudioToolAgent: An Agentic Framework for Audio-Language Models
Gijs Wijngaard
Elia Formisano
M. Dumontier
LLMAG, AuLLM
129
0
0
03 Oct 2025
PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation
Yujia Xiao
Liumeng Xue
Lei He
Xinyi Chen
Aemon Yat Fei Chiu
...
Shaofei Zhang
Qiuqiang Kong
Xinfa Zhu
Wei Xue
Tan Lee
AuLLM, VGen
149
1
0
01 Oct 2025
Hearing the Order: Investigating Selection Bias in Large Audio-Language Models
Yu-Xiang Lin
Chen-An Li
Sheng-Lun Wei
Po-Chun Chen
Hsin-Hsi Chen
Hung-yi Lee
135
0
0
01 Oct 2025
When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models
Chen-An Li
Tzu-Han Lin
Hung-yi Lee
AuLLM
151
2
0
01 Oct 2025
When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs
Shree Harsha Bokkahalli Satish
G. Henter
Éva Székely
161
1
0
01 Oct 2025
Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap
Yueqian Lin
Zhengmian Hu
Qinsi Wang
Yudong Liu
H. Zhang
Jayakumar Subramanian
N. Vlassis
Hai Helen Li
Yiran Chen
LRM
113
1
0
30 Sep 2025
Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models
Zhichao Sheng
Shilin Zhou
Chen Gong
Zhenghua Li
AuLLM, LRM
315
0
0
26 Sep 2025
CMDAR: A Chinese Multi-scene Dynamic Audio Reasoning Benchmark with Diverse Challenges
Hui Li
Changhao Jiang
Hongyu Wang
Ming Zhang
Jiajun Sun
...
Baoyu Fan
Changzhi Sun
Tao Gui
Qi Zhang
Xuanjing Huang
AuLLM, ELM
129
0
0
26 Sep 2025
WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM
Changli Tang
Qinfan Xiao
Ke Mei
Tianyi Wang
Fengyun Rao
Chao Zhang
118
0
0
26 Sep 2025
Investigating Faithfulness in Large Audio Language Models
Lovenya Jain
Pooneh Mousavi
Mirco Ravanelli
Cem Subakan
165
0
0
26 Sep 2025
Pay More Attention To Audio: Mitigating Imbalance of Cross-Modal Attention in Large Audio Language Models
Junyu Wang
Ziyang Ma
Zhengding Luo
Tianrui Wang
Meng Ge
Xiaobao Wang
Longbiao Wang
AuLLM
94
0
0
23 Sep 2025
AudioGenie-Reasoner: A Training-Free Multi-Agent Framework for Coarse-to-Fine Audio Deep Reasoning
Yan Rong
Chenxing Li
Dong Yu
Li Liu
AuLLM, LRM
217
0
0
21 Sep 2025
Omni-CLST: Error-aware Curriculum Learning with guided Selective chain-of-Thought for audio question answering
Jinghua Zhao
Hang Su
Lichun Fan
Zhenbo Luo
Hui Wang
Haoqin Sun
Yong Qin
AuLLM, LRM
186
0
0
14 Sep 2025
WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning
Gagan Mundada
Yash Vishe
Amit Namburi
Xin Xu
Zachary Novack
Julian McAuley
Junda Wu
LRM
123
3
0
05 Sep 2025
Audio Flamingo Sound-CoT Technical Report: Improving Chain-of-Thought Reasoning in Sound Understanding
Zhifeng Kong
Arushi Goel
J. F. Santos
Sreyan Ghosh
Rafael Valle
Wei Ping
Bryan Catanzaro
ReLM, AuLLM, LRM
178
3
0
15 Aug 2025
Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning
Shu Wu
Chenxing Li
Wenfu Wang
Hao Zhang
H. Wang
Meng Yu
Dong Yu
AuLLM, KELM, LRM
250
8
0
11 Aug 2025
SpeechR: A Benchmark for Speech Reasoning in Large Audio-Language Models
Wanqi Yang
Yanda Li
Yunchao Wei
Meng Fang
Ling-Hao Chen
AuLLM, ReLM, LRM
149
6
0
04 Aug 2025
MECAT: A Multi-Experts Constructed Benchmark for Fine-Grained Audio Understanding Tasks
Yadong Niu
Tianzi Wang
Heinrich Dinkel
Xingwei Sun
Jiahao Zhou
Gang Li
Jizhong Liu
Xunying Liu
Junbo Zhang
Jian Luan
AuLLM
227
3
0
31 Jul 2025
From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data
Chun-Yi Kuan
Hung-yi Lee
AuLLM
302
0
0
26 May 2025
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
Andrew Rouditchenko
Saurabhchand Bhati
Edson Araujo
Samuel Thomas
Hilde Kuehne
Rogerio Feris
James R. Glass
AuLLM, VLM
334
23
0
14 May 2025
SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning
Cheng Wen
Tingwei Guo
Shuaijiang Zhao
Wei Zou
Xiangang Li
OffRL, AuLLM, LRM
364
18
0
22 Apr 2025
Qwen2.5-Omni Technical Report
Jin Xu
Zhifang Guo
Jinzheng He
Hangrui Hu
Ting He
...
K. Dang
Bin Zhang
Xinyu Wang
Yunfei Chu
Junyang Lin
VGen, AuLLM
1.2K
344
0
26 Mar 2025
Mellow: a small audio language model for reasoning
Soham Deshmukh
Satvik Dixit
Rita Singh
Bhiksha Raj
AuLLM, ReLM, LRM
290
17
0
11 Mar 2025
Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models
Zhifei Xie
Mingbao Lin
Ziqiang Liu
Pengcheng Wu
Shuicheng Yan
Chunyan Miao
AuLLM, OffRL, LRM
401
69
0
04 Mar 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
Qingbin Liu
Tao Zhang
Tao Zhang
Tian Jin
...
Jianhua Xu
Haoze Sun
Mingan Lin
Guosheng Dong
Xin Wu
AuLLM
328
66
0
28 Jan 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
OffRL, AI4TS, LRM, ReLM, VLM
1.2K
5,498
0
22 Jan 2025
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model
Tianhao Shen
Zhuo Chen
Longji Xu
Xiaofeng Wang
Xie Chen
AuLLM, LRM
276
43
0
13 Jan 2025
AudioBench: A Universal Benchmark for Audio Large Language Models
Bin Wang
Xunlong Zou
Geyu Lin
Siyang Song
Zhuohan Liu
Wenyu Zhang
Zhengyuan Liu
AiTi Aw
Nancy F. Chen
AuLLM, ELM, LM&MA
585
79
0
23 Jun 2024