AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

12 February 2024
Qian Yang
Jin Xu
Wenrui Liu
Yunfei Chu
Ziyue Jiang
Xiaohuan Zhou
Yichong Leng
Yuanjun Lv
Zhou Zhao
Chang Zhou
Jingren Zhou
    LM&MA
    AuLLM
    ALM

Papers citing "AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension"

48 / 48 papers shown
Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge
Chao-Han Huck Yang
Sreyan Ghosh
Qing Wang
Jaeyeon Kim
Hengyi Hong
...
Dinesh Manocha
Gunhee Kim
Jun Du
Rafael Valle
Bryan Catanzaro
16
0
0
12 May 2025
BLAB: Brutally Long Audio Bench
Orevaoghene Ahia
Martijn Bartelds
Kabir Ahuja
Hila Gonen
Valentin Hofmann
...
Noah Bennett
Shinji Watanabe
Noah A. Smith
Yulia Tsvetkov
Sachin Kumar
AuLLM
LM&MA
VLM
43
0
0
05 May 2025
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
Prabhat Pandey
R. Swaminathan
K V Vijay Girish
Arunasish Sen
Jian Xie
Grant P. Strimel
Andreas Schwarz
36
0
0
12 Apr 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
36
1
0
11 Apr 2025
FinAudio: A Benchmark for Audio Large Language Models in Financial Applications
Yupeng Cao
Haohang Li
Yangyang Yu
Shashidhar Reddy Javaji
Yueru He
...
Xiao-Yang Liu
K. P. Subbalakshmi
Meikang Qiu
Sophia Ananiadou
J. Nie
AuLLM
67
0
0
26 Mar 2025
sudo rm -rf agentic_security
Sejin Lee
Jian Kim
Haon Park
Ashkan Yousefpour
Sangyoon Yu
Min Song
AAML
78
0
0
26 Mar 2025
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
Siyin Wang
Wenyi Yu
Xianzhao Chen
Xiaohai Tian
J. Zhang
Lu Lu
Yu Tsao
Junichi Yamagishi
Y. Wang
Chao Zhang
AuLLM
74
0
0
26 Mar 2025
The Deployment of End-to-End Audio Language Models Should Take into Account the Principle of Least Privilege
Luxi He
Xiangyu Qi
Michel Liao
Inyoung Cheong
Prateek Mittal
Danqi Chen
Peter Henderson
AuLLM
54
0
0
21 Mar 2025
Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context
Junyi Ao
Dekun Chen
Xiaohai Tian
Wenjie Feng
J. Zhang
Lu Lu
Y. Wang
Haizhou Li
Zhizheng Wu
AuLLM
61
0
0
19 Mar 2025
Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation
Henglyu Liu
Andong Chen
Kehai Chen
X. Bai
M. Zhong
Yuan Qiu
Min Zhang
34
0
0
13 Mar 2025
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information
Feng Jiang
Zhiyu Lin
Fan Bu
Yuhao Du
Benyou Wang
H. Li
AuLLM
ELM
88
0
0
07 Mar 2025
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities
Sreyan Ghosh
Zhifeng Kong
Sonal Kumar
S. Sakshi
Jaehyeon Kim
Wei Ping
Rafael Valle
Dinesh Manocha
Bryan Catanzaro
MLLM
AuLLM
LRM
49
4
0
06 Mar 2025
Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models
Zhifei Xie
Mingbao Lin
Z. Liu
Pengcheng Wu
Shuicheng Yan
Chunyan Miao
AuLLM
OffRL
LRM
72
5
0
04 Mar 2025
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
Dingdong Wang
Jin Xu
Ruihang Chu
Zhifang Guo
X. Wang
Jincenzi Wu
Dongchao Yang
Shengpeng Ji
Junyang Lin
AuLLM
80
0
0
04 Mar 2025
Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics
Siddhant Arora
Zhiyun Lu
Chung-Cheng Chiu
Ruoming Pang
Shinji Watanabe
43
2
0
03 Mar 2025
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Abdelrahman Abouelenin
Atabak Ashfaq
Adam Atkinson
Hany Awadalla
Nguyen Bach
...
Ishmam Zabir
Yunan Zhang
Li Zhang
Y. Zhang
Xiren Zhou
MoE
SyDa
68
18
0
03 Mar 2025
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
Heeseung Kim
Che Hyun Lee
S. Park
Jiheum Yeom
Nohil Park
Sangwon Yu
Sungroh Yoon
59
0
0
27 Feb 2025
Nexus-O: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision
Che Liu
Yingji Zhang
D. Zhang
Weijie Zhang
Chenggong Gong
...
André Freitas
Qifan Wang
Z. Xu
Rongjuncheng Zhang
Yong Dai
AuLLM
61
0
0
26 Feb 2025
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders
Weiqiao Shan
Y. Li
Yuhao Zhang
Yingfeng Luo
Chen Xu
...
Y. Lu
M. Zhang
Hao Yang
Tong Xiao
Jingbo Zhu
AuLLM
57
0
0
24 Feb 2025
Audio-FLAN: A Preliminary Release
Liumeng Xue
Ziya Zhou
J. Pan
Z. Li
Shuai Fan
...
Haohe Liu
Emmanouil Benetos
Ge Zhang
Yike Guo
Wei Xue
MLLM
AuLLM
CLIP
VLM
57
1
0
23 Feb 2025
Chain-of-Description: What I can understand, I can put into words
J. Guo
Daimeng Wei
Z. Li
Hengchao Shang
Yuanchang Luo
Hao Yang
37
0
0
22 Feb 2025
Soundwave: Less is More for Speech-Text Alignment in LLMs
Y. Zhang
Zhiheng Liu
Fan Bu
Ruiyu Zhang
Benyou Wang
H. Li
AuLLM
SyDa
VLM
98
0
0
18 Feb 2025
DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
Ke-Han Lu
Zhehuai Chen
Szu-Wei Fu
Chao-Han Huck Yang
Jagadeesh Balam
Boris Ginsburg
Yu-Te Wang
Hung-yi Lee
AuLLM
SyDa
92
5
0
28 Jan 2025
Audio-Language Models for Audio-Centric Tasks: A survey
Yi Su
Jisheng Bai
Qisheng Xu
Kele Xu
Yong Dou
AuLLM
99
1
0
28 Jan 2025
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model
Z. Ma
Zhuo Chen
Y. Wang
Eng Siong Chng
Xie Chen
AuLLM
LRM
62
7
0
13 Jan 2025
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
Chun-Yi Kuan
Hung-yi Lee
AuLLM
LRM
56
1
0
03 Jan 2025
OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios
Xize Cheng
Dongjie Fu
Xiaoda Yang
Minghui Fang
Ruofan Hu
...
Rongjie Huang
Linjun Li
Yu Chen
Tao Jin
Zhou Zhao
41
1
0
03 Jan 2025
LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning
Yansheng Mao
Jiaqi Li
Fanxu Meng
Jing Xiong
Zilong Zheng
Muhan Zhang
LLMAG
RALM
90
1
0
18 Dec 2024
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
S. Sakshi
Utkarsh Tyagi
Sonal Kumar
Ashish Seth
Ramaneswaran Selvakumar
Oriol Nieto
R. Duraiswami
Sreyan Ghosh
Dinesh Manocha
AuLLM
ELM
65
19
0
24 Oct 2024
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning
Yifan Peng
Krishna C. Puvvada
Zhehuai Chen
Piotr Żelasko
He Huang
Kunal Dhawan
Ke Hu
Shinji Watanabe
Jagadeesh Balam
Boris Ginsburg
41
2
0
23 Oct 2024
VoiceBench: Benchmarking LLM-Based Voice Assistants
Yiming Chen
Xianghu Yue
Chen Zhang
Xiaoxue Gao
R. Tan
H. Li
ELM
AuLLM
26
17
0
22 Oct 2024
Roadmap towards Superhuman Speech Understanding using Large Language Models
Fan Bu
Yuhao Zhang
X. Wang
Benyou Wang
Q. Liu
H. Li
LM&MA
ELM
AuLLM
33
1
0
17 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
57
14
0
01 Oct 2024
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models
Yiming Chen
Xianghu Yue
Xiaoxue Gao
Chen Zhang
L. F. D’Haro
R. Tan
Haizhou Li
AuLLM
22
0
0
27 Sep 2024
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation
Siyin Wang
Wenyi Yu
Yudong Yang
Changli Tang
Yixuan Li
...
Jun Zhang
Guangzhi Sun
Lu Lu
Yuxuan Wang
Chao Zhang
AuLLM
LM&MA
65
5
0
25 Sep 2024
OmniBench: Towards The Future of Universal Omni-Language Models
Yizhi Li
Ge Zhang
Yinghao Ma
Ruibin Yuan
Kang Zhu
...
Zhaoxiang Zhang
Zachary Liu
Emmanouil Benetos
Wenhao Huang
Chenghua Lin
LRM
32
11
0
23 Sep 2024
What Are They Doing? Joint Audio-Speech Co-Reasoning
Yingzhi Wang
Pooneh Mousavi
Artem Ploujnikov
Mirco Ravanelli
AuLLM
41
0
0
22 Sep 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
34
8
0
21 Sep 2024
Salmon: A Suite for Acoustic Language Model Evaluation
Gallil Maimon
Amit Roth
Yossi Adi
ELM
AuLLM
49
5
0
11 Sep 2024
A Survey on Evaluation of Multimodal Large Language Models
Jiaxing Huang
Jingyi Zhang
LM&MA
ELM
LRM
43
20
0
28 Aug 2024
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
Benno Weck
Ilaria Manco
Emmanouil Benetos
Elio Quinton
George Fazekas
Dmitry Bogdanov
AuLLM
41
1
0
02 Aug 2024
Qwen2-Audio Technical Report
Yunfei Chu
Jin Xu
Qian Yang
Haojie Wei
Xipin Wei
...
Yuanjun Lv
Jinzheng He
Junyang Lin
Chang Zhou
Jingren Zhou
AuLLM
VLM
26
100
0
15 Jul 2024
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
Runqi Qiao
Qiuna Tan
Guanting Dong
Minhui Wu
Chong Sun
...
Yida Xu
Muxi Diao
Zhimin Bao
Chen Li
Honggang Zhang
VLM
LRM
39
30
0
01 Jul 2024
Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
Yuwei Zhang
Tong Xia
Jing Han
Yu Wu
Georgios Rizos
Yang Liu
Mohammed Mosuily
Jagmohan Chauhan
Cecilia Mascolo
AI4CE
25
6
0
23 Jun 2024
AudioBench: A Universal Benchmark for Audio Large Language Models
Bin Wang
Xunlong Zou
Geyu Lin
S.
Zhuohan Liu
Wenyu Zhang
Zhengyuan Liu
AiTi Aw
Nancy F. Chen
AuLLM
ELM
LM&MA
85
17
0
23 Jun 2024
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
Chun-Yi Kuan
Wei-Ping Huang
Hung-yi Lee
AuLLM
19
1
0
12 Jun 2024
SLM: Bridge the thin gap between speech and text foundation models
Mingqiu Wang
Wei Han
Izhak Shafran
Zelin Wu
Chung-Cheng Chiu
...
Zhong Meng
Golan Pundak
Nikhil Siddhartha
J. Schalkwyk
Yonghui Wu
AuLLM
37
56
0
30 Sep 2023
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
75
281
0
25 May 2022