Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

12 June 2024

Wei-Ping Huang

Hung-yi Lee

Papers citing "Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models"

Title | Venue | Authors | Tags | Metrics | Date
Adaptive vector steering: A training-free, layer-wise intervention for hallucination mitigation in large audio and multimodal models | - | Tsung-En Lin, Kuan-Yi Lee, Hung-yi Lee | LLMSV | 152 0 0 | 14 Oct 2025
Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction | - | Kaisi Guan, Xihua Wang, Zhengfeng Lai, Xin Cheng, Peng Zhang, Xiaojiang Liu, Ruihua Song, Meng Cao | DiffM | 220 3 0 | 03 Oct 2025
Evaluating Hallucinations in Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions | - | Hansol Park, Hoseong Ahn, Junwon Moon, Yejin Lee, Kyuhong Shim | HILM | 69 0 0 | 19 Sep 2025
Reducing Object Hallucination in Large Audio-Language Models via Audio-Aware Decoding | - | Tzu-wen Hsu, Ke-Han Lu, Cheng-Han Chiang, Hung-yi Lee | AuLLM | 231 3 0 | 08 Jun 2025
From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data | - | Chun-Yi Kuan, Hung-yi Lee | AuLLM | 258 0 0 | 26 May 2025
CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 | Yongheng Zhang, Xu Liu, Ruoxi Zhou, Qiguang Chen, Hao Fei, Wenpeng Lu, L. Qin | HILM, LRM | 153 6 0 | 25 May 2025
Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples | - | Chun-Yi Kuan, Hung-yi Lee | - | 210 3 0 | 20 May 2025
SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information | - | Chih-Kai Yang, Neo Ho, Yen-Ting Piao, Hung-yi Lee | AuLLM, LRM | 450 18 0 | 19 May 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey | - | Siddhant Arora, Kai-Wei Chang, Chung-Ming Chien, Yifan Peng, Haibin Wu, Yossi Adi, Emmanuel Dupoux, Hung-yi Lee, Karen Livescu, Shinji Watanabe | - | 281 52 0 | 11 Apr 2025
Aligned Better, Listen Better for Audio-Visual Large Language Models | International Conference on Learning Representations (ICLR), 2025 | Yuxin Guo, Shuailei Ma, Shijie Ma, Xiaoyi Bao, Chen-Wei Xie, Kecheng Zheng, Tingyu Weng, Siyang Sun, Yun Zheng, Wei Zou | MLLM, AuLLM | 271 6 0 | 02 Apr 2025
Audio-Language Datasets of Scenes and Events: A Survey | IEEE Access, 2024 | Gijs Wijngaard, Elia Formisano, Michele Esposito, M. Dumontier | - | 374 6 0 | 10 Jan 2025
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 | Chun-Yi Kuan, Hung-yi Lee | AuLLM, LRM | 270 16 0 | 03 Jan 2025
Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation | - | Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, Ke-Han Lu, Hung-yi Lee | - | 239 17 0 | 13 Jul 2024