Slamming: Training a Speech Language Model on One GPU in a Day | Annual Meeting of the Association for Computational Linguistics (ACL), 2025
SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings | Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Soundwave: Less is More for Speech-Text Alignment in LLMs | Annual Meeting of the Association for Computational Linguistics (ACL), 2025
DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words | Neural Information Processing Systems (NeurIPS), 2024
Audio-Language Datasets of Scenes and Events: A Survey | IEEE Access, 2024
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison | North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Foundation Models for Rapid Autonomy Validation | IEEE International Conference on Robotics and Automation (ICRA), 2024
Self-Powered LLM Modality Expansion for Large Speech-Text Models | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning | Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 | Siyin Wang, Wenyi Yu, Yudong Yang, Changli Tang, Yixuan Li, ..., Jun Zhang, Guangzhi Sun, Lu Lu, Yuxuan Wang, Chao Zhang
Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
What Are They Doing? Joint Audio-Speech Co-Reasoning | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Salmon: A Suite for Acoustic Language Model Evaluation | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
MoWE-Audio: Multitask AudioLLMs with Mixture of Weak Encoders | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
LLaMA-Omni: Seamless Speech Interaction with Large Language Models | International Conference on Learning Representations (ICLR), 2024