Title |
---|
![]() Qwen2-Audio Technical Report Yunfei Chu Jin Xu Qian Yang Haojie Wei Xipin Wei ...Yuanjun Lv Jinzheng He Junyang Lin Chang Zhou Jingren Zhou |
![]() Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based
Speech Recognition Ye Bai Jingping Chen Jitong Chen Wei Chen Zhuo Chen ...Wanyi Zhang Yang Zhang Yawei Zhang Yijie Zheng Ming Zou |
![]() FunAudioLLM: Voice Understanding and Generation Foundation Models for
Natural Interaction Between Humans and LLMs Keyu An Qian Chen Chong Deng Zhihao Du Changfeng Gao ...Bin Zhang Qinglin Zhang Shiliang Zhang Nan Zhao Siqi Zheng |
![]() BESTOW: Efficient and Streamable Speech Language Model with the Best of
Two Worlds in GPT and T5 Zhehuai Chen He Huang Oleksii Hrinchuk Krishna C. Puvvada Nithin Rao Koluguri Piotr Żelasko Jagadeesh Balam Boris Ginsburg |
![]() Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with
Instruction Tuning Zebang Cheng Zhi-Qi Cheng Jun-Yan He Jingdong Sun Kai Wang Yuxiang Lin Zheng Lian Xiaojiang Peng Alexander G. Hauptmann |