Title | Authors |
---|---|
Speech-Mamba: Long-Context Speech Recognition with Selective State Spaces Models | Xiaoxue Gao, Nancy F. Chen |
EMMeTT: Efficient Multimodal Machine Translation Training | Piotr Żelasko, Zhehuai Chen, Mengru Wang, Daniel Galvez, Oleksii Hrinchuk, Shuoyang Ding, Ke Hu, Jagadeesh Balam, Vitaly Lavrukhin, Boris Ginsburg |
AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost | Ahmet Gündüz, Yunsu Kim, Kamer Ali Yuksel, Mohamed Al-Badrashiny, Thiago Castro Ferreira, Hassan Sawaf |
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition | Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, ..., Wanyi Zhang, Yang Zhang, Yawei Zhang, Yijie Zheng, Ming Zou |
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5 | Zhehuai Chen, He Huang, Oleksii Hrinchuk, Krishna C. Puvvada, Nithin Rao Koluguri, Piotr Żelasko, Jagadeesh Balam, Boris Ginsburg |