Title |
---|
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Shengpeng Ji Ziyue Jiang Xize Cheng Yifu Chen Minghui Fang ...Rongjie Huang Yidi Jiang Qian Chen Zhou Zhao |
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec Shengpeng Ji Jia-li Zuo Minghui Fang Siqi Zheng Qian Chen ...Ziyue Jiang Hai Huang Xize Cheng Rongjie Huang Zhou Zhao |
UniAudio: An Audio Foundation Model Toward Universal Audio Generation Dongchao Yang Jinchuan Tian Xuejiao Tan Rongjie Huang Songxiang Liu ...Jiang Bian Xixin Wu Zhou Zhao Shinji Watanabe Helen M. Meng |
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts Shunwei Lei Yixuan Zhou Liyang Chen Dan Luo Zhiyong Wu ...Shiyin Kang Tao Jiang Yahui Zhou Yuxing Han Helen M. Meng |