Conversational End-to-End TTS for Voice Agent

v1v2 (latest)

Conversational End-to-End TTS for Voice Agent

21 May 2020

Lei Xie

ArXiv (abs)PDF HTML

Papers citing "Conversational End-to-End TTS for Voice Agent"

17 / 17 papers shown

Title
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching Leying Zhang Y. Qian Xiaofei Wang Manthan Thakker Dongmei Wang ... Haibin Wu Yuxuan Hu Jinyu Li Yanmin Qian Sheng Zhao 43 0 0 01 Jun 2025
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis Rui Liu Zhenqi Jia F. Bao Hong Li 77 2 0 11 Jan 2025
The Codec Language Model-based Zero-Shot Spontaneous Style TTS System for CoVoC Challenge 2024 Shuoyi Zhou Yixuan Zhou Weiqing Li Jun Chen Runchuan Ye Weihao Wu Zijian Lin Shun Lei Zhiyong Wu 172 1 0 02 Dec 2024
FireRedTTS: A Foundation Text-To-Speech Framework for Industry-Level Generative Speech Applications Hao-Han Guo Kun Liu Fei-Yu Shen Yi-Chen Wu Xu Tang Kun Xie Kai-Tuo Xu Kun Xie Kai-Tuo Xu 92 28 0 05 Sep 2024
Generative Expressive Conversational Speech Synthesis Rui Liu Yifan Hu Yi Ren Xiang Yin Haizhou Li 119 6 0 31 Jul 2024
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling Rui Liu Yifan Hu Yi Ren Xiang Yin Haizhou Li 97 19 0 19 Dec 2023
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading Yujia Xiao Shaofei Zhang Xi Wang Xuejiao Tan Lei He Sheng Zhao Frank Soong Tan Lee 46 6 0 03 Jul 2023
M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis Jinlong Xue Yayue Deng Fengping Wang Ya Li Yingming Gao J. Tao Jianqing Sun Jiaen Liang 68 10 0 03 May 2023
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis Yifan Hu Rui Liu Guanglai Gao Haizhou Li 383 8 0 27 Oct 2022
Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations Haohan Guo Fenglong Xie Xixin Wu Hui Lu Helen Meng 328 3 0 27 Oct 2022
ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS Liumeng Xue Frank Soong Shaofei Zhang Linfu Xie 73 23 0 14 Sep 2022
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue Kentaro Mitsui Tianyu Zhao Kei Sawada Yukiya Hono Yoshihiko Nankaku K. Tokuda 67 14 0 24 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History Yuto Nishimura Yuki Saito Shinnosuke Takamichi Kentaro Tachibana Hiroshi Saruwatari AI4TS 59 8 0 16 Jun 2022
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent Yuki Saito Yuto Nishimura Shinnosuke Takamichi Kentaro Tachibana Hiroshi Saruwatari 126 12 0 28 Mar 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis Shinnosuke Takamichi Wataru Nakata Naoko Tanji Hiroshi Saruwatari AuLLM 77 7 0 26 Jan 2022
Controllable Context-aware Conversational Speech Synthesis Jian Cong Shan Yang Na Hu Guangzhi Li Lei Xie Jane Polak Scowcroft 73 30 0 21 Jun 2021
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling Jingbei Li Yi Meng Chenyi Li Zhiyong Wu Helen Meng Chao Weng Jane Polak Scowcroft 93 24 0 11 Jun 2021