AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios

Interspeech (Interspeech), 2022

1 April 2022

Xu Tan

Papers citing "AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios"

42 / 42 papers shown

Zero-Shot Text-to-Speech for Vietnamese Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Thi Vu L. T. Nguyen Dat Quoc Nguyen 174 2 0 02 Jun 2025
FMSD-TTS: Few-shot Multi-Speaker Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation Yutong Liu Ziyue Zhang Ban Ma-bao Yuqing Cai Yongbin Yu Renzeng Duojie Xiangxiang Wang Fan Gao Cheng Huang Nyima Tashi 225 3 0 20 May 2025
Voice Cloning: Comprehensive Survey Hussam Azzuni Abdulmotaleb El Saddik VLM 318 3 0 01 May 2025
DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis Yinghao Aaron Li Rithesh Kumar Zeyu Jin DiffM 317 0 0 21 Feb 2025
Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement Qianniu Chen Xiaoyang Hao Yangqiu Song Yunxing Liu Li Lu 202 0 0 15 Jan 2025
Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping Minki Kang Wooseok Han Eunho Yang CVBM 150 0 0 31 Dec 2024
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Yinghao Aaron Li Xilin Jiang Cong Han N. Mesgarani DiffM 237 9 0 16 Sep 2024
On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition Nick Rossenbach Ralf Schlüter S. Sakti 164 4 0 31 Jul 2024
Probing the Feasibility of Multilingual Speaker Anonymization Sarina Meyer Florian Lux Ngoc Thang Vu 219 8 0 03 Jul 2024
Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Haoqiu Yan Yongxin Zhu Kai Zheng Bing Liu Haoyu Cao Deqiang Jiang Linli Xu AuLLM 164 11 0 18 Jun 2024
Controlling Emotion in Text-to-Speech with Natural Language Prompts Thomas Bott Florian Lux Ngoc Thang Vu 266 13 0 10 Jun 2024
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes Trung D. Q. Dang David Aponte Dung Tran K. Koishida 227 13 0 05 Jun 2024
Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis Kun Zhou Shengkui Zhao Yukun Ma Chong Zhang Hao Wang Dianwen Ng Chongjia Ni Nguyen Trung Hieu J. Yip Bin Ma 182 6 0 04 Jun 2024
USAT: A Universal Speaker-Adaptive Text-to-Speech Approach Wenbin Wang Yang Song Sanjay Jha 196 16 0 28 Apr 2024
PITCH: AI-assisted Tagging of Deepfake Audio Calls using Challenge-Response Govind Mittal Arthur Jakobsson Kelly O. Marshall Chinmay Hegde Nasir Memon 449 2 0 28 Feb 2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data Mateusz Lajszczak Guillermo Cámbara Yang Li Fatih Beyhan Arent van Korlaar ... Bartosz Putrycz Soledad López Gambino Kayeon Yoo Elena Sokolova Thomas Drugman LM&MA 334 110 0 12 Feb 2024
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation Minsu Kim Jeong Hun Yeo Se Jin Park J. Choi Y. Ro 221 7 0 18 Jan 2024
Pheme: Efficient and Conversational Speech Generation Paweł Budzianowski Taras Sereda Tomasz Cichy Ivan Vulić 157 10 0 05 Jan 2024
Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction AAAI Conference on Artificial Intelligence (AAAI), 2023 Zhaoxi Mu Xinyu Yang Sining Sun Qing Yang SSL 241 12 0 16 Dec 2023
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation Computer Vision and Pattern Recognition (CVPR), 2023 J. Choi Se Jin Park Minsu Kim Y. Ro 315 15 0 05 Dec 2023
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes Pavel Korshunov Haolin Chen Philip N. Garner Sébastien Marcel CVBM 201 11 0 29 Nov 2023
Controllable Generation of Artificial Speaker Embeddings through Discovery of Principal Directions Interspeech (Interspeech), 2023 Florian Lux Pascal Tilli Sarina Meyer Ngoc Thang Vu 158 2 0 26 Oct 2023
The IMS Toucan System for the Blizzard Challenge 2023 Florian Lux Julia Koch Sarina Meyer Thomas Bott Nadja Schauffler Pavel Denisov Antje Schweitzer Ngoc Thang Vu 149 9 0 26 Oct 2023
Large-Scale Automatic Audiobook Creation Interspeech (Interspeech), 2023 Brendan Walsh Mark Hamilton Greg Newby Xi Wang Serena Ruan ... Lei He Shaofei Zhang Eric Dettinger William T. Freeman Markus Weimer 184 2 0 07 Sep 2023
PromptTTS 2: Describing and Generating Voices with Text Prompt Yichong Leng Zhifang Guo Kai Shen Xu Tan Zeqian Ju ... Lei He Xiang-Yang Li Sheng Zhao Tao Qin Jiang Bian VLM DiffM 244 67 0 05 Sep 2023
Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech Interspeech (Interspeech), 2023 Hyungchan Yoon Changhwan Kim Eunwoo Song Hyun-Wook Yoon Hong-Goo Kang 138 2 0 28 Aug 2023
Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations Interspeech (Interspeech), 2023 Wen Wang Yang Song S. Jha 139 14 0 24 Aug 2023
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis International Conference on Learning Representations (ICLR), 2023 Ziyue Jiang Jinglin Liu Yi Ren Jinzheng He Zhe Ye ... Pengfei Wei Chunfeng Wang Xiang Yin Zejun Ma Zhou Zhao 248 71 0 14 Jul 2023
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech Speech Synthesis Workshop (SSW), 2023 Daria Diatlova V. Shutov 172 21 0 28 Jun 2023
MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation Interspeech (Interspeech), 2023 Jun Chen Wei Rao Zehao Wang Jiuxin Lin Yukai Ju Shulin He Yannan Wang Zhiyong Wu 158 19 0 28 Jun 2023
UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data Interspeech (Interspeech), 2023 Heeseung Kim Sungwon Kim Ji-Ran Yeom Sung-Wan Yoon DiffM 161 26 0 28 Jun 2023
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias Ziyue Jiang Yi Ren Zhe Ye Jinglin Liu Chen Zhang ... Rongjie Huang Chunfeng Wang Xiang Yin Zejun Ma Zhou Zhao DiffM 216 95 0 06 Jun 2023
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation Interspeech (Interspeech), 2023 Ambuj Mehrish Abhinav Ramesh Kashyap Yingting Li Navonil Majumder Soujanya Poria 160 13 0 29 May 2023
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models Interspeech (Interspeech), 2023 Minki Kang Wooseok Han Sung Ju Hwang Eunho Yang DiffM 161 27 0 23 May 2023
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios Interspeech (Interspeech), 2023 Yuyue Wang Huanhou Xiao Yihan Wu Ruihua Song 102 0 0 20 May 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2023 Chengyi Wang Sanyuan Chen Yu-Huan Wu Zi-Hua Zhang Long Zhou ... Huaming Wang Jinyu Li Lei He Sheng Zhao Furu Wei 366 996 0 05 Jan 2023
Memories are One-to-Many Mapping Alleviators in Talking Face Generation IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022 Anni Tang Tianyu He Xuejiao Tan Jun Ling Liang Song CVBM 293 27 0 09 Dec 2022
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing AAAI Conference on Artificial Intelligence (AAAI), 2022 Yihan Wu Junliang Guo Xuejiao Tan Chen Zhang Bohan Li Ruihua Song Lei He Sheng Zhao Arul Menezes Jiang Bian 135 25 0 30 Nov 2022
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation Nobuyuki Morioka Heiga Zen Nanxin Chen Yu Zhang Yifan Ding 162 18 0 28 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS Florian Lux Julia Koch Ngoc Thang Vu 198 25 0 21 Oct 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data Sungwon Kim Heeseung Kim Sung-Hoon Yoon DiffM 375 61 0 30 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022 Xu Tan Jiawei Chen Haohe Liu Jian Cong Chen Zhang ... Lei He Frank Soong Tao Qin Sheng Zhao Tie-Yan Liu 293 283 0 09 May 2022