One TTS Alignment To Rule Them All

23 August 2021

Bryan Catanzaro

Papers citing "One TTS Alignment To Rule Them All"

Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance Shehzeen Samarah Hussain Paarth Neekhara Xuesong Yang Edresson Casanova Subhankar Ghosh Mikyas T. Desta Roy Fejgin Rafael Valle Jason Chun Lok Li 59 2 0 07 Feb 2025
MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation Haibo Tong Zhaoyang Wang Z. Chen Haonian Ji Shi Qiu ... Peng Xia Mingyu Ding Rafael Rafailov Chelsea Finn Huaxiu Yao EGVM VGen 102 2 0 03 Feb 2025
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation Ji-Hoon Kim Hong-Sun Yang Yoon-Cheol Ju Il-Hwan Kim Byeong-Yeol Kim Joon Son Chung BDL 49 0 0 31 Dec 2024
E1 TTS: Simple and Fast Non-Autoregressive TTS Zhijun Liu Shuai Wang Pengcheng Zhu Mengxiao Bi Haizhou Li VLM DiffM 38 3 0 14 Sep 2024
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation C. Han Seokgi Lee Gyuhyeon Nam Gyeongsu Chae DiffM 121 0 0 14 Sep 2024
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment Paarth Neekhara Shehzeen Samarah Hussain Subhankar Ghosh Jason Chun Lok Li Rafael Valle Rohan Badlani Boris Ginsburg 52 11 0 25 Jun 2024
The DeepZen Speech Synthesis System for Blizzard Challenge 2023 C. Veaux R. Maia Spyridoula Papendreou 20 1 0 30 Aug 2023
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis Siyang Wang G. Henter Joakim Gustafson Éva Székely 42 5 0 11 Jul 2023
PITS: Variational Pitch Inference without Fundamental Frequency for End-to-End Pitch-controllable TTS Junhyeok Lee Wonbin Jung Hyunjae Cho Jaeyeon Kim Jaehwan Kim 17 3 0 24 Feb 2023
RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis Shinhyeok Oh HyeongRae Noh Yoonseok Hong Insoo Oh 18 0 0 15 Dec 2022
Towards Building Text-To-Speech Systems for the Next Billion Users Gokul Karthik Kumar Praveen S. V. Pratyush Kumar Mitesh M. Khapra Karthik Nandakumar 36 18 0 17 Nov 2022
OverFlow: Putting flows on top of neural transducers for better TTS Shivam Mehta Ambika Kirkland Harm Lameris Jonas Beskow Éva Székely G. Henter AI4TS 32 12 0 13 Nov 2022
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder J. Melechovský Ambuj Mehrish Berrak Sisman Dorien Herremans 21 6 0 07 Nov 2022
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers Cheng-Ping Hsieh Subhankar Ghosh Boris Ginsburg 41 18 0 01 Nov 2022
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis Yifan Hu Rui Liu Guanglai Gao Haizhou Li 72 7 0 27 Oct 2022
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion Dacheng Yin Chuanxin Tang Yanqing Liu Xiaoqiang Wang Zhiyuan Zhao Yucheng Zhao Zhiwei Xiong Sheng Zhao Chong Luo 18 12 0 28 Jun 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech D. Lim Sunghee Jung Eesung Kim 17 51 0 31 Mar 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs Songxiang Liu Dan Su Dong Yu DiffM 68 65 0 28 Jan 2022