ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2006.04558
  4. Cited By
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

8 June 2020
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
ArXivPDFHTML

Papers citing "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

50 / 754 papers shown
Title
Systematic Inequalities in Language Technology Performance across the
  World's Languages
Systematic Inequalities in Language Technology Performance across the World's Languages
Damián E. Blasi
Antonios Anastasopoulos
Graham Neubig
116
131
0
13 Oct 2021
Pitch Preservation In Singing Voice Synthesis
Pitch Preservation In Singing Voice Synthesis
Shujun Liu
Hai Zhu
Kun Wang
Huajun Wang
21
0
0
11 Oct 2021
PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS
  With Accurate Phoneme Duration Control
PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS With Accurate Phoneme Duration Control
Yunchao He
Jian Luan
Yujun Wang
15
1
0
09 Oct 2021
Environment Aware Text-to-Speech Synthesis
Environment Aware Text-to-Speech Synthesis
Daxin Tan
Guangyan Zhang
Tan Lee
13
3
0
08 Oct 2021
Phone-to-audio alignment without text: A Semi-supervised Approach
Phone-to-audio alignment without text: A Semi-supervised Approach
Jian Zhu
Cong Zhang
David Jurgens
23
36
0
08 Oct 2021
A study on the efficacy of model pre-training in developing neural
  text-to-speech system
A study on the efficacy of model pre-training in developing neural text-to-speech system
Guangyan Zhang
Yichong Leng
Daxin Tan
Ying Qin
Kaitao Song
Xu Tan
Sheng Zhao
Tan Lee
19
2
0
08 Oct 2021
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic
  Voice Over
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over
Junchen Lu
Berrak Sisman
Rui Liu
Mingyang Zhang
Haizhou Li
DiffM
32
19
0
07 Oct 2021
Emphasis control for parallel neural TTS
Emphasis control for parallel neural TTS
Shreyas Seshadri
T. Raitio
D. Castellani
Jiangchuan Li
50
11
0
06 Oct 2021
Hierarchical prosody modeling and control in non-autoregressive parallel
  neural TTS
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS
T. Raitio
Jiangchuan Li
Shreyas Seshadri
32
22
0
06 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative
  Sequence Models
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
Jen-Hao Rick Chang
A. Shrivastava
H. Koppula
Xiaoshuai Zhang
Oncel Tuzel
DiffM
51
16
0
06 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and
  Prosody in Speech Synthesis
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis
Cheng-I Jeff Lai
Erica Cooper
Yang Zhang
Shiyu Chang
Kaizhi Qian
...
Yung-Sung Chuang
Alexander H. Liu
Junichi Yamagishi
David D. Cox
James R. Glass
26
6
0
04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Yi Ren
Jinglin Liu
Zhou Zhao
42
78
0
30 Sep 2021
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context
  Prediction Network
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network
Takaaki Saeki
Shinnosuke Takamichi
Hiroshi Saruwatari
26
3
0
22 Sep 2021
On-device neural speech synthesis
On-device neural speech synthesis
Sivanand Achanta
Albert Antony
L. Golipour
Jiangchuan Li
T. Raitio
...
Francesco Rossi
Jennifer Shi
Jaimin Upadhyay
David Winarsky
Hepeng Zhang
27
17
0
17 Sep 2021
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
Changhan Wang
Wei-Ning Hsu
Yossi Adi
Adam Polyak
Ann Lee
Peng-Jen Chen
Jiatao Gu
J. Pino
VLM
67
32
0
14 Sep 2021
Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Chuanxin Tang
Chong Luo
Zhiyuan Zhao
Dacheng Yin
Yucheng Zhao
Wenjun Zeng
24
9
0
12 Sep 2021
Referee: Towards reference-free cross-speaker style transfer with
  low-quality data for expressive speech synthesis
Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis
Songxiang Liu
Shan Yang
Dan Su
Dong Yu
AI4TS
19
10
0
08 Sep 2021
Text-Free Prosody-Aware Generative Spoken Language Modeling
Text-Free Prosody-Aware Generative Spoken Language Modeling
Eugene Kharitonov
Ann Lee
Adam Polyak
Yossi Adi
Jade Copet
...
Tu Nguyen
M. Rivière
Abdel-rahman Mohamed
Emmanuel Dupoux
Wei-Ning Hsu
30
116
0
07 Sep 2021
One TTS Alignment To Rule Them All
One TTS Alignment To Rule Them All
Rohan Badlani
A. Lancucki
Kevin J. Shih
Rafael Valle
Wei Ping
Bryan Catanzaro
19
82
0
23 Aug 2021
GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints
GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints
Ji-Hoon Kim
Sang-Hoon Lee
Ji-Hyun Lee
Hong G Jung
Seong-Whan Lee
23
6
0
16 Aug 2021
Masked Acoustic Unit for Mispronunciation Detection and Correction
Masked Acoustic Unit for Mispronunciation Detection and Correction
Zhan Zhang
Yuehai Wang
Jianyi Yang
25
3
0
12 Aug 2021
An Empirical Study on End-to-End Singing Voice Synthesis with
  Encoder-Decoder Architectures
An Empirical Study on End-to-End Singing Voice Synthesis with Encoder-Decoder Architectures
Dengfeng Ke
Yuxing Lu
Xudong Liu
Yanyan Xu
Jing Sun
Cheng-Hao Cai
28
0
0
06 Aug 2021
Applying the Information Bottleneck Principle to Prosodic Representation
  Learning
Applying the Information Bottleneck Principle to Prosodic Representation Learning
Guangyan Zhang
Ying Qin
Daxin Tan
Tan Lee
22
4
0
05 Aug 2021
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive
  Speech Synthesis
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis
Julian Zaïdi
Hugo Seuté
Benjamin van Niekerk
M. Carbonneau
20
20
0
04 Aug 2021
Creation and Detection of German Voice Deepfakes
Creation and Detection of German Voice Deepfakes
Vanessa Barnekow
Dominik Binder
Niclas Kromrey
Pascal Munaretto
A. Schaad
Felix Schmieder
16
1
0
02 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
24
7
0
01 Aug 2021
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal
  Language
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language
Huiyan Li
Haohong Lin
You Wang
Hengyang Wang
Ming Zhang
Han Gao
Qing Ai
Zhiyuan Luo
Guang Li
21
11
0
31 Jul 2021
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI
Joanna Rownicka
Kilian Sprenkamp
A. Tripiana
Volodymyr Gromoglasov
Timo P. Kunz
19
0
0
21 Jul 2021
Translatotron 2: High-quality direct speech-to-speech translation with
  voice preservation
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
Direct speech-to-speech translation with discrete units
Direct speech-to-speech translation with discrete units
Ann Lee
Peng-Jen Chen
Changhan Wang
Jiatao Gu
Sravya Popuri
...
Yossi Adi
Qing He
Yun Tang
J. Pino
Wei-Ning Hsu
25
180
0
12 Jul 2021
VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive
  Text-to-Speech Synthesis
VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis
Hui Lu
Zhiyong Wu
Xixin Wu
Xu Li
Shiyin Kang
Xunying Liu
H. Meng
20
12
0
07 Jul 2021
Msdtron: a high-capability multi-speaker speech synthesis system for
  diverse data using characteristic information
Msdtron: a high-capability multi-speaker speech synthesis system for diverse data using characteristic information
Qinghua Wu
Quanbo Shen
Jian Luan
YuJun Wang
28
3
0
07 Jul 2021
AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style
AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style
Yuzi Yan
Xu Tan
Bohan Li
Guangyan Zhang
Tao Qin
Sheng Zhao
Yuan-Chung Shen
Weiqiang Zhang
Tie-Yan Liu
9
20
0
06 Jul 2021
EditSpeech: A Text Based Speech Editing System Using Partial Inference
  and Bidirectional Fusion
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion
Daxin Tan
Liqun Deng
Y. Yeung
Xin Jiang
Xiao Chen
Tan Lee
26
37
0
04 Jul 2021
On the Generative Utility of Cyclic Conditionals
On the Generative Utility of Cyclic Conditionals
Chang-Shu Liu
Haoyue Tang
Tao Qin
Jintao Wang
Tie-Yan Liu
37
3
0
30 Jun 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Ammar Abbas
Bajibabu Bollepalli
Alexis Moinet
Arnaud Joly
Penny Karanasou
Peter Makarov
Simon Slangens
S. Karlapati
Thomas Drugman
16
0
0
29 Jun 2021
A Survey on Neural Speech Synthesis
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech
  Synthesis
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis
Jinhyeok Yang
Jaesung Bae
Taejun Bak
Young-Ik Kim
Hoon-Young Cho
23
36
0
29 Jun 2021
FastPitchFormant: Source-filter based Decomposed Modeling for Speech
  Synthesis
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
Taejun Bak
Jaesung Bae
Hanbin Bae
Young-Ik Kim
Hoon-Young Cho
17
16
0
29 Jun 2021
Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource
  Highly Expressive Speech
Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech
Raahil Shah
Kamil Pokora
Abdelhamid Ezzerg
V. Klimkov
Goeric Huybrechts
Bartosz Putrycz
Daniel Korzekwa
Thomas Merritt
22
25
0
24 Jun 2021
Distilling the Knowledge from Conditional Normalizing Flows
Distilling the Knowledge from Conditional Normalizing Flows
Dmitry Baranchuk
Vladimir Aliev
Artem Babenko
BDL
28
2
0
24 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style
  Control
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control
M. Kang
Sungjae Kim
Injung Kim
21
3
0
21 Jun 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational
  Auto-Encoder For High Fidelity Flow-based Speech Synthesis
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis
Jian Cong
Shan Yang
Lei Xie
Dan Su
DRL
16
29
0
21 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
16
88
0
17 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional
  Text-to-Speech Model
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model
Chenye Cui
Yi Ren
Jinglin Liu
Feiyang Chen
Rongjie Huang
Ming Lei
Zhou Zhao
18
34
0
17 Jun 2021
WSRGlow: A Glow-based Waveform Generative Model for Audio
  Super-Resolution
WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution
Kexun Zhang
Yi Ren
Changliang Xu
Zhou Zhao
31
29
0
16 Jun 2021
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
Rohola Zandie
Mohammad H. Mahoor
Julia Madsen
Eshrat S. Emamian
21
24
0
15 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
D. Mohan
Qinmin Hu
Tian Huey Teh
Alexandra Torresquintero
C. Wallis
Marlene Staib
Lorenzo Foglianti
Jiameng Gao
Simon King
20
16
0
15 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram
  Discriminators for High-Fidelity Waveform Generation
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Won Jang
D. Lim
Jaesam Yoon
Bongwan Kim
Juntae Kim
18
125
0
15 Jun 2021
PriorGrad: Improving Conditional Denoising Diffusion Models with
  Data-Dependent Adaptive Prior
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior
Sang-gil Lee
Heeseung Kim
Chaehun Shin
Xu Tan
Chang-Shu Liu
Qi Meng
Tao Qin
Wei Chen
Sung-Hoon Yoon
Tie-Yan Liu
DiffM
18
81
0
11 Jun 2021
Previous
123...13141516
Next