1809.08895
Cited By
v3 (latest)
Neural Speech Synthesis with Transformer Network
19 September 2018
Naihan Li
Shujie Liu
Yanqing Liu
Sheng Zhao
Ming-Yuan Liu
M. Zhou
Papers citing "Neural Speech Synthesis with Transformer Network"
37 / 37 papers shown
Title
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT
Dongyang Dai
Zhiyong Wu
Shiyin Kang
Xixin Wu
Jia Jia
Dan Su
Dong Yu
Helen Meng
95
26
0
03 Jan 2025
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder
Yusuke Yasuda
Tomoki Toda
DiffM
79
8
0
16 Dec 2022
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language
Yusuke Yasuda
Tomoki Toda
121
10
0
16 Dec 2022
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B. W. Schuller
Haizhou Li
87
47
0
11 Aug 2022
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
85
63
0
15 Oct 2021
Controllable Context-aware Conversational Speech Synthesis
Jian Cong
Shan Yang
Na Hu
Guangzhi Li
Lei Xie
Jane Polak Scowcroft
73
30
0
21 Jun 2021
Phone-Level Prosody Modelling with GMM-Based MDN for Diverse and Controllable Speech Synthesis
Chenpeng Du
K. Yu
154
20
0
27 May 2021
Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram
Shengkui Zhao
Hao Wang
Trung Hieu Nguyen
B. Ma
51
20
0
03 Feb 2021
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans
Shinji Watanabe
Florian Boyer
Xuankai Chang
Pengcheng Guo
Tomoki Hayashi
...
Shigeki Karita
Chenda Li
Jing Shi
Aswin Shanmugam Subramanian
Wangyou Zhang
VLM
108
38
0
23 Dec 2020
Few Shot Adaptive Normalization Driven Multi-Speaker Speech Synthesis
Neeraj Kumar
Srishti Goel
Ankur Narang
Brejesh Lall
68
5
0
14 Dec 2020
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture
Chenfeng Miao
Shuang Liang
Zhencheng Liu
Minchuan Chen
Jun Ma
Shaojun Wang
Jing Xiao
67
38
0
07 Dec 2020
Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion
Shengkui Zhao
Trung Hieu Nguyen
Hao Wang
B. Ma
60
25
0
16 Oct 2020
Exploration of End-to-end Synthesisers for Zero Resource Speech Challenge 2020
Karthik Pandia D.S.
Anusha Prakash
M. M.
H. Murthy
42
4
0
10 Sep 2020
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems
Phan Huy Kinh
V. Phung
Anh-Tuan Dinh
Quoc Bao Nguyen
27
1
0
26 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
70
32
0
12 May 2020
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise
Shan Yang
Yuxuan Wang
Lei Xie
66
10
0
28 Apr 2020
Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System
V. Phung
Phan Huy Kinh
Anh-Tuan Dinh
Quoc Bao Nguyen
35
5
0
20 Apr 2020
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
Jian Zhu
62
8
0
23 Dec 2019
Emotional Voice Conversion using Multitask Learning with Text-to-speech
Tae-Ho Kim
Sungjae Cho
Shinkook Choi
Sejik Park
Soo-Young Lee
92
40
0
11 Nov 2019
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
25
2
0
28 Oct 2019
Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks
Kazuhiro Nakamura
Shinji Takaki
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
84
19
0
24 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
Tomoki Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
93
205
0
24 Oct 2019
Attention Forcing for Sequence-to-sequence Model Training
Qingyun Dou
Yiting Lu
Joshua Efiong
Mark Gales
62
6
0
26 Sep 2019
DurIAN: Duration Informed Attention Network For Multimodal Synthesis
Chengzhu Yu
Heng Lu
Na Hu
Meng Yu
Chao Weng
...
Deyi Tuo
Shiyin Kang
Guangzhi Lei
Jane Polak Scowcroft
Dong Yu
CVBM
85
118
0
04 Sep 2019
Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
58
8
0
30 Aug 2019
Maximizing Mutual Information for Tacotron
Peng Liu
Xixin Wu
Shiyin Kang
Guangzhi Li
Jane Polak Scowcroft
Dong Yu
86
16
0
30 Aug 2019
Forward-Backward Decoding for Regularizing End-to-End TTS
Yibin Zheng
Xi Wang
Lei He
Shifeng Pan
Frank Soong
Zhengqi Wen
J. Tao
48
13
0
18 Jul 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
95
102
0
13 May 2019
The Zero Resource Speech Challenge 2019: TTS without T
Ewan Dunbar
Robin Algayres
Julien Karadayi
Mathieu Bernard
Juan Benjumea
...
Lucas Ondel
A. Black
Laurent Besacier
S. Sakti
Emmanuel Dupoux
94
117
0
25 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
95
47
0
09 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
164
959
0
05 Apr 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data
N. Prateek
Mateusz Lajszczak
Roberto Barra-Chicote
Thomas Drugman
Jaime Lorenzo-Trueba
Thomas Merritt
S. Ronanki
Trevor Wood
84
30
0
04 Apr 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis
Yanyao Bian
Changbin Chen
Yongguo Kang
Zhenglin Pan
77
46
0
04 Apr 2019
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet
Mingyang Zhang
Xin Wang
Fuming Fang
Haizhou Li
Junichi Yamagishi
70
50
0
29 Mar 2019
FPETS : Fully Parallel End-to-End Text-to-Speech System
Dabiao Ma
Zhiba Su
Wenxuan Wang
Yuhao Lu
58
6
0
12 Dec 2018
Learning latent representations for style control and transfer in end-to-end speech synthesis
Ya-Jie Zhang
Shifeng Pan
Lei He
Zhenhua Ling
BDL
SSL
DRL
94
229
0
11 Dec 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language
Yusuke Yasuda
Xin Wang
Shinji Takaki
Junichi Yamagishi
63
87
0
29 Oct 2018