Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

16 December 2017
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
Zongheng Yang
Zhiwen Chen
Yu Zhang
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu

Papers citing "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions"

50 / 1,276 papers shown
Title
Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation
Xiaoyuan Yi
Hyeonseung Lee
Wenhao Li
Hyung Yong Kim
Nam Soo Kim
84
22
0
25 Jul 2020
A Transfer Learning End-to-End Arabic Text-To-Speech (TTS) Deep Architecture
Fady K. Fahmy
M. Khalil
Hazem M. Abbas
53
21
0
22 Jul 2020
Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition
Ludwig Kurzinger
Edgar Ricardo Chavez Rosas
Lujun Li
Tobias Watzel
Gerhard Rigoll
AAML
50
4
0
21 Jul 2020
Neural MOS Prediction for Synthesized Speech Using Multi-Task Learning With Spoofing Detection and Spoofing Type Classification
Yeunju Choi
Youngmoon Jung
Hoirin Kim
105
27
0
16 Jul 2020
Generating Visually Aligned Sound from Videos
Peihao Chen
Yang Zhang
Mingkui Tan
Hongdong Xiao
Deng Huang
Chuang Gan
VGen
114
97
0
14 Jul 2020
Xiaomingbot: A Multilingual Robot News Reporter
Runxin Xu
Jun Cao
Mingxuan Wang
Jiaze Chen
Hao Zhou
...
Xiang Yin
Xijin Zhang
Songcheng Jiang
Yuxuan Wang
Lei Li
77
11
0
12 Jul 2020
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network
Yi-Chiao Wu
Tomoki Hayashi
Patrick Lumban Tobing
Kazuhiro Kobayashi
Tomoki Toda
50
18
0
11 Jul 2020
DeepSinger: Singing Voice Synthesis with Data Mined From the Web
Yi Ren
Xu Tan
Tao Qin
Jian Luan
Zhou Zhao
Tie-Yan Liu
112
73
0
09 Jul 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhiwen Chen
MoE
186
1,199
0
30 Jun 2020
Prosodic Prominence and Boundaries in Sequence-to-Sequence Speech Synthesis
Antti Suni
Sofoklis Kakouros
M. Vainio
J. Šimko
68
18
0
29 Jun 2020
Audeo: Audio Generation for a Silent Performance Video
Kun Su
Xiulong Liu
Eli Shlizerman
VGen
87
69
0
23 Jun 2020
Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion
Narjes Bozorg
Michael T. Johnson
41
1
0
22 Jun 2020
Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer
Jie Wu
Jian Luan
73
26
0
18 Jun 2020
Implicit Neural Representations with Periodic Activation Functions
Vincent Sitzmann
Julien N. P. Martel
Alexander W. Bergman
David B. Lindell
Gordon Wetzstein
AI4TS
242
2,585
0
17 Jun 2020
Adversarial representation learning for private speech generation
David Ericsson
Adam Östberg
Edvin Listo Zec
John Martinsson
Olof Mogren
53
17
0
16 Jun 2020
Neural voice cloning with a few low-quality samples
Sunghee Jung
Hoi-Rim Kim
37
3
0
12 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
117
342
0
11 Jun 2020
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Jiaqi Su
Zeyu Jin
Adam Finkelstein
69
140
0
10 Jun 2020
Deep generative models for musical audio synthesis
M. Huzaifah
L. Wyse
210
20
0
10 Jun 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian Chen
Xu Tan
Yi Ren
Jin Xu
Hao Sun
Sheng Zhao
Tao Qin
Tie-Yan Liu
65
110
0
08 Jun 2020
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis
Hyeongju Kim
Hyeongseung Lee
Woohyun Kang
Sung Jun Cheon
Byoung Jin Choi
N. Kim
67
12
0
08 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
155
1,415
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
85
187
0
05 Jun 2020
An ASR Guided Speech Intelligibility Measure for TTS Model Selection
Arun Baby
Saranya Vinnaitherthan
Nagaraj Adiga
Pranav Jawale
Sumukh Badam
Sharath Adavanne
Srikanth Konjeti
49
7
0
02 Jun 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Da-Yi Wu
Yi-Hsuan Yang
GAN
74
8
0
28 May 2020
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices
Run Wang
Felix Juefei Xu
Yihao Huang
Qing Guo
Xiaofei Xie
Lei Ma
Yang Liu
AAML
80
107
0
28 May 2020
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems
Phan Huy Kinh
V. Phung
Anh-Tuan Dinh
Quoc Bao Nguyen
27
1
0
26 May 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
130
498
0
22 May 2020
NAUTILUS: a Versatile Voice Cloning System
Hieu-Thi Luong
Junichi Yamagishi
100
53
0
22 May 2020
Pitchtron: Towards audiobook generation from ordinary people's voices
Sunghee Jung
Hoi-Rim Kim
41
5
0
21 May 2020
Cross-lingual Multispeaker Text-to-Speech under Limited-Data Scenario
Zexin Cai
Yaogen Yang
Ming Li
26
9
0
21 May 2020
Conversational End-to-End TTS for Voice Agent
Haohan Guo
Shaofei Zhang
Frank Soong
Lei He
Lei Xie
84
69
0
21 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
76
31
0
20 May 2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech
Wenjie Li
Benlai Tang
Xiang Yin
Yushi Zhao
Wei Li
Kang Wang
Hao Huang
Yuxuan Wang
Zejun Ma
70
13
0
19 May 2020
A Cyclical Post-filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-speech Systems
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
Tomoki Toda
50
5
0
18 May 2020
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search
Naihan Li
Shujie Liu
Yanqing Liu
Sheng Zhao
Ming-Yuan Liu
Ming Zhou
50
6
0
18 May 2020
Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization
Jen-Yu Liu
Yu-Hua Chen
Yin-Cheng Yeh
Yi-Hsuan Yang
GAN
71
35
0
18 May 2020
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding
Seungwoo Choi
Seungju Han
Dongyoung Kim
S. Ha
91
67
0
18 May 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
Tomoki Toda
ViT
94
30
0
18 May 2020
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
71
113
0
17 May 2020
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
Tao Tu
Yuan-Jui Chen
Alexander H. Liu
Hung-yi Lee
54
7
0
16 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment
D. Lim
Won Jang
Gyeonghwan O
Heayoung Park
Bongwan Kim
Jaesam Yoon
71
37
0
15 May 2020
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU
Po-Chun Hsu
Hung-yi Lee
44
16
0
15 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation
A. Laptev
Roman Korostik
A. Svischev
A. Andrusenko
Ivan Medennikov
S. Rybin
81
61
0
14 May 2020
S2IGAN: Speech-to-Image Generation via Adversarial Learning
Xinsheng Wang
Tingting Qiao
Jihua Zhu
Alan Hanjalic
O. Scharenborg
VLM, GAN
73
17
0
14 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle
Kevin J. Shih
R. Prenger
Bryan Catanzaro
96
121
0
12 May 2020
AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN
Zewang Zhang
Qiao Tian
Heng Lu
Ling-Hao Chen
Shan Liu
62
27
0
12 May 2020
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction
Qiao Tian
Zewang Zhang
Heng Lu
Linghui Chen
Shan Liu
69
22
0
12 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
70
32
0
12 May 2020
TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese
Edresson Casanova
A. Júnior
C. Shulby
F. S. Oliveira
João Paulo Teixeira
M. Ponti
S. Aluísio
73
24
0
11 May 2020