Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.00672
Cited By
v1
v2
v3 (latest)
Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS
3 June 2019
Mutian He
Yan Deng
Lei He
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS"
33 / 33 papers shown
Title
Singing Voice Synthesis Based on a Musical Note Position-Aware Attention Mechanism
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
K. Tokuda
62
2
0
28 Dec 2022
OverFlow: Putting flows on top of neural transducers for better TTS
Shivam Mehta
Ambika Kirkland
Harm Lameris
Jonas Beskow
Éva Székely
G. Henter
AI4TS
107
13
0
13 Nov 2022
Structured State Space Decoder for Speech Recognition and Synthesis
Koichi Miyazaki
Masato Murata
Tomoki Koriyama
104
13
0
31 Oct 2022
A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS
Haohan Guo
Fenglong Xie
Frank Soong
Xixin Wu
Helen M. Meng
78
12
0
22 Sep 2022
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention
Artem Gorodetskii
Ivan Ozhiganov
115
2
0
25 Jan 2022
More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech
Michael Hassid
Michelle Tadmor Ramanovich
Brendan Shillingford
Miaosen Wang
Ye Jia
Tal Remez
DiffM
72
18
0
19 Nov 2021
A study on the efficacy of model pre-training in developing neural text-to-speech system
Guangyan Zhang
Yichong Leng
Daxin Tan
Ying Qin
Kaitao Song
Xu Tan
Sheng Zhao
Tan Lee
58
2
0
08 Oct 2021
On-device neural speech synthesis
Sivanand Achanta
Albert Antony
L. Golipour
Jiangchuan Li
T. Raitio
...
Francesco Rossi
Jennifer Shi
Jaimin Upadhyay
David Winarsky
Hepeng Zhang
108
17
0
17 Sep 2021
Neural HMMs are all you need (for high-quality attention-free TTS)
Shivam Mehta
Éva Székely
Jonas Beskow
G. Henter
102
18
0
30 Aug 2021
Combining speakers of multiple languages to improve quality of neural voices
Javier Latorre
Charlotte Bailleul
Tuuli H. Morrill
Alistair Conkie
Y. Stylianou
64
8
0
17 Aug 2021
Enhancing audio quality for expressive Neural Text-to-Speech
Abdelhamid Ezzerg
Adam Gabry's
Bartosz Putrycz
Daniel Korzekwa
Daniel Sáez-Trigueros
David McHardy
Kamil Pokora
Jakub Lachowicz
Jaime Lorenzo-Trueba
V. Klimkov
130
6
0
13 Aug 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
105
73
0
19 Jul 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Ammar Abbas
Bajibabu Bollepalli
Alexis Moinet
Arnaud Joly
Penny Karanasou
Peter Makarov
Simon Slangens
S. Karlapati
Thomas Drugman
67
0
0
29 Jun 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
133
359
0
29 Jun 2021
Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech
Raahil Shah
Kamil Pokora
Abdelhamid Ezzerg
V. Klimkov
Goeric Huybrechts
Bartosz Putrycz
Daniel Korzekwa
Thomas Merritt
64
26
0
24 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
94
25
0
20 Apr 2021
Exploring Machine Speech Chain for Domain Adaptation and Few-Shot Speaker Adaptation
Fengpeng Yue
Yan Deng
Lei He
Tom Ko
70
8
0
08 Apr 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition
Hirofumi Inaguma
Tatsuya Kawahara
125
14
0
28 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention
Peng Liu
Yuewen Cao
Songxiang Liu
Na Hu
Guangzhi Li
Chao Weng
Jane Polak Scowcroft
95
22
0
12 Feb 2021
s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis
Xi Wang
Huaiping Ming
Lei He
Frank Soong
43
5
0
17 Nov 2020
FeatherTTS: Robust and Efficient attention based Neural TTS
Qiao Tian
Zewang Zhang
Chao-Jung Liu
Heng Lu
Linghui Chen
Bin Wei
P. He
Shan Liu
69
4
0
02 Nov 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Ye Jia
Ron J. Weiss
Yonghui Wu
DRL
76
103
0
22 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
106
112
0
08 Oct 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features
T. Raitio
Ramya Rasipuram
D. Castellani
78
66
0
14 Sep 2020
Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS
Rui Liu
Berrak Sisman
F. Bao
Guanglai Gao
Haizhou Li
41
18
0
11 Aug 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
112
73
0
04 Aug 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian Chen
Xu Tan
Yi Ren
Jin Xu
Hao Sun
Sheng Zhao
Tao Qin
Tie-Yan Liu
65
110
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
85
187
0
05 Jun 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
76
31
0
20 May 2020
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding
Seungwoo Choi
Seungju Han
Dongyoung Kim
S. Ha
91
67
0
18 May 2020
Teacher-Student Training for Robust Tacotron-based TTS
Rui Liu
Berrak Sisman
Jingdong Li
F. Bao
Guanglai Gao
Haizhou Li
109
38
0
07 Nov 2019
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
25
2
0
28 Oct 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
106
114
0
23 Oct 2019
1