2005.11129
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
22 May 2020
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
Papers citing
"Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search"
50 / 286 papers shown
Title
Expressive, Variable, and Controllable Duration Modelling in TTS
Ammar Abbas
Thomas Merritt
Alexis Moinet
S. Karlapati
Ewa Muszyńska
Simon Slangen
Elia Gatti
Thomas Drugman
22
10
0
28 Jun 2022
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer
S. Karlapati
Penny Karanasou
Mateusz Lajszczak
Ammar Abbas
Alexis Moinet
Peter Makarov
Raymond Li
Arent van Korlaar
Simon Slangen
Thomas Drugman
14
15
0
27 Jun 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Florian Lux
Julia Koch
Ngoc Thang Vu
32
19
0
24 Jun 2022
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue
Kentaro Mitsui
Tianyu Zhao
Kei Sawada
Yukiya Hono
Yoshihiko Nankaku
K. Tokuda
20
14
0
24 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
50
525
0
13 Jun 2022
Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models
Alon Levkovitch
Eliya Nachmani
Lior Wolf
DiffM
19
29
0
05 Jun 2022
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Ziyue Jiang
Zhe Su
Zhou Zhao
Qian Yang
Yi Ren
Jinglin Liu
Zhe Ye
24
4
0
05 Jun 2022
Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish
A. Oktem
Rodolfo Zevallos
Yasmin Moslem
Günes Öztürk
Karen Sarhon
18
0
0
31 May 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
33
38
0
30 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
196
52
0
30 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
115
34
0
15 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
38
211
0
09 May 2022
Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss
Efthymios Georgiou
Kosmas Kritsis
Georgios Paraskevopoulos
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
16
3
0
28 Apr 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
Zhenhui Ye
Zhou Zhao
Yi Ren
Fei Wu
21
27
0
25 Apr 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Rongjie Huang
Max W. Y. Lam
J. Wang
Dan Su
Dong Yu
Yi Ren
Zhou Zhao
DiffM
28
165
0
21 Apr 2022
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
Yisheng Xiao
Lijun Wu
Junliang Guo
Juntao Li
M. Zhang
Tao Qin
Tie-Yan Liu
3DV
MedIm
AI4CE
30
82
0
20 Apr 2022
Music Source Separation with Generative Flow
Ge Zhu
Jordan Darefsky
Fei Jiang
A. Selitskiy
Z. Duan
15
6
0
19 Apr 2022
Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech
Jaesung Bae
Jinhyeok Yang
Taejun Bak
Young-Sun Joo
DiffM
16
6
0
08 Apr 2022
Heterogeneous Target Speech Separation
Hyunjae Cho
Wonbin Jung
Junhyeok Lee
Paris Smaragdis
Sanghyun Woo
46
26
0
07 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
14
0
0
05 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Fan Wang
Po-Chun Hsu
Da-Rong Liu
Hung-yi Lee
13
0
0
01 Apr 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
10
30
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
14
51
0
31 Mar 2022
Learn2Sing 2.0: Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher
Heyang Xue
Xinsheng Wang
Yongmao Zhang
Lei Xie
Pengcheng Zhu
Mengxiao Bi
DiffM
22
11
0
30 Mar 2022
Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation
Rendi Chevi
Radityo Eko Prasojo
Alham Fikri Aji
Andros Tjandra
S. Sakti
VLM
6
3
0
29 Mar 2022
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion
Edresson Casanova
C. Shulby
Alexander Korolev
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
21
11
0
29 Mar 2022
Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Sunghwan Ahn
Joun Yeop Lee
N. Kim
36
25
0
29 Mar 2022
A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis
Rishabh Jain
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
17
14
0
22 Mar 2022
AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling
Bac Nguyen
Fabien Cardinaux
Stefan Uhlich
14
2
0
21 Mar 2022
Text-free non-parallel many-to-many voice conversion using normalising flows
Thomas Merritt
Abdelhamid Ezzerg
Piotr Bilinski
Magdalena Proszewska
Kamil Pokora
Roberto Barra-Chicote
Daniel Korzekwa
28
14
0
15 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
20
29
0
07 Mar 2022
Variational Auto-Encoder based Mandarin Speech Cloning
Qingyu Xing
Xiaohan Ma
13
0
0
06 Mar 2022
Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows
Kevin J. Shih
Rafael Valle
Rohan Badlani
J. F. Santos
Bryan Catanzaro
25
4
0
03 Mar 2022
A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS
Haohan Guo
Hui Lu
Xixin Wu
H. Meng
94
7
0
02 Mar 2022
Revisiting Over-Smoothness in Text to Speech
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
65
61
0
26 Feb 2022
ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech
Yi Ren
Ming Lei
Zhiying Huang
Shi-Rui Zhang
Qian Chen
Zhijie Yan
Zhou Zhao
32
41
0
16 Feb 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Songxiang Liu
Dan Su
Dong Yu
DiffM
68
65
0
28 Jan 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
179
378
0
04 Dec 2021
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffM
BDL
19
107
0
23 Nov 2021
Personalized One-Shot Lipreading for an ALS Patient
Bipasha Sen
Aditya Agarwal
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
LM&MA
6
3
0
02 Nov 2021
VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis
Yongmao Zhang
Jian Cong
Heyang Xue
Lei Xie
Pengcheng Zhu
Mengxiao Bi
19
73
0
17 Oct 2021
Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor
Anchit Gupta
Faizan Farooq Khan
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
CVBM
22
6
0
16 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM
VGen
36
39
0
15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
Rongjie Huang
Chenye Cui
Feiyang Chen
Yi Ren
Jinglin Liu
Zhou Zhao
Baoxing Huai
N. Yuan
GAN
99
62
0
14 Oct 2021
FedSpeech: Federated Text-to-Speech with Continual Learning
Ziyue Jiang
Yi Ren
Ming Lei
Zhou Zhao
FedML
93
24
0
14 Oct 2021
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech
Haoyue Zhan
Xinyuan Yu
Haitong Zhang
Yang Zhang
Yue Lin
16
5
0
14 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
Jen-Hao Rick Chang
A. Shrivastava
H. Koppula
Xiaoshuai Zhang
Oncel Tuzel
DiffM
51
16
0
06 Oct 2021
EdiTTS: Score-based Editing for Controllable Text-to-Speech
Jaesung Tae
Hyeongju Kim
Taesu Kim
DiffM
173
39
0
06 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Yi Ren
Jinglin Liu
Zhou Zhao
42
78
0
30 Sep 2021
AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
Jongyoon Song
Sungwon Kim
Sungroh Yoon
66
37
0
14 Sep 2021