arXiv:2006.06873
FastPitch: Parallel Text-to-speech with Pitch Prediction
11 June 2020
Adrian Łańcucki
Papers citing
"FastPitch: Parallel Text-to-speech with Pitch Prediction"
50 / 173 papers shown
Title
OverFlow: Putting flows on top of neural transducers for better TTS
Shivam Mehta
Ambika Kirkland
Harm Lameris
Jonas Beskow
Éva Székely
G. Henter
AI4TS
26
12
0
13 Nov 2022
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers
Cheng-Ping Hsieh
Subhankar Ghosh
Boris Ginsburg
41
18
0
01 Nov 2022
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation
Nobuyuki Morioka
Heiga Zen
Nanxin Chen
Yu Zhang
Yifan Ding
29
16
0
28 Oct 2022
Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
25
1
0
25 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
30
22
0
21 Oct 2022
Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion
Yuta Matsunaga
Takaaki Saeki
Shinnosuke Takamichi
Hiroshi Saruwatari
9
1
0
18 Oct 2022
Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario
Emily R. Bartusiak
Edward J. Delp
19
12
0
14 Oct 2022
Controllable Accented Text-to-Speech Synthesis
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
21
6
0
22 Sep 2022
Detecting Synthetic Speech Manipulation in Real Audio Recordings
M. Rahman
M. Graciarena
Diego Castán
Chris Cobo-Kroenke
Mitchell McLaren
A. Lawson
AAML
17
9
0
15 Sep 2022
Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features
Jun Xue
Cunhang Fan
Zhao Lv
J. Tao
Jiangyan Yi
C. Zheng
Zhengqi Wen
Minmin Yuan
S. Shao
15
30
0
02 Aug 2022
Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation
Giulia Comini
Goeric Huybrechts
M. Ribeiro
Adam Gabryś
Jaime Lorenzo-Trueba
19
5
0
29 Jul 2022
PoeticTTS -- Controllable Poetry Reading for Literary Studies
Julia Koch
Florian Lux
Nadja Schauffler
T. Bernhart
Felix Dieterle
Jonas Kuhn
Sandra Richter
Gabriel Viehhauser
Ngoc Thang Vu
20
5
0
11 Jul 2022
Speaker Anonymization with Phonetic Intermediate Representations
Sarina Meyer
Florian Lux
Pavel Denisov
Julia Koch
Pascal Tilli
Ngoc Thang Vu
21
27
0
11 Jul 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre
Guangyan Zhang
Ying Qin
W. Zhang
Jialun Wu
Mei Li
Yu Gai
Feijun Jiang
Tan Lee
48
26
0
29 Jun 2022
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion
Dacheng Yin
Chuanxin Tang
Yanqing Liu
Xiaoqiang Wang
Zhiyuan Zhao
Yucheng Zhao
Zhiwei Xiong
Sheng Zhao
Chong Luo
16
12
0
28 Jun 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Florian Lux
Julia Koch
Ngoc Thang Vu
32
19
0
24 Jun 2022
Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis
Tae-Woo Kim
Minguk Kang
Gyeong-Hoon Lee
AAML
9
6
0
23 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura
Yuki Saito
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
AI4TS
11
7
0
16 Jun 2022
FlexLip: A Controllable Text-to-Lip System
Dan Oneaţă
Beáta Lőrincz
Adriana Stan
H. Cucu
14
3
0
07 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
33
38
0
30 May 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
29
24
0
20 May 2022
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech
Y. Li
Cheng Yu
Guangzhi Sun
Hua Jiang
Fanglei Sun
Weiqin Zu
Ying Wen
Yang Yang
Jun Wang
9
7
0
09 May 2022
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
Yisheng Xiao
Lijun Wu
Junliang Guo
Juntao Li
M. Zhang
Tao Qin
Tie-Yan Liu
3DV
MedIm
AI4CE
30
82
0
20 Apr 2022
Enhancement of Pitch Controllability using Timbre-Preserving Pitch Augmentation in FastPitch
Hanbin Bae
Young-Sun Joo
14
2
0
12 Apr 2022
Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech
Jaesung Bae
Jinhyeok Yang
Taejun Bak
Young-Sun Joo
DiffM
11
6
0
08 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
18
0
0
08 Apr 2022
Heterogeneous Target Speech Separation
Hyunjae Cho
Wonbin Jung
Junhyeok Lee
Paris Smaragdis
Sanghyun Woo
46
26
0
07 Apr 2022
Text-To-Speech Data Augmentation for Low Resource Speech Recognition
Rodolfo Zevallos
19
4
0
01 Apr 2022
Data-augmented cross-lingual synthesis in a teacher-student framework
M. D. Korte
Jaebok Kim
A. Kunikoshi
Adaeze Adigwe
E. Klabbers
16
0
0
31 Mar 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
10
30
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
9
51
0
31 Mar 2022
Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Sunghwan Ahn
Joun Yeop Lee
N. Kim
31
25
0
29 Mar 2022
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent
Yuki Saito
Yuto Nishimura
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
11
12
0
28 Mar 2022
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Zhiyong Wu
Shiyin Kang
H. Meng
18
12
0
23 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
15
29
0
07 Mar 2022
Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows
Kevin J. Shih
Rafael Valle
Rohan Badlani
J. F. Santos
Bryan Catanzaro
20
4
0
03 Mar 2022
Revisiting Over-Smoothness in Text to Speech
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
65
61
0
26 Feb 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
26
13
0
04 Jan 2022
VRAIN-UPV MLLP's system for the Blizzard Challenge 2021
A. P. D. Martos
A. Sanchís
Alfons Juan-Císcar
6
6
0
29 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
42
60
0
15 Oct 2021
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech
Haoyue Zhan
Xinyuan Yu
Haitong Zhang
Yang Zhang
Yue Lin
16
5
0
14 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning
Paarth Neekhara
Jason Chun Lok Li
Boris Ginsburg
22
15
0
12 Oct 2021
EdiTTS: Score-based Editing for Controllable Text-to-Speech
Jaesung Tae
Hyeongju Kim
Taesu Kim
DiffM
171
39
0
06 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Yi Ren
Jinglin Liu
Zhou Zhao
42
78
0
30 Sep 2021
Text-Free Prosody-Aware Generative Spoken Language Modeling
Eugene Kharitonov
Ann Lee
Adam Polyak
Yossi Adi
Jade Copet
...
Tu Nguyen
M. Rivière
Abdel-rahman Mohamed
Emmanuel Dupoux
Wei-Ning Hsu
30
116
0
07 Sep 2021
One TTS Alignment To Rule Them All
Rohan Badlani
A. Łańcucki
Kevin J. Shih
Rafael Valle
Wei Ping
Bryan Catanzaro
19
82
0
23 Aug 2021
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis
Julian Zaïdi
Hugo Seuté
Benjamin van Niekerk
M. Carbonneau
18
20
0
04 Aug 2021
Creation and Detection of German Voice Deepfakes
Vanessa Barnekow
Dominik Binder
Niclas Kromrey
Pascal Munaretto
A. Schaad
Felix Schmieder
16
1
0
02 Aug 2021
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI
Joanna Rownicka
Kilian Sprenkamp
A. Tripiana
Volodymyr Gromoglasov
Timo P. Kunz
19
0
0
21 Jul 2021
Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm
Elijah Gutierrez
Pilar Oplustil Gallegos
Catherine Lai
13
3
0
06 Jul 2021