arXiv:1711.10433
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
28 November 2017
Aaron van den Oord
Yazhe Li
Igor Babuschkin
Karen Simonyan
Oriol Vinyals
Koray Kavukcuoglu
George van den Driessche
Edward Lockhart
Luis C. Cobo
Florian Stimberg
Norman Casagrande
Dominik Grewe
Seb Noury
Sander Dieleman
Erich Elsen
Nal Kalchbrenner
Heiga Zen
Alex Graves
Helen King
T. Walters
Dan Belov
Demis Hassabis
Papers citing
"Parallel WaveNet: Fast High-Fidelity Speech Synthesis"
50 / 143 papers shown
Title
DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis
Zeeshan Ahmad
Shudi Bao
Meng Chen
15
0
0
14 May 2025
Differentially Private Parameter-Efficient Fine-tuning for Large ASR Models
Hongbin Liu
Lun Wang
Om Thakkar
Abhradeep Thakurta
Arun Narayanan
26
0
0
02 Oct 2024
MVIP-NeRF: Multi-view 3D Inpainting on NeRF Scenes via Diffusion Prior
Honghua Chen
Chen Change Loy
Xingang Pan
39
13
0
05 May 2024
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis
Shivam Mehta
Anna Deichler
Jim O'Regan
Birger Moëll
Jonas Beskow
G. Henter
Simon Alexanderson
38
4
0
30 Apr 2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
Keiichi Tokuda
DiffM
27
2
0
22 Feb 2024
Creating New Voices using Normalizing Flows
Piotr Bilinski
Thomas Merritt
Abdelhamid Ezzerg
Kamil Pokora
Sebastian Cygert
K. Yanagisawa
Roberto Barra-Chicote
Daniel Korzekwa
18
17
0
22 Dec 2023
SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation
Zhiyu Qu
Tao Xiang
Yi-Zhe Song
DiffM
34
11
0
27 Aug 2023
A Deep Active Contour Model for Delineating Glacier Calving Fronts
Konrad Heidler
Lichao Mou
Erik Loebel
M. Scheinert
Sébastien Lefèvre
Xiao Xiang Zhu
29
7
0
07 Jul 2023
Collaborative Score Distillation for Consistent Visual Synthesis
Subin Kim
Kyungmin Lee
June Suk Choi
Jongheon Jeong
Kihyuk Sohn
Jinwoo Shin
DiffM
24
21
0
04 Jul 2023
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading
Yujia Xiao
Shaofei Zhang
Xi Wang
Xuejiao Tan
Lei He
Sheng Zhao
Frank Soong
Tan Lee
17
5
0
03 Jul 2023
NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders
Livio Baldini Soares
D. Gillick
Jeremy R. Cole
Tom Kwiatkowski
24
1
0
23 May 2023
Inductive Simulation of Calorimeter Showers with Normalizing Flows
M. Buckley
Claudius Krause
Ian Pang
David Shih
AI4CE
11
22
0
19 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
13
13
0
13 May 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
19
1
0
09 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
12
1
0
26 Apr 2023
A Comprehensive Survey on Knowledge Distillation of Diffusion Models
Weijian Luo
DiffM
MedIm
52
33
0
09 Apr 2023
Deep Learning for Inertial Positioning: A Survey
Changhao Chen
Xianfei Pan
16
47
0
07 Mar 2023
Speaker-Aware Anti-Spoofing
Xuechen Liu
Md. Sahidullah
Kong Aik Lee
Tomi Kinnunen
19
3
0
02 Mar 2023
Hypernetworks build Implicit Neural Representations of Sounds
Filip Szatkowski
Karol J. Piczak
Przemysław Spurek
Jacek Tabor
Tomasz Trzciński
22
11
0
09 Feb 2023
HyperNeRFGAN: Hypernetwork approach to 3D NeRF GAN
Adam Kania
Artur Kasymov
Maciej Zięba
P. Spurek
30
9
0
27 Jan 2023
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
Ze Chen
Yihan Wu
Yichong Leng
Jiawei Chen
Haohe Liu
...
Ke Wang
Lei He
Sheng Zhao
Jiang Bian
Danilo P. Mandic
DiffM
22
22
0
30 Dec 2022
Extreme Audio Time Stretching Using Neural Synthesis
Leonardo Fierro
Alec Wright
Vesa Välimäki
Matti Hämäläinen
14
1
0
30 Nov 2022
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Ajay Jain
Amber Xie
Pieter Abbeel
DiffM
27
89
0
21 Nov 2022
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
J. Webber
Cassia Valentini-Botinhao
Evelyn Williams
G. Henter
Simon King
11
9
0
13 Nov 2022
HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks
Filip Szatkowski
Karol J. Piczak
P. Spurek
Jacek Tabor
Tomasz Trzciński
23
12
0
03 Nov 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
15
53
0
06 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
42
29
0
03 Oct 2022
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole
Ajay Jain
Jonathan T. Barron
B. Mildenhall
47
2,307
0
29 Sep 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Đorđe Miladinović
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
21
5
0
26 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
28
566
0
07 Sep 2022
AdaCat: Adaptive Categorical Discretization for Autoregressive Models
Qiyang Li
Ajay Jain
Pieter Abbeel
OffRL
39
4
0
03 Aug 2022
Auto-regressive Image Synthesis with Integrated Quantization
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Kai Cui
Changgong Zhang
Shijian Lu
22
10
0
21 Jul 2022
Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai
L. A. Khaliq
Timon Ulrich
CVBM
60
0
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
23
27
0
27 Jun 2022
Short-Term Density Forecasting of Low-Voltage Load using Bernstein-Polynomial Normalizing Flows
M. Arpogaus
Marcus Voss
Beate Sick
Mark Nigge-Uricher
Oliver Dürr
23
15
0
29 Apr 2022
Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge
Sangjun Park
Kihyun Choo
Joohyung Lee
A. Porov
Konstantin Osipov
June Sig Sung
9
6
0
27 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
J. Wang
Dan Su
Dong Yu
DiffM
29
92
0
25 Mar 2022
Improve few-shot voice cloning using multi-modal learning
Haitong Zhang
Yue Lin
13
8
0
18 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
13
4
0
01 Mar 2022
Benchmarking Generative Latent Variable Models for Speech
Jakob Drachmann Havtorn
Lasse Borgholt
Søren Hauberg
J. Frellsen
Lars Maaløe
18
3
0
22 Feb 2022
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Adam Gabryś
Goeric Huybrechts
M. Ribeiro
C. Chien
Julian Roth
Giulia Comini
Roberto Barra-Chicote
Bartek Perz
Jaime Lorenzo-Trueba
25
21
0
16 Feb 2022
Distribution augmentation for low-resource expressive text-to-speech
Mateusz Lajszczak
Animesh Prasad
Arent van Korlaar
Bajibabu Bollepalli
A. Bonafonte
...
M. Nicolis
Alexis Moinet
Thomas Drugman
Trevor Wood
Elena Sokolova
19
7
0
13 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo P. Mandic
Lei He
Sheng Zhao
DiffM
18
28
0
08 Feb 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer
Xiaochun An
Frank Soong
Lei Xie
54
18
0
24 Jan 2022
Short Range Correlation Transformer for Occluded Person Re-Identification
Yunbin Zhao
Song-Chun Zhu
Dongsheng Wang
Zhiwei Liang
ViT
13
21
0
04 Jan 2022
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
25
98
0
20 Dec 2021
Perceptual Loss with Recognition Model for Single-Channel Enhancement and Robust ASR
Peter William VanHarn Plantinga
Deblin Bagchi
Eric Fosler-Lussier
44
10
0
11 Dec 2021
Resampling Base Distributions of Normalizing Flows
Vincent Stimper
Bernhard Schölkopf
José Miguel Hernández-Lobato
BDL
22
32
0
29 Oct 2021
CaloFlow II: Even Faster and Still Accurate Generation of Calorimeter Showers with Normalizing Flows
Claudius Krause
David Shih
31
64
0
21 Oct 2021
PixelPyramids: Exact Inference Models from Lossless Image Pyramids
Shweta Mahajan
Stefan Roth
TPM
10
2
0
17 Oct 2021