1910.06711
Cited By
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
8 October 2019
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
Papers citing "MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis"
50 / 226 papers shown
Title
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Dan Su
24
5
0
01 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
38
24
0
20 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
37
8
0
19 May 2022
Macedonian Speech Synthesis for Assistive Technology Applications
B. Sofronievski
Elena Velovska
Martin Velichkovski
Violeta Argirova
Tea Veljkovikj
...
Kristijan Lazarev
Toni Bachvarovski
Z. Ivanovski
Dimitar Tashkovski
B. Gerazov
16
0
0
18 May 2022
cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms
Tracy Qian
Jackson Kaunismaa
Tony Chung
MGen
GAN
MedIm
21
5
0
15 May 2022
The ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts
Alice Baird
Panagiotis Tzirakis
Gauthier Gidel
Marco Jiralerspong
Eilif B. Muller
Kory W. Mathewson
Björn Schuller
Min Zhang
D. Keltner
Alan S. Cowen
VLM
36
30
0
03 May 2022
How does a spontaneously speaking conversational agent affect user behavior?
Takahisa Iizuka
H. Mori
13
2
0
02 May 2022
Time Domain Adversarial Voice Conversion for ADD 2022
Cheng Wen
Tingwei Guo
Xi Tan
Rui Yan
Shuran Zhou
Chuandong Xie
Wei Zou
Xiangang Li
23
4
0
19 Apr 2022
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration
Haohe Liu
Xubo Liu
Qiuqiang Kong
Qiao Tian
Yan Zhao
DeLiang Wang
Chuanzeng Huang
Yuxuan Wang
21
51
0
12 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
30
0
0
08 Apr 2022
FFC-SE: Fast Fourier Convolution for Speech Enhancement
Ivan Shchekotov
Pavel Andreev
Oleg Ivanov
Aibek Alanov
Dmitry Vetrov
40
23
0
06 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
20
32
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
30
51
0
31 Mar 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
43
45
0
31 Mar 2022
Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge
Sangjun Park
Kihyun Choo
Joohyung Lee
A. Porov
Konstantin Osipov
June Sig Sung
24
6
0
27 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
DiffM
41
92
0
25 Mar 2022
HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement
Pavel Andreev
Aibek Alanov
Oleg Ivanov
Dmitry Vetrov
38
38
0
24 Mar 2022
Practical cognitive speech compression
Reza Lotfidereshgi
P. Gournay
35
2
0
08 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
33
29
0
07 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
34
4
0
01 Mar 2022
SpeechPainter: Text-conditioned Speech Inpainting
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
16
26
0
15 Feb 2022
Unsupervised word-level prosody tagging for controllable speech synthesis
Yiwei Guo
Chenpeng Du
Kai Yu
26
15
0
15 Feb 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
21
56
0
14 Feb 2022
Deep Performer: Score-to-Audio Music Performance Synthesis
Hao-Wen Dong
Cong Zhou
Taylor Berg-Kirkpatrick
Julian McAuley
27
17
0
12 Feb 2022
Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality
Daniel Watson
William Chan
Jonathan Ho
Mohammad Norouzi
DiffM
BDL
43
179
0
11 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
33
28
0
08 Feb 2022
PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech
Srikanth Korse
N. Pia
Kishan Gupta
Guillaume Fuchs
57
14
0
31 Jan 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Songxiang Liu
Dan Su
Dong Yu
DiffM
75
65
0
28 Jan 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals
Haohan Guo
Zhiping Zhou
Fanbo Meng
Kai-Chun Liu
59
16
0
25 Jan 2022
A sinusoidal signal reconstruction method for the inversion of the mel-spectrogram
Anastasia Natsiou
Seán O'Leary
25
3
0
07 Jan 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
44
13
0
04 Jan 2022
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
36
100
0
20 Dec 2021
Soundify: Matching Sound Effects to Video
David Chuan-En Lin
Anastasis Germanidis
Cristobal Valenzuela
Yining Shi
Nikolas Martelaro
30
16
0
17 Dec 2021
VocBench: A Neural Vocoder Benchmark for Speech Synthesis
Ehab A. AlBadawy
Andrew Gibiansky
Qing He
Jilong Wu
Ming-Ching Chang
Siwei Lyu
27
12
0
06 Dec 2021
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey
Zahra Khanjani
Gabrielle Watson
V. P. Janeja
25
25
0
28 Nov 2021
V2C: Visual Voice Cloning
Qi Chen
Yuanqing Li
Yuankai Qi
Jiaqiu Zhou
Mingkui Tan
Qi Wu
VGen
33
24
0
25 Nov 2021
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang
Chyi-Jiunn Lin
Da-Rong Liu
Yi-Chen Chen
Hung-yi Lee
22
56
0
07 Nov 2021
Emotional Prosody Control for Speech Generation
S. Sivaprasad
Saiteja Kosgi
Vineet Gandhi
12
17
0
07 Nov 2021
SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines
Haozhe Zhang
Zexin Cai
Xiaoyi Qin
Ming Li
54
15
0
06 Nov 2021
Hybrid Spectrogram and Waveform Source Separation
Alexandre Défossez
24
162
0
05 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schönherr
DiffM
132
125
0
04 Nov 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
32
122
0
17 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM
VGen
36
39
0
15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
55
60
0
15 Oct 2021
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data
Haitong Zhang
Yue Lin
26
0
0
14 Oct 2021
Discovery of Single Independent Latent Variable
Uri Shaham
Jonathan Svirsky
Ori Katz
Ronen Talmon
CML
30
2
0
12 Oct 2021
Source Mixing and Separation Robust Audio Steganography
Naoya Takahashi
M. Singh
Yuki Mitsufuji
34
6
0
11 Oct 2021
KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms
Chien-Feng Liao
Jen-Yu Liu
Yi-Hsuan Yang
27
5
0
08 Oct 2021
Cloning one's voice using very limited data in the wild
Dongyang Dai
Yuan-Jui Chen
Li Chen
Ming Tu
Lu Liu
Rui Xia
Qiao Tian
Yuping Wang
Yuxuan Wang
SyDa
33
9
0
07 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet
Axel Roebel
F. Bous
29
2
0
07 Oct 2021