ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.01083
  4. Cited By
MelNet: A Generative Model for Audio in the Frequency Domain

MelNet: A Generative Model for Audio in the Frequency Domain

4 June 2019
Sean Vasquez
M. Lewis
    DiffM
ArXivPDFHTML

Papers citing "MelNet: A Generative Model for Audio in the Frequency Domain"

27 / 27 papers shown
Title
LoopGen: Training-Free Loopable Music Generation
LoopGen: Training-Free Loopable Music Generation
Davide Marincione
Giorgio Strano
Donato Crisostomi
Roberto Ribuoli
Emanuele Rodolà
MGen
60
0
0
06 Apr 2025
Learn to Sing by Listening: Building Controllable Virtual Singer by
  Unsupervised Learning from Voice Recordings
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
29
1
0
09 May 2023
Msanii: High Fidelity Music Synthesis on a Shoestring Budget
Msanii: High Fidelity Music Synthesis on a Shoestring Budget
Kinyugo Maina
19
5
0
16 Jan 2023
High Quality Audio Coding with MDCTNet
High Quality Audio Coding with MDCTNet
G. Davidson
M. Vinton
P. Ekstrand
Cong Zhou
Lars Villemoes
Lie Lu
MedIm
18
8
0
08 Dec 2022
GANStrument: Adversarial Instrument Sound Synthesis with Pitch-invariant
  Instance Conditioning
GANStrument: Adversarial Instrument Sound Synthesis with Pitch-invariant Instance Conditioning
Gaku Narita
Junichi Shimizu
Taketo Akama
GAN
21
11
0
10 Nov 2022
A Survey on Artificial Intelligence for Music Generation: Agents,
  Domains and Perspectives
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives
Carlos Hernandez-Olivan
Javier Hernandez-Olivan
J. R. Beltrán
MGen
40
6
0
25 Oct 2022
Controllable Data Generation by Deep Learning: A Review
Controllable Data Generation by Deep Learning: A Review
Shiyu Wang
Yuanqi Du
Xiaojie Guo
Bo Pan
Zhaohui Qin
Liang Zhao
31
28
0
19 Jul 2022
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
Kyle Kastner
Aaron Courville
32
0
0
30 Jun 2022
cMelGAN: An Efficient Conditional Generative Model Based on Mel
  Spectrograms
cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms
Tracy Qian
Jackson Kaunismaa
Tony Chung
MGen
GAN
MedIm
19
5
0
15 May 2022
Dual Learning Music Composition and Dance Choreography
Dual Learning Music Composition and Dance Choreography
Shuang Wu
Zhenguang Liu
Shijian Lu
Li Cheng
16
8
0
28 Jan 2022
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey
Zahra Khanjani
Gabrielle Watson
V. P Janeja
25
25
0
28 Nov 2021
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End
  Speech Synthesis
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis
Alexandra Vioni
Myrsini Christidou
Nikolaos Ellinas
G. Vamvoukakis
Panos Kakoulidis
Taehoon Kim
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
11
11
0
19 Nov 2021
Controllable deep melody generation via hierarchical music structure
  representation
Controllable deep melody generation via hierarchical music structure representation
Shuqi Dai
Zeyu Jin
Celso Gomes
Roger B. Dannenberg
MGen
22
51
0
02 Sep 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio
  Synthesis with GANs
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs
J. Nistal
Stefan Lattner
G. Richard
21
8
0
03 Aug 2021
Musical Speech: A Transformer-based Composition Tool
Musical Speech: A Transformer-based Composition Tool
Jason dÉon
Sri Harsha Dumpala
Chandramouli Shama Sastry
Daniel Oore
Sageev Oore
18
1
0
02 Aug 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based
  Synchronous Sound Generation in Silent Videos
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos
Sanchita Ghose
John J. Prevost
GAN
19
26
0
20 Jul 2021
A Survey on Neural Speech Synthesis
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
21
24
0
20 Apr 2021
DiffWave: A Versatile Diffusion Model for Audio Synthesis
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
34
1,389
0
21 Sep 2020
Conditional Image Generation with One-Vs-All Classifier
Conditional Image Generation with One-Vs-All Classifier
Xiangrui Xu
Yaqin Li
Cao Yuan
VLM
GAN
25
12
0
18 Sep 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Da-Yi Wu
Yi-Hsuan Yang
GAN
6
8
0
28 May 2020
Unconditional Audio Generation with Generative Adversarial Networks and
  Cycle Regularization
Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization
Jen-Yu Liu
Yu-Hua Chen
Yin-Cheng Yeh
Yi-Hsuan Yang
GAN
32
35
0
18 May 2020
A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News
  Anchors
A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors
Ruobing Zheng
Zhou Zhu
Bo Song
Changjiang Ji
3DH
19
2
0
20 Feb 2020
Phase reconstruction based on recurrent phase unwrapping with deep
  neural networks
Phase reconstruction based on recurrent phase unwrapping with deep neural networks
Yoshiki Masuyama
Kohei Yatabe
Yuma Koizumi
Yasuhiro Oikawa
N. Harada
13
21
0
14 Feb 2020
High Fidelity Speech Synthesis with Adversarial Networks
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
223
239
0
25 Sep 2019
VoxCeleb2: Deep Speaker Recognition
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
224
2,234
0
14 Jun 2018
Pixel Recurrent Neural Networks
Pixel Recurrent Neural Networks
Aaron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
SSeg
GAN
233
2,547
0
25 Jan 2016
1