WaveNet: A Generative Model for Raw Audio


12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,038 papers shown
Synthetic Trajectory Generation Through Convolutional Neural Networks
Jesse Merhi
Erik Buchholz
S. Kanhere
23
0
0
24 Jul 2024
Interval Forecasts for Gas Prices in the Face of Structural Breaks -- Statistical Models vs. Neural Networks
Stephan Schlüter
Sven Pappert
Martin Neumann
16
0
0
23 Jul 2024
QueST: Self-Supervised Skill Abstractions for Learning Continuous Control
Atharva Mete
Haotian Xue
Albert Wilcox
Yongxin Chen
Animesh Garg
SSL
32
16
0
22 Jul 2024
Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models
S. Nercessian
Johannes Imort
Ninon Devis
Frederik Blang
36
1
0
22 Jul 2024
DSP-informed bandwidth extension using locally-conditioned excitation and linear time-varying filter subnetworks
S. Nercessian
Alexey Lukin
Johannes Imort
40
0
0
22 Jul 2024
PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers
Songlin Li
Despoina Paschalidou
Leonidas J. Guibas
45
2
0
18 Jul 2024
Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems
Daniel Platnick
Bishoy Abdelnour
Eamon Earl
Rahul Kumar
Zahra Rezaei
Thomas Tsangaris
Faraj Lagum
26
0
0
18 Jul 2024
Temporal receptive field in dynamic graph learning: A comprehensive analysis
Yannis Karmim
Leshanshui Yang
Raphaël Fournier-S'niehotta
Clément Chatelain
Sébastien Adam
Nicolas Thome
30
1
0
17 Jul 2024
Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models
Rining Wu
Feixiang Zhou
Ziwei Yin
Jian K. Liu
33
0
0
15 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serra
DiffM
VGen
44
13
0
15 Jul 2024
Exploring State Space and Reasoning by Elimination in Tsetlin Machines
A. K. Kadhim
Ole-Christoffer Granmo
Lei Jiao
R. Shafik
38
2
0
12 Jul 2024
Autoregressive Speech Synthesis without Vector Quantization
Lingwei Meng
Long Zhou
Shujie Liu
Sanyuan Chen
Bing Han
...
Jinyu Li
Sheng Zhao
Xixin Wu
Helen Meng
Furu Wei
48
30
0
11 Jul 2024
Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion
Yu Cao
Shaogang Gong
DiffM
32
2
0
09 Jul 2024
The Tug-of-War Between Deepfake Generation and Detection
Hannah Lee
Changyeon Lee
Kevin Farhat
Lin Qiu
Steve Geluso
Aerin Kim
O. Etzioni
34
1
0
08 Jul 2024
Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation
J. Lee
Jaehyun Park
Min Jun Choi
Kyogu Lee
32
2
0
07 Jul 2024
SiamTST: A Novel Representation Learning Framework for Enhanced Multivariate Time Series Forecasting applied to Telco Networks
S. Kristoffersen
Peter Skaar Nordby
Sara Malacarne
Massimiliano Ruocco
Pablo Ortiz
AI4TS
20
0
0
02 Jul 2024
FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis
Yinlin Guo
Yening Lv
Jinqiao Dou
Yan Zhang
Yuehai Wang
18
0
0
30 Jun 2024
Subtractive Training for Music Stem Insertion using Latent Diffusion Models
Ivan Villa-Renteria
Mason L. Wang
Zachary Shah
Zhe Li
Soohyun Kim
Neelesh Ramachandran
Mert Pilanci
36
0
0
27 Jun 2024
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability
Hyun Joon Park
Jin Sob Kim
Wooseok Shin
Sung Won Han
DiffM
33
2
0
27 Jun 2024
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
Marco Comunità
Zhi-Wei Zhong
Akira Takahashi
Shiqi Yang
Mengjie Zhao
Koichi Saito
Yukara Ikemiya
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
63
2
0
25 Jun 2024
Listen and Move: Improving GANs Coherency in Agnostic Sound-to-Video Generation
Rafael Redondo
37
0
0
23 Jun 2024
Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data
Yu-Hua Chen
Woosung Choi
Wei-Hsiang Liao
Marco A. Martínez Ramírez
K. Cheuk
Yuki Mitsufuji
J. Jang
Yi-Hsuan Yang
50
5
0
22 Jun 2024
Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models
Paul Henderson
Melonie de Almeida
D. Ivanova
Titas Anciukevicius
3DGS
43
4
0
18 Jun 2024
Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach
Zilin Bian
Jingqin Gao
K. Ozbay
Zhenning Li
22
0
0
18 Jun 2024
TutteNet: Injective 3D Deformations by Composition of 2D Mesh Deformations
Bo Sun
Thibault Groueix
Chen Song
Qixing Huang
Noam Aigerman
30
0
0
17 Jun 2024
MusicScore: A Dataset for Music Score Modeling and Generation
Yuheng Lin
Zheqi Dai
Qiuqiang Kong
VLM
37
2
0
17 Jun 2024
SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field
Yuhang He
Shitong Xu
Jia-Xing Zhong
Sangyun Shin
Niki Trigoni
Andrew Markham
30
0
0
16 Jun 2024
Period Singer: Integrating Periodic and Aperiodic Variational Autoencoders for Natural-Sounding End-to-End Singing Voice Synthesis
Taewoo Kim
Choongsang Cho
Young Han Lee
AI4TS
33
0
0
14 Jun 2024
Toward Fully-End-to-End Listened Speech Decoding from EEG Signals
Jihwan Lee
Aditya Kommineni
Tiantian Feng
Kleanthis Avramidis
Xuan Shi
Sudarsana Reddy Kadiri
Shrikanth Narayanan
31
0
0
12 Jun 2024
Diff-A-Riff: Musical Accompaniment Co-creation via Latent Diffusion Models
J. Nistal
Marco Pasini
Cyran Aouameur
M. Grachten
Stefan Lattner
DiffM
43
16
0
12 Jun 2024
Invariant multiscale neural networks for data-scarce scientific applications
I. Schurov
D. Alforov
M. Katsnelson
A. Bagrov
A. Itin
AI4CE
34
0
0
12 Jun 2024
VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech
Ashishkumar Gudmalwar
Nirmesh Shah
Sai Akarsh
Pankaj Wasnik
R. Shah
32
1
0
12 Jun 2024
Instant 3D Human Avatar Generation using Image Diffusion Models
Nikos Kolotouros
Thiemo Alldieck
Enric Corona
Eduard Gabriel Bazavan
C. Sminchisescu
42
7
0
11 Jun 2024
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang
Dongyoung Kim
Junsu Kim
Jinwoo Shin
Pieter Abbeel
Younggyo Seo
36
2
0
11 Jun 2024
ICGAN: An implicit conditioning method for interpretable feature control of neural audio synthesis
Yunyi Liu
Craig Jin
32
0
0
11 Jun 2024
Data Augmentation for Multivariate Time Series Classification: An Experimental Study
Romain Ilbert
Thai V. Hoang
Zonghua Zhang
30
0
0
10 Jun 2024
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis
Hyunjae Cho
Junhyeok Lee
Wonbin Jung
16
0
0
10 Jun 2024
MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance
Semin Kim
Myeonghun Jeong
Hyeonseung Lee
Minchan Kim
Byoung Jin Choi
Nam Soo Kim
VLM
DiffM
45
1
0
10 Jun 2024
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion
Bingsong Bai
Fengping Wang
Yingming Gao
Ya Li
46
0
0
09 Jun 2024
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance
Shihao Chen
Yu Gu
Jie Zhang
Na Li
Rilin Chen
Liping Chen
Lirong Dai
DiffM
40
6
0
08 Jun 2024
Optimizing Time Series Forecasting Architectures: A Hierarchical Neural Architecture Search Approach
Difan Deng
Marius Lindauer
AI4TS
53
0
0
07 Jun 2024
CLoG: Benchmarking Continual Learning of Image Generation Models
Haotian Zhang
Junting Zhou
Haowei Lin
Hang Ye
Jianhua Zhu
Zihao Wang
Liangcai Gao
Yizhou Wang
Yitao Liang
DiffM
VLM
29
0
0
07 Jun 2024
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Rishit Dagli
Shivesh Prakash
Robert Wu
H. Khosravani
31
3
0
06 Jun 2024
Learning 1D Causal Visual Representation with De-focus Attention Networks
Chenxin Tao
Xizhou Zhu
Shiqian Su
Lewei Lu
Changyao Tian
...
Gao Huang
Hongsheng Li
Yu Qiao
Jie Zhou
Jifeng Dai
70
1
0
06 Jun 2024
Style Mixture of Experts for Expressive Text-To-Speech Synthesis
Ahad Jawaid
Shreeram Suresh Chandra
Junchen Lu
Berrak Sisman
MoE
37
0
0
05 Jun 2024
A Survey of Transformer Enabled Time Series Synthesis
Alexander Sommers
Logan Cummins
Sudip Mittal
Shahram Rahimi
Maria Seale
Joseph Jaboure
Thomas Arnold
AI4TS
37
2
0
04 Jun 2024
An Independence-promoting Loss for Music Generation with Language Models
Jean-Marie Lemercier
Simon Rouard
Jade Copet
Yossi Adi
Alexandre Défossez
20
1
0
04 Jun 2024
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
Hui-Peng Du
Ye-Xin Lu
Yang Ai
Zhen-Hua Ling
35
3
0
04 Jun 2024
Robust Multi-Modal Speech In-Painting: A Sequence-to-Sequence Approach
Mahsa Kadkhodaei Elyaderani
Shahram Shirani
26
0
0
02 Jun 2024
Creative Text-to-Audio Generation via Synthesizer Programming
Manuel Cherep
Nikhil Singh
Jessica Shand
23
3
0
01 Jun 2024