ResearchTrend.AI
arXiv: 1609.03499
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,038 papers shown
TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving
Xuefeng Jiang
Yuan Ma
Pengxiang Li
Leimeng Xu
Xin Wen
Kun Zhan
Zhongpu Xia
Peng Jia
Xianpeng Lang
Sheng Sun
DiffM
13
0
0
14 May 2025
SingNet: Towards a Large-Scale, Diverse, and In-the-Wild Singing Voice Dataset
Yicheng Gu
Chaoren Wang
J. Zhang
Xueyao Zhang
Zihao Fang
Haorui He
Zhizheng Wu
18
2
0
14 May 2025
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder
Bowen Zhang
Congchao Guo
Geng Yang
Hang Yu
H. M. Zhang
...
Yichen Xiao
Yiying Zhou
Y. Zhang
Yuan Lu
Yucen He
26
0
0
12 May 2025
Physics-informed Multiple-Input Operators for efficient dynamic response prediction of structures
Bilal Ahmed
Yuqing Qiu
Diab W. Abueidda
Waleed El-Sekelly
Tarek Abdoun
M. Mobasher
AI4CE
31
0
0
11 May 2025
Beyond Identity: A Generalizable Approach for Deepfake Audio Detection
Yasaman Ahmadiadli
Xiao-Ping Zhang
Naimul Khan
26
0
0
10 May 2025
Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering
Yiming Niu
Jinliang Deng
L. Zhang
Zimu Zhou
Yongxin Tong
AI4TS
26
0
0
09 May 2025
Aliasing Reduction in Neural Amp Modeling by Smoothing Activations
Ryota Sato
Julius O. Smith III
40
0
0
07 May 2025
Recognizing Ornaments in Vocal Indian Art Music with Active Annotation
Sumit Kumar
Parampreet Singh
Vipul Arora
31
0
0
07 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
39
0
0
01 May 2025
Do global forecasting models require frequent retraining?
Marco Zanotti
37
0
0
01 May 2025
Temporal Attention Evolutional Graph Convolutional Network for Multivariate Time Series Forecasting
Xinlong Zhao
L. Zhang
Tianbo Zou
Yan Zhang
AI4TS
26
0
0
01 May 2025
Versatile Framework for Song Generation with Prompt-based Control
Y. Zhang
Wenxiang Guo
Changhao Pan
Z. Zhu
Ruiqi Li
...
Rongjie Huang
Ruiyuan Zhang
Zhiqing Hong
Ziyue Jiang
Zhou Zhao
77
1
0
27 Apr 2025
Auto-FEDUS: Autoregressive Generative Modeling of Doppler Ultrasound Signals from Fetal Electrocardiograms
Alireza Rafiei
Gari D. Clifford
N. Katebi
31
0
0
17 Apr 2025
Generation of Musical Timbres using a Text-Guided Diffusion Model
Weixuan Yuan
Qadeer Khan
Vladimir Golkov
DiffM
26
0
0
12 Apr 2025
AMNet: An Acoustic Model Network for Enhanced Mandarin Speech Synthesis
Yubing Cao
Yinfeng Yu
Yongming Li
Liejun Wang
19
0
0
12 Apr 2025
Forecasting Cryptocurrency Prices using Contextual ES-adRNN with Exogenous Variables
Slawek Smyl
Grzegorz Dudek
Paweł Pełka
AI4TS
21
1
0
11 Apr 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
24
0
0
08 Apr 2025
TAPNext: Tracking Any Point (TAP) as Next Token Prediction
Artem Zholus
Carl Doersch
Yi Yang
Skanda Koppula
Viorica Patraucean
Xu He
Ignacio Rocco
Mehdi S. M. Sajjadi
Sarath Chandar
Ross Goroshin
30
0
0
08 Apr 2025
P2Mark: Plug-and-play Parameter-level Watermarking for Neural Speech Generation
Yong Ren
Jiangyan Yi
Tao Wang
J. Tao
Zhengqi Wen
Chenxing Li
Z. Lian
Ruibo Fu
Ye Bai
Xiaohui Zhang
51
0
0
07 Apr 2025
SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation
Stephen Brade
Sam Anderson
Rithesh Kumar
Zeyu Jin
Anh Truong
36
0
0
07 Apr 2025
Diff-SSL-G-Comp: Towards a Large-Scale and Diverse Dataset for Virtual Analog Modeling
Yicheng Gu
Runsong Zhang
Lauri Juvela
Z. Wu
DiffM
124
0
0
06 Apr 2025
Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics
Jungpil Shin
Abu Saleh Musa Miah
Sota Konnai
Shu Hoshitaka
Pankoo Kim
29
0
0
04 Apr 2025
LiDAR-based Object Detection with Real-time Voice Specifications
Anurag Kulkarni
24
0
0
03 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao W. Wang
Songruoyao Wu
Jiaxing Yu
K. Zhang
MGen
VGen
70
1
0
01 Apr 2025
HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO
Giovanni Cioffi
L. Bauersfeld
Davide Scaramuzza
46
0
0
01 Apr 2025
Style Quantization for Data-Efficient GAN Training
Jian Wang
Xin Lan
Jizhe Zhou
Yuxin Tian
Jiancheng Lv
34
0
0
31 Mar 2025
Make Some Noise: Towards LLM audio reasoning and generation using sound tokens
Shivam Mehta
Nebojsa Jojic
Hannes Gamper
31
0
0
28 Mar 2025
From Deep Learning to LLMs: A survey of AI in Quantitative Investment
Bokai Cao
Saizhuo Wang
Xinyi Lin
Xiaojun Wu
Haohan Zhang
L. Ni
Jian Guo
AIFin
52
0
0
27 Mar 2025
Tune It Up: Music Genre Transfer and Prediction
Fidan Samet
Oguz Bakir
Adnan Fidan
27
0
0
27 Mar 2025
Debiasing Kernel-Based Generative Models
Tian Qin
Wei-Min Huang
48
0
0
26 Mar 2025
ReverBERT: A State Space Model for Efficient Text-Driven Speech Style Transfer
Michael Brown
Sofia Martinez
Priya Singh
45
0
0
26 Mar 2025
An Empirical Study of the Impact of Federated Learning on Machine Learning Model Accuracy
Haotian Yang
Z. Wang
Benson Chou
Sophie Xu
Hao Wang
Jingxian Wang
Qizhen Zhang
FedML
90
0
0
26 Mar 2025
BADGR: Bundle Adjustment Diffusion Conditioned by GRadients for Wide-Baseline Floor Plan Reconstruction
Yuguang Li
Ivaylo Boyadzhiev
Zixuan Liu
Linda Shapiro
Alex Colburn
DiffM
3DV
65
0
0
25 Mar 2025
SparSamp: Efficient Provably Secure Steganography Based on Sparse Sampling
Yaofei Wang
Gang Pei
Kejiang Chen
Jinyang Ding
Chao Pan
Weilong Pang
Donghui Hu
W. Zhang
49
1
0
25 Mar 2025
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
Tianze Luo
Xingchen Miao
Wenbo Duan
DiffM
37
0
0
20 Mar 2025
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
Zhedong Zhang
Liang-Sheng Li
C. Yan
Chunshan Liu
A. Hengel
Yuankai Qi
83
2
0
15 Mar 2025
Designing Neural Synthesizers for Low-Latency Interaction
Franco Caspe
Jordie Shier
Mark Sandler
C. Saitis
Andrew Mcpherson
133
0
0
14 Mar 2025
Exploring Performance-Complexity Trade-Offs in Sound Event Detection
T. Morocutti
Florian Schmid
Jonathan Greif
Francesco Foscarin
Gerhard Widmer
38
0
0
14 Mar 2025
Chat-TS: Enhancing Multi-Modal Reasoning Over Time-Series and Natural Language Data
Paul Quinlan
Qingguo Li
Xiaodan Zhu
AI4TS
LRM
59
0
0
13 Mar 2025
Mamba-VA: A Mamba-based Approach for Continuous Emotion Recognition in Valence-Arousal Space
Yuheng Liang
Z. Wang
Feng Liu
Mingzhou Liu
Yu Yao
Mamba
60
1
0
13 Mar 2025
Learning Control of Neural Sound Effects Synthesis from Physically Inspired Models
Yisu Zong
Joshua Reiss
51
0
0
13 Mar 2025
Probabilistic Forecasting via Autoregressive Flow Matching
Ahmed El-Gazzar
Marcel van Gerven
AI4TS
43
0
0
13 Mar 2025
Multilevel Generative Samplers for Investigating Critical Phenomena
Ankur Singha
E. Cellini
K. Nicoli
K. Jansen
Stefan Kühn
Shinichi Nakajima
62
1
0
11 Mar 2025
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
59
0
0
11 Mar 2025
Generalized Interpolating Discrete Diffusion
Dimitri von Rutte
J. Fluri
Yuhui Ding
Antonio Orvieto
Bernhard Scholkopf
Thomas Hofmann
DiffM
62
0
0
06 Mar 2025
An Optimization Algorithm for Multimodal Data Alignment
Wei Zhang
X. Wang
Lan Yu
S. Li
49
0
0
05 Mar 2025
HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation
Hongye Cheng
Tianyu Wang
Guangsi Shi
Zexing Zhao
Yanwei Fu
SLR
45
1
0
03 Mar 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
58
0
0
03 Mar 2025
Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios
Mohammad Rafid Ul Islam
Prasad Tadepalli
Alan Fern
33
0
0
03 Mar 2025
Language-agnostic, automated assessment of listeners' speech recall using large language models
Björn Herrmann
24
0
0
02 Mar 2025