WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,038 papers shown

Title
Wavehax: Aliasing-Free Neural Waveform Synthesis Based on 2D Convolution and Harmonic Prior for Reliable Complex Spectrogram Estimation Reo Yoneyama Atsushi Miyashita Ryuichi Yamamoto T. Toda 27 1 0 11 Nov 2024
Few-Shot Task Learning through Inverse Generative Modeling Aviv Netanyahu Yilun Du Antonia Bronars Jyothish Pari J. Tenenbaum Tianmin Shu Pulkit Agrawal 46 1 0 07 Nov 2024
Multivariate Data Augmentation for Predictive Maintenance using Diffusion Andrew Thompson Alexander Sommers Alicia Russell-Gilbert Logan Cummins Sudip Mittal Shahram Rahimi Maria Seale Joseph Jaboure Thomas Arnold Joshua Church DiffM 37 0 0 06 Nov 2024
Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis Mohammad Zbeeb Mohammad Ghorayeb Mariam Salman 31 0 0 04 Nov 2024
CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality Yiqin Zhao Mallesham Dasari Tian Guo 48 0 0 04 Nov 2024
Sing-On-Your-Beat: Simple Text-Controllable Accompaniment Generations Quoc-Huy Trinh Minh-Van Nguyen Trong-Hieu Nguyen-Mau Khoa Tran Thanh Do 33 0 0 03 Nov 2024
Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis Shijia Liao Y. Wang Tianyu Li Yifan Cheng Ruoyi Zhang Rongzhi Zhou Yijin Xing AuLLM 35 10 0 02 Nov 2024
Stochastic Reconstruction of Gappy Lagrangian Turbulent Signals by Conditional Diffusion Models Tianyi Li Luca Biferale F. Bonaccorso M. Buzzicotti Luca Centurioni DiffM 43 3 0 31 Oct 2024
RDSinger: Reference-based Diffusion Network for Singing Voice Synthesis Kehan Sui Jinxu Xiang Fang Jin DiffM 22 0 0 29 Oct 2024
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models Weijian Luo C. Zhang Debing Zhang Zhengyang Geng 28 3 0 28 Oct 2024
Scaling-based Data Augmentation for Generative Models and its Theoretical Extension Yoshitaka Koike Takumi Nakagawa Hiroki Waida Takafumi Kanamori DiffM 30 0 0 28 Oct 2024
Meta-Learning Approaches for Improving Detection of Unseen Speech Deepfakes Ivan Kukanov Janne Laakkonen Tomi Kinnunen Ville Hautamaki AAML 31 0 0 27 Oct 2024
Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series Ilan Naiman Nimrod Berman Itai Pemper Idan Arbiv Gal Fadlon Omri Azencot 32 11 0 25 Oct 2024
Flow Generator Matching Zemin Huang Zhengyang Geng Weijian Luo Guo-jun Qi 35 8 0 25 Oct 2024
Making Social Platforms Accessible: Emotion-Aware Speech Generation with Integrated Text Analysis Suparna De Ionut Bostan Nishanth Sastry 32 0 0 24 Oct 2024
TEAM: Topological Evolution-aware Framework for Traffic Forecasting--Extended Version Duc Kieu Tung Kieu Peng Han Bin Yang Christian S. Jensen Bac Le AI4TS 24 1 0 24 Oct 2024
Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences Weijian Luo EGVM 36 6 0 24 Oct 2024
ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams Srija Anand Praveen Srinivasa Varadhan Mehak Singal Mitesh M. Khapra 23 0 0 23 Oct 2024
Regularized autoregressive modeling and its application to audio signal declipping Ondřej Mokrý P. Rajmic 19 0 0 23 Oct 2024
Deep Generative Models for 3D Medical Image Synthesis Paul Friedrich Yannik Frisch P. Cattin 3DV MedIm 37 3 0 23 Oct 2024
One-Step Diffusion Distillation through Score Implicit Matching Weijian Luo Zemin Huang Zhengyang Geng J. Zico Kolter Guo-jun Qi DiffM 32 13 0 22 Oct 2024
Real-time Sub-milliwatt Epilepsy Detection Implemented on a Spiking Neural Network Edge Inference Processor Ruixin Li Guoxu Zhao Dylan Richard Muir Yuya Ling Karla Burelo Mina Khoei Dong Wang Y. Xing Ning Qiao 13 2 0 22 Oct 2024
Acoustic Model Optimization over Multiple Data Sources: Merging and Valuation Victor Junqiu Wei Weicheng Wang Di Jiang Conghui Tan Rongzhong Lian MoMe 30 0 0 21 Oct 2024
ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps Yulin Song Guorui Sang Jing Yu Chuangbai Xiao DiffM 37 0 0 20 Oct 2024
Optimal Transport Maps are Good Voice Converters Arip Asadulaev Rostislav Korst V. Shutov Alexander Korotin Yaroslav Grebnyak Vahe Egiazarian E. Burnaev OT 32 1 0 17 Oct 2024
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding Tan Dat Nguyen Ji-Hoon Kim Jeongsoo Choi Shukjae Choi Jinseok Park Younglo Lee Joon Son Chung 26 0 0 17 Oct 2024
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech J. Melechovský Ambuj Mehrish Berrak Sisman Dorien Herremans 16 1 0 17 Oct 2024
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis Yu Gu Qiushi Zhu Guangzhi Lei Chao Weng Dan Su DiffM 37 0 0 17 Oct 2024
TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering A. Habib Kesheng Wang Mary-Anne Hartley Gianfranco Doretto Donald Adjeroh LMTD 28 1 0 17 Oct 2024
Irregularity-Informed Time Series Analysis: Adaptive Modelling of Spatial and Temporal Dynamics L. Zheng Zhengyang Li Chang Dong W. Zhang Lin Yue Miao Xu Olaf Maennel Weitong Chen AI4TS 24 1 0 16 Oct 2024
Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations M. Germán-Morales A. J. Rivera-Rivas M. J. del Jesus Díaz C. J. Carmona AI4TS AI4CE 51 0 0 15 Oct 2024
Generative Deep Learning and Signal Processing for Data Augmentation of Cardiac Auscultation Signals: Improving Model Robustness Using Synthetic Audio Leigh Abbott Milan Marocchi Matthew Fynn Yue Rong Sven Nordholm MedIm 20 0 0 14 Oct 2024
Bahasa Harmony: A Comprehensive Dataset for Bahasa Text-to-Speech Synthesis with Discrete Codec Modeling of EnGen-TTS Onkar Kishor Susladkar Vishesh Tripathi Biddwan Ahmed 21 0 0 09 Oct 2024
Evaluating the Generalization Ability of Spatiotemporal Model in Urban Scenario Hongjun Wang Jiyuan Chen Tong Pan Zheng Dong Lingyu Zhang Renhe Jiang Xuan Song OOD 76 2 0 07 Oct 2024
Neural Fourier Modelling: A Highly Compact Approach to Time-Series Analysis Minjung Kim Yusuke Hioka Michael Witbrock AI4TS 26 0 0 07 Oct 2024
EmoGene: Audio-Driven Emotional 3D Talking-Head Generation Wenqing Wang Yun Fu VGen 76 0 0 07 Oct 2024
GAS-Norm: Score-Driven Adaptive Normalization for Non-Stationary Time Series Forecasting in Deep Learning Edoardo Urettini Daniele Atzeni Reshawn J. Ramjattan Antonio Carta AI4TS 19 0 0 04 Oct 2024
S7: Selective and Simplified State Space Layers for Sequence Modeling Taylan Soydan Nikola Zubić Nico Messikommer Siddhartha Mishra Davide Scaramuzza 35 4 0 04 Oct 2024
Generative Semantic Communication for Text-to-Speech Synthesis Jiahao Zheng Jinke Ren Peng Xu Zhihao Yuan Jie Xu Fangxin Wang Gui Gui Shuguang Cui 26 2 0 04 Oct 2024
Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition Olga Iakovenko Ivan Bondarenko 22 0 0 03 Oct 2024
Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting Marcel Kollovieh Marten Lienen David Ludke Leo Schwinn Stephan Günnemann AI4TS BDL DiffM 36 3 0 03 Oct 2024
On the Geometry and Optimization of Polynomial Convolutional Networks V. Shahverdi G. Marchetti Kathlén Kohn 23 2 0 01 Oct 2024
Recent Advances in Speech Language Models: A Survey Wenqian Cui Dianzhi Yu Xiaoqi Jiao Ziqiao Meng Guangyan Zhang Qichao Wang Yiwen Guo Irwin King AuLLM 59 14 0 01 Oct 2024
TSI: A Multi-View Representation Learning Approach for Time Series Forecasting Wentao Gao Ziqi Xu Jiuyong Li Lin Liu Jixue Liu T. Le Debo Cheng Yanchang Zhao Yun Chen AI4TS 20 0 0 30 Sep 2024
A method for identifying causality in the response of nonlinear dynamical systems Joseph Massingham Ole Nielsen Tore Butlin CML 13 0 0 26 Sep 2024
A Survey of Spatio-Temporal EEG data Analysis: from Models to Applications Pengfei Wang Huanran Zheng Silong Dai Yiqiao Wang Xiaotian Gu Yuanbin Wu Xiaoling Wang SyDa AI4TS 45 3 0 26 Sep 2024
Neural Coordination and Capacity Control for Inventory Management Carson Eisenach Udaya Ghai Dhruv Madeka Kari Torkkola Dean Phillips Foster Sham Kakade 18 0 0 24 Sep 2024
HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters Lauri Juvela Pablo Pérez Zarazaga G. Henter Zofia Malisz 27 0 0 23 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection Lam Pham Phat Lam Dat Tran Hieu Tang Tin Nguyen Alexander Schindler Canh Vu Alexander Polonsky Canh Vu 46 3 0 23 Sep 2024
ReFine: Boosting Time Series Prediction of Extreme Events by Reweighting and Fine-tuning Jimeng Shi Azam Shirali Giri Narasimhan AI4TS 41 0 0 21 Sep 2024