ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.16749
  4. Cited By
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with
  Adaptive Noise Spectral Shaping

SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping

31 March 2022
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
    DiffM
ArXivPDFHTML

Papers citing "SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping"

36 / 36 papers shown
Title
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
Tianze Luo
Xingchen Miao
Wenbo Duan
DiffM
37
0
0
20 Mar 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu
Sang-gil Lee
Chao-Han Huck Yang
Yuan Gong
Yu-Chun Wang
James Glass
Rafael Valle
Bryan Catanzaro
SSL
44
0
0
02 Mar 2025
RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior
RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior
Ching Hua Lee
Chouchang Yang
Jaejin Cho
Yashas Malur Saidutta
R. S. Srinivasa
Yilin Shen
Hongxia Jin
DiffM
80
0
0
19 Feb 2025
Wavehax: Aliasing-Free Neural Waveform Synthesis Based on 2D Convolution
  and Harmonic Prior for Reliable Complex Spectrogram Estimation
Wavehax: Aliasing-Free Neural Waveform Synthesis Based on 2D Convolution and Harmonic Prior for Reliable Complex Spectrogram Estimation
Reo Yoneyama
Atsushi Miyashita
Ryuichi Yamamoto
T. Toda
22
1
0
11 Nov 2024
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
40
2
0
16 Oct 2024
Facial Expression-Enhanced TTS: Combining Face Representation and
  Emotion Intensity for Adaptive Speech
Facial Expression-Enhanced TTS: Combining Face Representation and Emotion Intensity for Adaptive Speech
Yunji Chu
Yunseob Shim
Unsang Park
18
0
0
24 Sep 2024
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style
  Temporal Modeling in Text-to-Speech
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech
Xin Qi
Ruibo Fu
Zhengqi Wen
Tao Wang
Chunyu Qiang
...
Xiaopeng Wang
Yuankun Xie
Yukun Liu
Xuefei Liu
Guanjun Li
DiffM
20
0
0
18 Sep 2024
Accelerating High-Fidelity Waveform Generation via Adversarial Flow
  Matching Optimization
Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
AI4TS
27
1
0
15 Aug 2024
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform
  Generation
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
OOD
DiffM
AI4TS
40
5
0
14 Aug 2024
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
Yuanjun Lv
Hai Li
Ying Yan
Junhui Liu
Danming Xie
Lei Xie
38
1
0
12 Jun 2024
Detecting Out-Of-Distribution Earth Observation Images with Diffusion
  Models
Detecting Out-Of-Distribution Earth Observation Images with Diffusion Models
Georges Le Bellier
Nicolas Audebert
27
4
0
19 Apr 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
18
2
0
08 Mar 2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a
  Diffusion Probabilistic Model
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
Keiichi Tokuda
DiffM
27
2
0
22 Feb 2024
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
Haocheng Liu
Teysir Baoueb
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
24
4
0
09 Feb 2024
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and
  Music Synthesis
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
Teysir Baoueb
Haocheng Liu
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
DiffM
6
5
0
30 Jan 2024
FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
Tan Dat Nguyen
Ji-Hoon Kim
Youngjoon Jang
Jaehun Kim
Joon Son Chung
DiffM
16
5
0
18 Jan 2024
Generative Pre-training for Speech with Flow Matching
Generative Pre-training for Speech with Flow Matching
Alexander H. Liu
Matt Le
Apoorv Vyas
Bowen Shi
Andros Tjandra
Wei-Ning Hsu
19
31
0
25 Oct 2023
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial
  Network
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network
Takashi Shibuya
Yuhta Takida
Yuki Mitsufuji
11
11
0
06 Sep 2023
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer
Sang-Hoon Lee
Haram Choi
H. Oh
Seong-Whan Lee
BDL
23
9
0
30 Jul 2023
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion
  and Adversarial Training with Large Speech Language Models
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Yinghao Aaron Li
Cong Han
Vinay S. Raghavan
Gavin Mischler
N. Mesgarani
VLM
DiffM
28
107
0
13 Jun 2023
DiffSketching: Sketch Control Image Synthesis with Diffusion Models
DiffSketching: Sketch Control Image Synthesis with Diffusion Models
Qiang Wang
Di Kong
Fengyin Lin
Yonggang Qi
DiffM
21
13
0
30 May 2023
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net
  Encoder With Multiple STFTs
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs
Won Jang
D. Lim
Heayoung Park
19
1
0
18 May 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by
  Unsupervised Learning from Voice Recordings
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
13
1
0
09 May 2023
StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
Senmao Li
Joost van de Weijer
Taihang Hu
F. Khan
Qibin Hou
Yaxing Wang
Jian Yang
DiffM
29
52
0
28 Mar 2023
A Survey on Audio Diffusion Models: Text To Speech Synthesis and
  Enhancement in Generative AI
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI
Chenshuang Zhang
Chaoning Zhang
Sheng Zheng
Mengchun Zhang
Maryam Qamar
Sung-Ho Bae
In So Kweon
DiffM
MedIm
39
64
0
23 Mar 2023
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised
  Speech and Text Representations
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations
Yuma Koizumi
Heiga Zen
Shigeki Karita
Yifan Ding
Kohei Yatabe
Nobuyuki Morioka
Yu Zhang
Wei Han
Ankur Bapna
M. Bacchiani
23
22
0
03 Mar 2023
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech
Jiyoung Lee
Joon Son Chung
Soo-Whan Chung
DiffM
19
27
0
27 Feb 2023
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to
  Speech
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
Ze Chen
Yihan Wu
Yichong Leng
Jiawei Chen
Haohe Liu
...
Ke Wang
Lei He
Sheng Zhao
Jiang Bian
Danilo P. Mandic
DiffM
20
22
0
30 Dec 2022
Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with
  Very Low Computational Complexity
Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity
Ahmed Mustafa
J. Valin
Jan Büthe
Paris Smaragdis
Mike Goodwin
12
4
0
08 Dec 2022
HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with
  Discrete and Continuous Denoising
HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising
M. Shabani
Sepidehsadat Hosseini
Yasutaka Furukawa
DiffM
21
57
0
23 Nov 2022
Diffusion-based Generative Speech Source Separation
Diffusion-based Generative Speech Source Separation
Robin Scheibler
Youna Ji
Soo-Whan Chung
J. Byun
Soyeon Choe
Min-Seok Choi
DiffM
14
38
0
31 Oct 2022
Robust One-Shot Singing Voice Conversion
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
15
8
0
20 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar
Singh
Yuki Mitsufuji
DiffM
13
16
0
14 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on
  Fixed-Point Iteration
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
42
29
0
03 Oct 2022
A Survey on Generative Diffusion Model
A Survey on Generative Diffusion Model
Hanqun Cao
Cheng Tan
Zhangyang Gao
Yilun Xu
Guangyong Chen
Pheng-Ann Heng
Stan Z. Li
MedIm
37
205
0
06 Sep 2022
Speech Enhancement and Dereverberation with Diffusion-based Generative
  Models
Speech Enhancement and Dereverberation with Diffusion-based Generative Models
Julius Richter
Simon Welker
Jean-Marie Lemercier
Bunlong Lay
Timo Gerkmann
DiffM
11
178
0
11 Aug 2022
1