DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism


6 May 2021
Jinglin Liu
Chengxi Li
Yi Ren
Feiyang Chen
Zhou Zhao
    DiffM

Papers citing "DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism"

50 / 159 papers shown
Title
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt
Yongqi Wang
Ruofan Hu
Rongjie Huang
Zhiqing Hong
Ruiqi Li
Wenrui Liu
Fuming You
Tao Jin
Zhou Zhao
38
11
0
18 Mar 2024
CoPlay: Audio-agnostic Cognitive Scaling for Acoustic Sensing
Yin Li
Rajalakshmi Nandakumar
22
0
0
16 Mar 2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Zeqian Ju
Yuancheng Wang
Kai Shen
Xu Tan
Detai Xin
...
Shikun Zhang
Jiang Bian
Lei He
Jinyu Li
Sheng Zhao
DiffM
28
143
0
05 Mar 2024
Authors' Values and Attitudes Towards AI-bridged Scalable Personalization of Creative Language Arts
Taewook Kim
Hyomin Han
Eytan Adar
Matthew Kay
John Joon Young Chung
AI4CE
23
15
0
01 Mar 2024
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation
Shuangrui Ding
Zihan Liu
Xiao-wen Dong
Pan Zhang
Rui Qian
Conghui He
Dahua Lin
Jiaqi Wang
14
23
0
27 Feb 2024
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
Liumeng Xue
Chaoren Wang
Mingxuan Wang
Xueyao Zhang
Jun Han
Zhizheng Wu
DiffM
24
5
0
20 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
11
2
0
16 Feb 2024
SALAD: Smart AI Language Assistant Daily
Ragib Amin Nihal
Dong Huu Quoc Tran
Zirui Lin
Yimimg Xu
Haoran Liu
Zhaoyi An
Ma Kyou
11
0
0
12 Feb 2024
Diff-RNTraj: A Structure-aware Diffusion Model for Road Network-constrained Trajectory Generation
Tonglong Wei
Youfang Lin
S. Guo
Yan Lin
Yiheng Huang
Chenyang Xiang
Yuqing Bai
Menglu Ya
Huaiyu Wan
25
11
0
12 Feb 2024
Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations
Panos Kakoulidis
Nikolaos Ellinas
G. Vamvoukakis
Myrsini Christidou
Alexandra Vioni
...
Junkwang Oh
Gunu Jho
Inchul Hwang
Pirros Tsiakoulis
Aimilios Chalamandaris
13
1
0
02 Feb 2024
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing
Jiatong Shi
Yueqian Lin
Xinyi Bai
Keyi Zhang
Yuning Wu
Yuxun Tang
Yifeng Yu
Qin Jin
Shinji Watanabe
25
6
0
31 Jan 2024
Combined Generative and Predictive Modeling for Speech Super-resolution
Heming Wang
Eric W. Healy
DeLiang Wang
DiffM
17
0
0
25 Jan 2024
FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
Tan Dat Nguyen
Ji-Hoon Kim
Youngjoon Jang
Jaehun Kim
Joon Son Chung
DiffM
16
5
0
18 Jan 2024
FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes
Genghao Zhang
Yuxi Wang
Chuanchen Luo
Shibiao Xu
Zhaoxiang Zhang
Man Zhang
Junran Peng
VGen
3DV
25
4
0
07 Jan 2024
SAiD: Speech-driven Blendshape Facial Animation with Diffusion
Inkyu Park
Jaewoong Cho
26
4
0
25 Dec 2023
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
24
26
0
15 Dec 2023
One-Step Diffusion Distillation via Deep Equilibrium Models
Zhengyang Geng
Ashwini Pokle
Trevor Killeen
26
28
0
12 Dec 2023
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Zehua Chen
Guande He
Kaiwen Zheng
Xu Tan
Jun Zhu
DiffM
44
21
0
06 Dec 2023
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Zhizheng Wu
16
11
0
25 Nov 2023
The Missing U for Efficient Diffusion Models
Sergio Calvo-Ordoñez
Chun-Wun Cheng
Jiahao Huang
Lipei Zhang
Guang Yang
Carola-Bibiane Schonlieb
Angelica I Aviles-Rivero
DiffM
25
4
0
31 Oct 2023
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes
Seongho Joo
Hyukhun Koh
Kyomin Jung
DiffM
34
0
0
23 Oct 2023
DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics
Kaiwen Zheng
Cheng Lu
Jianfei Chen
Jun Zhu
DiffM
24
72
0
20 Oct 2023
DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation
Bangbang Yang
Wenqi Dong
Lin Ma
Wenbo Hu
Xiao Liu
Zhaopeng Cui
Yuewen Ma
DiffM
27
16
0
19 Oct 2023
A Comparative Study of Voice Conversion Models with Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023
Ryuichi Yamamoto
Reo Yoneyama
Lester Phillip Violeta
Wen-Chin Huang
T. Toda
10
7
0
08 Oct 2023
VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model
Yayun He
Zuheng Kang
Jianzong Wang
Junqing Peng
Jing Xiao
DiffM
14
2
0
07 Oct 2023
DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction
Jiarui Hai
Helin Wang
Dongchao Yang
Karan Thakkar
Najim Dehak
Mounya Elhilali
DiffM
8
7
0
06 Oct 2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBM
AuLLM
22
114
0
01 Oct 2023
ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech
Wenhao Guan
Qi Su
Haodong Zhou
Shiyu Miao
Xingjia Xie
Lin Li
Q. Hong
DiffM
13
13
0
29 Sep 2023
BiSinger: Bilingual Singing Voice Synthesis
Huali Zhou
Yueqian Lin
Yao Shi
Peng Sun
Ming Li
16
5
0
25 Sep 2023
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis
Yu Gu
Yianrao Bian
Guangzhi Lei
Chao Weng
Dan Su
DiffM
6
2
0
22 Sep 2023
Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders
Lester Phillip Violeta
Wen-Chin Huang
D. Ma
Ryuichi Yamamoto
Kazuhiro Kobayashi
T. Toda
14
3
0
18 Sep 2023
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
Yinghao Aaron Li
Cong Han
Xilin Jiang
N. Mesgarani
30
4
0
18 Sep 2023
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions
Reo Shimizu
Ryuichi Yamamoto
Masaya Kawamura
Yuma Shirahata
Hironori Doi
Tatsuya Komatsu
Kentaro Tachibana
DiffM
16
19
0
15 Sep 2023
SingFake: Singing Voice Deepfake Detection
Yongyi Zang
You Zhang
Mojtaba Heydari
Zhiyao Duan
17
29
0
14 Sep 2023
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Yiwei Guo
Chenpeng Du
Ziyang Ma
Xie Chen
K. Yu
DiffM
23
36
0
10 Sep 2023
FSD: An Initial Chinese Dataset for Fake Song Detection
Yuankun Xie
Jingjing Zhou
Xiaolin Lu
Zhenghao Jiang
Yuxin Yang
Haonan Cheng
Long Ye
19
14
0
05 Sep 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
17
8
0
02 Sep 2023
LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech
Jing Chen
Xingcheng Song
Zhendong Peng
Binbin Zhang
Fuping Pan
Zhiyong Wu
DiffM
8
16
0
31 Aug 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
B. Hayes
Jordie Shier
Gyorgy Fazekas
Andrew Mcpherson
C. Saitis
21
21
0
29 Aug 2023
Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
Ji-Hoon Kim
Jaehun Kim
Joon Son Chung
13
5
0
29 Aug 2023
Voice Conversion with Denoising Diffusion Probabilistic GAN Models
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DiffM
11
5
0
28 Aug 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
J. Liu
62
31
0
27 Aug 2023
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo
Jianguo Mao
Ruijie Tao
Long Yan
Kazushige Ouchi
Hong Liu
Xiangdong Wang
DiffM
19
11
0
23 Aug 2023
Elucidate Gender Fairness in Singing Voice Transcription
Xiangming Gu
Weizhen Zeng
Ye Wang
10
3
0
05 Aug 2023
A Systematic Exploration of Joint-training for Singing Voice Synthesis
Yuning Wu
Yifeng Yu
Jiatong Shi
Tao Qian
Qin Jin
38
5
0
05 Aug 2023
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training
H. Oh
Sang-Hoon Lee
Seong-Whan Lee
DiffM
11
14
0
31 Jul 2023
DisCover: Disentangled Music Representation Learning for Cover Song Identification
Jiahao Xun
Shengyu Zhang
Yanting Yang
Jieming Zhu
Liqun Deng
Zhou Zhao
Zhenhua Dong
Ruiqi Li
Lichao Zhang
Fei Wu
AAML
DRL
15
5
0
19 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
14
25
0
07 Jul 2023
Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables
Chin-Yun Yu
Gyorgy Fazekas
20
7
0
29 Jun 2023
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
Chenpeng Du
Yiwei Guo
Feiyu Shen
Zhijun Liu
Zheng Liang
Xie Chen
Shuai Wang
Hui Zhang
K. Yu
DiffM
10
41
0
13 Jun 2023