UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data


Interspeech (Interspeech), 2023
28 June 2023
Heeseung Kim
Sungwon Kim
Ji-Ran Yeom
Sung-Wan Yoon
    DiffM

Papers citing "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"

17 / 17 papers shown
Entropy-based Coarse and Compressed Semantic Speech Representation Learning
Jialong Zuo
Guangyan Zhang
Minghui Fang
Shengpeng Ji
Xiaoqi Jiao
Jingyu Li
Yiwen Guo
Zhou Zhao
102
0
0
30 Aug 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jialong Zuo
Shengpeng Ji
Minghui Fang
Mingze Li
Ziyue Jiang
Xize Cheng
Xiaoda Yang
Chen Feiyang
Xinyu Duan
Zhou Zhao
222
0
0
01 Jun 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
351
3
0
01 May 2025
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Jialong Zuo
Shengpeng Ji
Minghui Fang
Ziyue Jiang
Xize Cheng
...
Wenrui Liu
Guangyan Zhang
Zehai Tu
Yiwen Guo
Zhou Zhao
388
8
0
08 Feb 2025
Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody Prompting
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Wooseok Han
Minki Kang
Changhun Kim
Eunho Yang
163
3
0
31 Dec 2024
Analytic Study of Text-Free Speech Synthesis for Raw Audio using a Self-Supervised Learning Model
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2024
Joonyong Park
Daisuke Saito
Nobuaki Minematsu
273
0
0
04 Dec 2024
NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Nohil Park
Heeseung Kim
Che Hyun Lee
Jooyoung Choi
Jiheum Yeom
Sungroh Yoon
171
3
0
24 Sep 2024
VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Jiheum Yeom
Heeseung Kim
Jooyoung Choi
Che Hyun Lee
Nohil Park
Sungroh Yoon
117
1
0
24 Sep 2024
DiffSSD: A Diffusion-Based Dataset For Speech Forensics
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Kratika Bhagtani
Amit Kumar Singh Yadav
Paolo Bestagini
Edward J. Delp
DiffM
188
9
0
19 Sep 2024
Text-to-Speech for Unseen Speakers via Low-Complexity Discrete Unit-Based Frame Selection
Ismail Rasim Ulgen
Shreeram Suresh Chandra
Junchen Lu
Berrak Sisman
967
1
0
30 Aug 2024
USAT: A Universal Speaker-Adaptive Text-to-Speech Approach
Wenbin Wang
Yang Song
Sanjay Jha
222
17
0
28 Apr 2024
FairSSD: Understanding Bias in Synthetic Speech Detectors
Amit Kumar Singh Yadav
Kratika Bhagtani
Davide Salvi
Paolo Bestagini
Edward J. Delp
252
12
0
17 Apr 2024
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing
Philip Anastassiou
Zhenyu Tang
Kainan Peng
Dongya Jia
Jiaxin Li
Ming Tu
Yuping Wang
Yuxuan Wang
Mingbo Ma
331
10
0
10 Apr 2024
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment
IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2024
Hyoung-Seok Oh
Sang-Hoon Lee
Deok-Hyun Cho
Seong-Whan Lee
584
1
0
16 Jan 2024
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Sang-Hoon Lee
Haram Choi
Seung-Bin Kim
Seong-Whan Lee
BDL
389
60
0
21 Nov 2023
Towards generalisable and calibrated synthetic speech detection with self-supervised representations
Interspeech (Interspeech), 2023
Octavian Pascu
Adriana Stan
Dan Oneaţă
Elisabeta Oneata
H. Cucu
SSL
344
5
0
11 Sep 2023
Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any Voice Conversion using Only Speech Data
Hyungseob Lim
Kyungguen Byun
Sunkuk Moon
Erik Visser
DiffM
278
2
0
06 Sep 2023