UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data

Interspeech, 2023

28 June 2023

Papers citing "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"

Entropy-based Coarse and Compressed Semantic Speech Representation Learning

102

30 Aug 2025

Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

222

01 Jun 2025

Voice Cloning: Comprehensive Survey

Hussam Azzuni

Abdulmotaleb El Saddik

VLM

351

01 May 2025

Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

...

388

08 Feb 2025

Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody Prompting

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

163

31 Dec 2024

Analytic Study of Text-Free Speech Synthesis for Raw Audio using a Self-Supervised Learning Model

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2024

Joonyong Park

Daisuke Saito

Nobuaki Minematsu

273

04 Dec 2024

NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Nohil Park

Heeseung Kim

Che Hyun Lee

Jooyoung Choi

Jiheum Yeom

Sungroh Yoon

171

24 Sep 2024

VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Jiheum Yeom

Heeseung Kim

Jooyoung Choi

Che Hyun Lee

Nohil Park

Sungroh Yoon

117

24 Sep 2024

DiffSSD: A Diffusion-Based Dataset For Speech Forensics

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Kratika Bhagtani

Amit Kumar Singh Yadav

Paolo Bestagini

Edward J. Delp

DiffM

188

19 Sep 2024

Text-to-Speech for Unseen Speakers via Low-Complexity Discrete Unit-Based Frame Selection

Ismail Rasim Ulgen

Shreeram Suresh Chandra

Junchen Lu

Berrak Sisman

967

30 Aug 2024

USAT: A Universal Speaker-Adaptive Text-to-Speech Approach

Wenbin Wang

Yang Song

Sanjay Jha

222

28 Apr 2024

FairSSD: Understanding Bias in Synthetic Speech Detectors

Amit Kumar Singh Yadav

252

17 Apr 2024

VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing

Dongya Jia

Yuxuan Wang

331

10 Apr 2024

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment

IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2024

584

16 Jan 2024

HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023

389

21 Nov 2023

Towards generalisable and calibrated synthetic speech detection with self-supervised representations

Interspeech, 2023

344

11 Sep 2023

Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any Voice Conversion using Only Speech Data

278

06 Sep 2023