Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data

The Speaker and Language Recognition Workshop (Odyssey), 2020

1 February 2020

Kun Zhou

Berrak Sisman

Haizhou Li

Papers citing "Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data"

Randomness from causally independent processes

219

06 Oct 2025

Textless and Non-Parallel Speech-to-Speech Emotion Style Transfer

Soumya Dutta

Avni Jain

Sriram Ganapathy

317

23 May 2025

A Review of Human Emotion Synthesis Based on Generative Technology

...

318

10 Dec 2024

Improving speaker verification robustness with synthetic emotional utterances

Nikhil Kumar Koditala

327

30 Nov 2024

Re-ENACT: Reinforcement Learning for Emotional Speech Generation using Actor-Critic Strategy

Ravi Shankar

Archana Venkataraman

203

04 Aug 2024

Fine-Grained Quantitative Emotion Editing for Speech Generation

Sho Inoue

Kun Zhou

Shuai Wang

Haizhou Li

277

04 Mar 2024

Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion

Interspeech (Interspeech), 2023

Yun Chen

Lingxiao Yang

Qi Chen

Jianhuang Lai

Xiaohua Xie

179

29 Dec 2023

Towards General-Purpose Text-Instruction-Guided Voice Conversion

Automatic Speech Recognition & Understanding (ASRU), 2023

Hung-yi Lee

378

25 Sep 2023

EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

N. Prabhu

Bunlong Lay

Simon Welker

N. Lehmann-Willenbrock

Timo Gerkmann

DiffM

383

14 Sep 2023

In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis

N. Prabhu

N. Lehmann-Willenbrock

Timo Gerkmann

231

02 Jun 2023

Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

238

21 Feb 2023

Emotion Selectable End-to-End Text-based Speech Editing

Artificial Intelligence (AI), 2022

Tao Wang

Jiangyan Yi

225

20 Dec 2022

Disentangling Prosody Representations with Unsupervised Speech Reconstruction

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Leyuan Qu

Taihao Li

C. Weber

Theresa Pekarek-Rosin

F. Ren

S. Wermter

280

14 Dec 2022

Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

285

16 Nov 2022

EmoFake: An Initial Dataset for Emotion Fake Audio Detection

China National Conference on Chinese Computational Linguistics (CCL), 2022

Jiangyan Yi

Xiaohui Zhang

217

10 Nov 2022

A Diffeomorphic Flow-based Variational Framework for Multi-speaker Emotion Conversion

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

199

09 Nov 2022

Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion

The Speaker and Language Recognition Workshop (Odyssey), 2022

Haizhou Li

356

25 Oct 2022

An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era

Proceedings of the IEEE (Proc. IEEE), 2022

Andreas Triantafyllopoulos

Björn W. Schuller

...

309

06 Oct 2022

Speech Synthesis with Mixed Emotions

IEEE Transactions on Affective Computing (IEEE TAC), 2022

Haizhou Li

369

11 Aug 2022

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation

Interspeech (Interspeech), 2022

207

29 Jul 2022

Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis

Interspeech (Interspeech), 2022

264

04 Jul 2022

An Overview & Analysis of Sequence-to-Sequence Emotional Voice Conversion

Interspeech (Interspeech), 2022

Zijiang Yang

Xin Jing

Andreas Triantafyllopoulos

Meishu Song

Ilhan Aslan

Björn W. Schuller

263

29 Mar 2022

Emotion Intensity and its Control for Emotional Voice Conversion

IEEE Transactions on Affective Computing (IEEE TAC), 2022

Kun Zhou

Berrak Sisman

R. Rana

Björn W. Schuller

Haizhou Li

432

10 Jan 2022

CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer

167

30 Nov 2021

Textless Speech Emotion Conversion using Discrete and Decomposed Representations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Yossi Adi

398

14 Nov 2021

Towards Identity Preserving Normal to Dysarthric Voice Conversion

Wen-Chin Huang

B. Halpern

Lester Phillip Violeta

O. Scharenborg

Tomoki Toda

326

15 Oct 2021

Decoupling Speaker-Independent Emotions for Voice Conversion Via Source-Filter Networks

185

04 Oct 2021

Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer

Automatic Speech Recognition & Understanding (ASRU), 2021

Zongyang Du

Berrak Sisman

Kun Zhou

Haizhou Li

331

08 Jul 2021

Global Rhythm Style Transfer Without Text Transcriptions

Kaizhi Qian

Yang Zhang

Shiyu Chang

Jinjun Xiong

Chuang Gan

David D. Cox

M. Hasegawa-Johnson

279

16 Jun 2021

Emotional Voice Conversion: Theory, Databases and ESD

Speech Communication (Speech Commun.), 2021

Kun Zhou

Berrak Sisman

Rui Liu

Haizhou Li

534

264

31 May 2021

MASS: Multi-task Anthropomorphic Speech Synthesis Framework

Computer Speech and Language (CSL), 2021

Jinyin Chen

Linhui Ye

Zhaoyan Ming

154

10 May 2021

Towards end-to-end F0 voice conversion based on Dual-GAN with convolutional wavelet kernels

European Signal Processing Conference (EUSIPCO), 2021

Clément Le Moine Veillon

Nicolas Obin

Axel Roebel

160

15 Apr 2021

Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training

Interspeech (Interspeech), 2021

Kun Zhou

Berrak Sisman

Haizhou Li

407

31 Mar 2021

EmoCat: Language-agnostic Emotional Voice Conversion

Speech Synthesis Workshop (SSW), 2021

233

14 Jan 2021

VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech

Kun Zhou

Berrak Sisman

Haizhou Li

DRL

384

03 Nov 2020

Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

Kun Zhou

Berrak Sisman

Rui Liu

Haizhou Li

376

257

28 Oct 2020

Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2020

Zongyang Du

Kun Zhou

Berrak Sisman

Haizhou Li

318

11 Aug 2020

VAW-GAN for Singing Voice Conversion with Non-parallel Training Data

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2020

Haizhou Li

208

10 Aug 2020

An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020

Haizhou Li

643

413

09 Aug 2020

Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network

Interspeech (Interspeech), 2020

249

25 Jul 2020

Non-parallel Emotion Conversion using a Deep-Generative Hybrid Network and an Adversarial Pair Discriminator

Interspeech (Interspeech), 2020

298

25 Jul 2020

Incorporating Reinforced Adversarial Learning in Autoregressive Image Generation

250

20 Jul 2020

Converting Anyone's Emotion: Towards Speaker-Independent Emotional Voice Conversion

Kun Zhou

Berrak Sisman

Mingyang Zhang

Haizhou Li

360

13 May 2020