Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2105.14762
Cited By
Emotional Voice Conversion: Theory, Databases and ESD
31 May 2021
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Emotional Voice Conversion: Theory, Databases and ESD"
24 / 24 papers shown
Title
Kimi-Audio Technical Report
KimiTeam
Ding Ding
Zeqian Ju
Yichong Leng
S. Liu
...
Z. Yang
Aoxiong Yin
Ruibin Yuan
Y. Zhang
Zaida Zhou
AuLLM
VLM
108
5
0
25 Apr 2025
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
Junyi Ao
Yuancheng Wang
Xiaohai Tian
Dekun Chen
J. Zhang
Lu Lu
Y. Wang
Haizhou Li
Z. Wu
AuLLM
80
17
0
17 Jan 2025
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
Tian-Hao Zhang
Jiawei Zhang
J. Wang
Xinyuan Qian
Xu-cheng Yin
CVBM
45
0
0
02 Jan 2025
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion
Ashishkumar Gudmalwar
Ishan D. Biyani
Nirmesh J. Shah
Pankaj Wasnik
R. Shah
DiffM
26
0
0
31 Dec 2024
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector
Deok-Hyeon Cho
Hyung-Seok Oh
Seung-Bin Kim
Seong-Whan Lee
39
3
0
04 Nov 2024
Cross-Lingual Speech Emotion Recognition: Humans vs. Self-Supervised Models
Zhichen Han
Tianqi Geng
Hui Feng
Jiahong Yuan
Korin Richmond
Yuanchao Li
33
1
0
25 Sep 2024
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions
Kun Zhou
You Zhang
Shengkui Zhao
Hao Wang
Zexu Pan
...
Chongjia Ni
Yukun Ma
Trung Hieu Nguyen
J. Yip
Bin Ma
44
5
0
25 Sep 2024
ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild
Jiangyan Yi
Chu Yuan Zhang
Jianhua Tao
Chenglong Wang
Xinrui Yan
Yong Ren
Hao Gu
Junzuo Zhou
50
1
0
09 Aug 2024
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
Wenhan Yao
Jiangkun Yang
yongqiang He
Jia Liu
Weiping Wen
34
1
0
16 Jun 2024
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis
Sho Inoue
Kun Zhou
Shuai Wang
Haizhou Li
26
7
0
15 May 2024
TIPAA-SSL: Text Independent Phone-to-Audio Alignment based on Self-Supervised Learning and Knowledge Transfer
Noé Tits
Prernna Bhatnagar
Thierry Dutoit
33
0
0
03 May 2024
Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech Generators
Wiebke Hutiri
Orestis Papakyriakopoulos
Alice Xiang
21
16
0
25 Jan 2024
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
21
24
0
08 Nov 2023
Emotion Selectable End-to-End Text-based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
Chu Yuan Zhang
25
2
0
20 Dec 2022
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units
Gallil Maimon
Yossi Adi
21
13
0
19 Dec 2022
EmoFake: An Initial Dataset for Emotion Fake Audio Detection
Yan Zhao
Jiangyan Yi
J. Tao
Chenglong Wang
Xiaohui Zhang
Yongfeng Dong
21
9
0
10 Nov 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gokcce .Iymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
15
53
0
06 Oct 2022
Neural Emotion Director: Speech-preserving semantic control of facial expressions in "in-the-wild" videos
Foivos Paraperas-Papantoniou
P. Filntisis
Petros Maragos
A. Roussos
3DH
CVBM
18
22
0
01 Dec 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
11
24
0
20 Oct 2021
Automatic Speech Recognition And Limited Vocabulary: A Survey
J. L. E. K. Fendji
D. Tala
B. Yenke
M. Atemkeng
13
3
0
23 Aug 2021
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training
Kun Zhou
Berrak Sisman
Haizhou Li
10
27
0
31 Mar 2021
ICE-Talk: an Interface for a Controllable Expressive Talking Machine
Noé Tits
Kevin El Haddad
Thierry Dutoit
LLMAG
11
3
0
25 Aug 2020
Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network
Ravi Shankar
Hsi-Wei Hsieh
N. Charon
A. Venkataraman
27
11
0
25 Jul 2020
Non-parallel Emotion Conversion using a Deep-Generative Hybrid Network and an Adversarial Pair Discriminator
Ravi Shankar
Jacob Sager
A. Venkataraman
GAN
27
18
0
25 Jul 2020
1