arXiv:2004.11284
Unsupervised Speech Decomposition via Triple Information Bottleneck
23 April 2020
Kaizhi Qian
Yang Zhang
Shiyu Chang
David D. Cox
M. Hasegawa-Johnson
Papers citing
"Unsupervised Speech Decomposition via Triple Information Bottleneck"
43 / 43 papers shown
Title
Bird Vocalization Embedding Extraction Using Self-Supervised Disentangled Representation Learning
Runwu Shi
Katsutoshi Itoyama
K. Nakadai
SSL
DRL
39
1
0
31 Dec 2024
Voice Conversion-based Privacy through Adversarial Information Hiding
J. Webber
O. Watts
G. Henter
Jennifer Williams
Simon King
45
0
0
23 Sep 2024
Prosody-Driven Privacy-Preserving Dementia Detection
Dominika Woszczyk
Ranya Aloufi
Soteris Demetriou
34
2
0
03 Jul 2024
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
Wenhan Yao
Jiangkun Yang
Yongqiang He
Jia Liu
Weiping Wen
44
1
0
16 Jun 2024
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy
Linhan Ma
Xinfa Zhu
Yuanjun Lv
Zhichao Wang
Ziqian Wang
Wendi He
Hongbin Zhou
Lei Xie
42
2
0
14 Jun 2024
MAIN-VC: Lightweight Speech Representation Disentanglement for One-shot Voice Conversion
Pengcheng Li
Jianzong Wang
Xulong Zhang
Yong Zhang
Jing Xiao
Ning Cheng
DRL
33
1
0
02 May 2024
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
Junjie Li
Yiwei Guo
Xie Chen
Kai Yu
38
13
0
14 Dec 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
21
24
0
08 Nov 2023
VaSAB: The variable size adaptive information bottleneck for disentanglement on speech and singing voice
F. Bous
Axel Roebel
16
0
0
05 Oct 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Ruiqi Li
Rongjie Huang
Lichao Zhang
Jinglin Liu
Zhou Zhao
27
4
0
08 May 2023
Label Information Bottleneck for Label Enhancement
Qinghai Zheng
Jihua Zhu
Haoyu Tang
31
6
0
13 Mar 2023
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units
Gallil Maimon
Yossi Adi
21
13
0
19 Dec 2022
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
22
6
0
12 Nov 2022
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS
Dongchao Yang
Songxiang Liu
Jianwei Yu
Helin Wang
Chao Weng
Yuexian Zou
DiffM
VLM
33
18
0
04 Nov 2022
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
19
1
0
25 Oct 2022
DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion
Chihiro Watanabe
Hirokazu Kameoka
DRL
24
0
0
20 Oct 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Mei-Shuo Chen
Z. Duan
22
10
0
23 Sep 2022
Non-Parallel Voice Conversion for ASR Augmentation
Gary Wang
Andrew Rosenberg
Bhuvana Ramabhadran
Fadi Biadsy
Yinghui Huang
Jesse Emond
P. M. Mengibar
13
2
0
15 Sep 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre
Guangyan Zhang
Ying Qin
W. Zhang
Jialun Wu
Mei Li
Yu Gai
Feijun Jiang
Tan Lee
48
26
0
29 Jun 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
24
110
0
20 Apr 2022
Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment
Tobias Weise
P. Klumpp
Kubilay Can Demir
Andreas K. Maier
E. Noeth
B.J. Heismann
Maria Schuster
S. Yang
11
3
0
08 Apr 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion
Weida Liang
Lantian Li
Wenqiang Du
Dong Wang
43
0
0
08 Apr 2022
Disentangleing Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion
Xintao Zhao
Feng Liu
Changhe Song
Zhiyong Wu
Shiyin Kang
Deyi Tuo
H. Meng
16
20
0
24 Mar 2022
CGIBNet: Bandwidth-constrained Communication with Graph Information Bottleneck in Multi-Agent Reinforcement Learning
Qi Tian
Kun Kuang
Baoxiang Wang
Furui Liu
Fei Wu
26
0
0
20 Dec 2021
Improving Subgraph Recognition with Variational Graph Information Bottleneck
Junchi Yu
Jie Cao
Ran He
22
53
0
18 Dec 2021
How Speech is Recognized to Be Emotional - A Study Based on Information Decomposition
Haoran Sun
Lantian Li
T. Zheng
Dong Wang
CVBM
14
0
0
24 Nov 2021
Zero-shot Singing Technique Conversion
Brendan T. O'Connor
S. Dixon
Georgy Fazekas
27
5
0
16 Nov 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Shijun Wang
Dimche Kostadinov
Damian Borth
19
10
0
27 Oct 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
11
24
0
20 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet
Axel Roebel
F. Bous
22
2
0
07 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
Jen-Hao Rick Chang
A. Shrivastava
H. Koppula
Xiaoshuai Zhang
Oncel Tuzel
DiffM
51
16
0
06 Oct 2021
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation
Yuanxun Lu
Jinxiang Chai
Xun Cao
29
82
0
22 Sep 2021
Complementing Handcrafted Features with Raw Waveform Using a Light-weight Auxiliary Model
Zhongwei Teng
Quchen Fu
Jules White
Maria E. Powell
Douglas C. Schmidt
13
5
0
06 Sep 2021
Learning De-identified Representations of Prosody from Raw Audio
J. Weston
R. Lenain
U. Meepegama
E. Fristed
SSL
24
15
0
17 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control
M. Kang
Sungjae Kim
Injung Kim
23
3
0
21 Jun 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
Disong Wang
Liqun Deng
Y. Yeung
Xiao Chen
Xunying Liu
H. Meng
DRL
14
136
0
18 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
21
24
0
20 Apr 2021
Semi-supervised Learning for Singing Synthesis Timbre
J. Bonada
Merlijn Blaauw
19
4
0
05 Nov 2020
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization
Yen-Hao Chen
Da-Yi Wu
Tsung-Han Wu
Hung-yi Lee
23
107
0
31 Oct 2020
Graph Information Bottleneck for Subgraph Recognition
Junchi Yu
Tingyang Xu
Yu Rong
Yatao Bian
Junzhou Huang
R. He
17
153
0
12 Oct 2020
Contrastive Predictive Coding Supported Factorized Variational Autoencoder for Unsupervised Learning of Disentangled Speech Representations
Janek Ebbers
Michael Kuhlmann
Tobias Cord-Landwehr
Reinhold Haeb-Umbach
DRL
CoGe
SSL
23
4
0
26 May 2020
Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization
Jen-Yu Liu
Yu-Hua Chen
Yin-Cheng Yeh
Yi-Hsuan Yang
GAN
32
35
0
18 May 2020