LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models

IEEE Signal Processing Letters (IEEE SPL), 2023
18 June 2023
Zhichao Wang
Yuanzhe Chen
Lei Xie
Qiao Tian
Yuping Wang
arXiv: 2306.10521 | GitHub (399★)

Papers citing "LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models"

24 / 24 papers shown
FreeTalk: A plug-and-play and black-box defense against speech synthesis attacks
Yuwen Pu
Zhou Feng
Chunyi Zhou
Jiahao Chen
Chunqiang Hu
Haibo Hu
S. Ji
AAML
143
0
0
30 Aug 2025
Conan: A Chunkwise Online Network for Zero-Shot Adaptive Voice Conversion
Yu Zhang
Baotong Tian
Z. Duan
657
1
0
19 Jul 2025
StarVC: A Unified Auto-Regressive Framework for Joint Text and Speech Generation in Voice Conversion
Fengjin Li
Jie Wang
Yadong Niu
Yongqing Wang
Meng Meng
Jian Luan
Zhiyong Wu
240
0
0
03 Jun 2025
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2024
Xue Jiang
Xiulian Peng
Yuan Zhang
Yan Lu
SSL
419
6
0
15 Mar 2025
AudioMiXR: Spatial Audio Object Manipulation with 6DoF for Sound Design in Augmented Reality
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2025
Brandon Woodard
Margarita Geleta
Joseph J. LaViola Jr.
Andrea Fanelli
Rhonda Wilson
1.0K
33
0
05 Feb 2025
VoicePrompter: Robust Zero-Shot Voice Conversion with Voice Prompt and Conditional Flow Matching
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Ha-Yeong Choi
Jaehan Park
377
3
0
29 Jan 2025
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Xinfa Zhu
Lei He
Yujia Xiao
Xi Wang
Xu Tan
Sheng Zhao
Lei Xie
DiffM
355
3
0
08 Jan 2025
CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion
Yuke Li
Xinfa Zhu
Hanzhao Li
Jixun Yao
WenJie Tian
XiPeng Yang
Yunlin Chen
Zhifei Li
Lei Xie
DiffM
552
1
0
28 Nov 2024
Takin-VC: Expressive Zero-Shot Voice Conversion via Adaptive Hybrid Content Encoding and Enhanced Timbre Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yuguang Yang
Yu Pan
Jixun Yao
Xiang Zhang
Jianhao Ye
Hongbin Zhou
Lei Xie
Lei Ma
Jianjun Zhao
221
2
0
02 Oct 2024
Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models
Sijing Chen
Qi Liu
Laipeng He
Tianwei He
Wendi He
...
Huimin Zhang
Xiang Zhang
Guangcheng Zhao
Hongbin Zhou
Pengpeng Zou
355
14
0
18 Sep 2024
StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
IEEE Signal Processing Letters (SPL), 2024
Zhichao Wang
Yuanzhe Chen
Xinsheng Wang
Lei Xie
Yuping Wang
326
4
0
05 Aug 2024
Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation
Xinhan Di
Jiahao Lu
Yunming Liang
Junjie Zheng
Yihua Wang
Chaofan Ding
ALM
334
3
0
01 Aug 2024
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy
Interspeech, 2024
Linhan Ma
Xinfa Zhu
Yuanjun Lv
Zhichao Wang
Ziqian Wang
Wendi He
Hongbin Zhou
Lei Xie
183
6
0
14 Jun 2024
Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder
Haohan Guo
Fenglong Xie
Dongchao Yang
Hui Lu
Xixin Wu
Helen Meng
284
8
0
05 Jun 2024
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Philip Anastassiou
Jiawei Chen
Jingshu Chen
Yuanzhe Chen
Zhuo Chen
...
Wenjie Zhang
Yanzhe Zhang
Zilin Zhao
Dejian Zhong
Xiaobin Zhuang
406
316
0
04 Jun 2024
A Survey of Deep Learning Audio Generation Methods
Matej Bozic
Marko Horvat
VLM, MedIm
351
9
0
31 May 2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Zeqian Ju
Yuancheng Wang
Kai Shen
Xu Tan
Detai Xin
...
Shikun Zhang
Jiang Bian
Lei He
Jinyu Li
Sheng Zhao
DiffM
562
325
0
05 Mar 2024
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Zhichao Wang
Yuanzhe Chen
Xinsheng Wang
Lei Xie
Yuping Wang
381
18
0
19 Jan 2024
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Junjie Li
Yiwei Guo
Xie Chen
Kai Yu
347
31
0
14 Dec 2023
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion
A. R. Bargum
Stefania Serafin
Cumhur Erkut
325
9
0
14 Nov 2023
Vec-Tok Speech: speech vectorization and tokenization for neural speech generation
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023
Xinfa Zhu
Yuanjun Lv
Yinjiao Lei
Tao Li
Wendi He
Hongbin Zhou
Heng Lu
Lei Xie
443
30
0
11 Oct 2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBM, AuLLM
646
193
0
01 Oct 2023
Speaker anonymization using neural audio codec language models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Michele Panariello
Francesco Nespoli
Massimiliano Todisco
Nicholas W. D. Evans
256
46
0
25 Sep 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Xiaoshi Zhong
Björn W. Schuller
LM&MA, AuLLM
833
56
0
24 Aug 2023