FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec

arXiv: 2309.07405
14 September 2023
Zhihao Du
Shiliang Zhang
Kai Hu
Siqi Zheng

Papers citing "FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec"

40 / 40 papers shown
Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding
Dianwen Ng
Kun Zhou
Yi-Wen Chao
Zhiwei Xiong
B. Ma
E. Chng
23
0
0
12 May 2025
Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis
Yifan Yang
S. Liu
J. Li
Yuxuan Hu
Haibin Wu
...
Haiyang Sun
Yanqing Liu
Yan Lu
Kai Yu
Xie Chen
23
0
0
14 Apr 2025
LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models
Beilong Tang
Bang Zeng
Ming Li
AI4TS
34
0
0
10 Apr 2025
One Quantizer is Enough: Toward a Lightweight Audio Codec
Linwei Zhai
H. Ding
Cui Zhao
Fei-Yue Wang
Ge Wang
Wang Zhi
Wei Xi
MQ
27
0
0
07 Apr 2025
UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook
Y. Jiang
Qian Chen
Shengpeng Ji
Yu Xi
Wen Wang
C. Zhang
Xianghu Yue
Shiliang Zhang
H. Li
52
0
0
27 Feb 2025
AudioMiXR: Spatial Audio Object Manipulation with 6DoF for Sound Design in Augmented Reality
Brandon Woodard
Margarita Geleta
Joseph J. LaViola Jr.
Andrea Fanelli
Rhonda Wilson
55
2
0
05 Feb 2025
ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling
Yi-Chiao Wu
Dejan Marković
Steven Krenn
I. D. Gebru
Alexander Richard
61
0
0
04 Feb 2025
Rate-Aware Learned Speech Compression
Jun Xu
Zhengxue Cheng
Guangchuan Chi
Yuhan Liu
Yuelin Hu
Li-Na Song
35
0
0
21 Jan 2025
Optimizing Neural Speech Codec for Low-Bitrate Compression via Multi-Scale Encoding
Peiji Yang
Fengping Wang
Yicheng Zhong
Huawei Wei
Zhisheng Wang
18
0
0
21 Oct 2024
GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning
Shrishti Saha Shetu
Emanuël A. P. Habets
Andreas Brendel
21
1
0
17 Oct 2024
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura
Takumi Hirose
Masanari Ohi
Hideki Nakayama
Nakamasa Inoue
VLM
29
1
0
06 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Alan Baade
Puyuan Peng
David F. Harwath
45
3
0
05 Oct 2024
Zero-Shot Text-to-Speech from Continuous Text Streams
Trung D. Q. Dang
David Aponte
Dung Tran
Tianyi Chen
K. Koishida
AuLLM
VLM
29
3
0
01 Oct 2024
Analyzing and Mitigating Inconsistency in Discrete Audio Tokens for Neural Codec Language Models
Wenrui Liu
Zhifang Guo
Jin Xu
Yuanjun Lv
Yunfei Chu
Zhou Zhao
Junyang Lin
41
1
0
28 Sep 2024
FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates
N. Pia
Martin Strauss
M. Multrus
B. Edler
23
0
0
26 Sep 2024
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models
Haibin Wu
Xuanjun Chen
Yi-Cheng Lin
Kaiwei Chang
Jiawei Du
...
Yi-Chiao Wu
Xu Tan
James Glass
Shinji Watanabe
Hung-yi Lee
24
6
0
21 Sep 2024
MuCodec: Ultra Low-Bitrate Music Codec
Yaoxun Xu
Hangting Chen
Jianwei Yu
Wei Tan
Rongzhi Gu
Shun Lei
Zhiwei Lin
Zhiyong Wu
30
1
0
20 Sep 2024
Preference Alignment Improves Language Model-Based TTS
Jinchuan Tian
Chunlei Zhang
Jiatong Shi
Hao Zhang
Jianwei Yu
Shinji Watanabe
Dong Yu
25
7
0
19 Sep 2024
SSDM: Scalable Speech Dysfluency Modeling
Jiachen Lian
Xuanru Zhou
Z. Ezzes
Jet M J Vonk
Brittany Morin
D. Baquirin
Zachary Miller
M. G. Tempini
Gopala Anumanchipalli
AuLLM
30
1
0
29 Aug 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji
Ziyue Jiang
Xize Cheng
Yifu Chen
Minghui Fang
...
Rongjie Huang
Yidi Jiang
Qian Chen
Zhou Zhao
VLM
52
32
0
29 Aug 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Keyu An
Qian Chen
Chong Deng
Zhihao Du
Changfeng Gao
...
Bin Zhang
Qinglin Zhang
Shiliang Zhang
Nan Zhao
Siqi Zheng
AuLLM
27
42
0
04 Jul 2024
DASB -- Discrete Audio and Speech Benchmark
Pooneh Mousavi
Luca Della Libera
J. Duret
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
35
12
0
20 Jun 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
Pooneh Mousavi
J. Duret
Salah Zaiem
Luca Della Libera
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
34
9
0
15 Jun 2024
Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio
Yi Lu
Yuankun Xie
Ruibo Fu
Zhengqi Wen
Jianhua Tao
...
Xuefei Liu
Yongwei Li
Yukun Liu
Xiaopeng Wang
Shuchen Shi
32
1
0
12 Jun 2024
CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems
Haibin Wu
Yuan Tseng
Hung-yi Lee
AuLLM
24
6
0
11 Jun 2024
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis
Zhijun Liu
Shuai Wang
Sho Inoue
Qibing Bai
Haizhou Li
DiffM
34
15
0
08 Jun 2024
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Sanyuan Chen
Shujie Liu
Long Zhou
Yanqing Liu
Xu Tan
Jinyu Li
Sheng Zhao
Yao Qian
Furu Wei
VLM
39
64
0
08 Jun 2024
Neural Codec-based Adversarial Sample Detection for Speaker Verification
Xuanjun Chen
Jiawei Du
Haibin Wu
Jyh-Shing Roger Jang
Hung-yi Lee
24
2
0
07 Jun 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
Shengpeng Ji
Jia-li Zuo
Minghui Fang
Siqi Zheng
Qian Chen
...
Ziyue Jiang
Hai Huang
Xize Cheng
Rongjie Huang
Zhou Zhao
45
8
0
03 Jun 2024
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio
Yuankun Xie
Yi Lu
Ruibo Fu
Zhengqi Wen
Zhiyong Wang
...
Xiaopeng Wang
Yukun Liu
Haonan Cheng
Long Ye
Yi Sun
38
15
0
08 May 2024
HILCodec: High Fidelity and Lightweight Neural Audio Codec
S. Ahn
Beom Jun Woo
Mingrui Han
Chanyeong Moon
Nam Soo Kim
19
6
0
08 May 2024
ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
Yuzhe Gu
Enmao Diao
21
4
0
30 Apr 2024
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Haizhou Li
Zhizheng Wu
28
2
0
26 Apr 2024
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge
Yiwei Guo
Chenrun Wang
Yifan Yang
Hankun Wang
Ziyang Ma
...
Hanzheng Li
Shuai Fan
Hui Zhang
Xie Chen
Kai Yu
28
1
0
09 Apr 2024
Towards audio language modeling -- an overview
Haibin Wu
Xuanjun Chen
Yi-Cheng Lin
Kai-Wei Chang
Ho-Lam Chung
Alexander H. Liu
Hung-yi Lee
AuLLM
30
28
0
20 Feb 2024
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models
Haibin Wu
Ho-Lam Chung
Yi-Cheng Lin
Yuan-Kuei Wu
Xuanjun Chen
Yu-Chi Pai
Hsiu-Hsuan Wang
Kai-Wei Chang
Alexander H. Liu
Hung-yi Lee
41
18
0
20 Feb 2024
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
Shengpeng Ji
Minghui Fang
Ziyue Jiang
Siqi Zheng
Qian Chen
Rongjie Huang
Jialung Zuo
Shulei Wang
Zhou Zhao
AuLLM
24
16
0
19 Feb 2024
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT
Zhihao Du
Jiaming Wang
Qian Chen
Yunfei Chu
Zhifu Gao
...
Wen Wang
Siqi Zheng
Chang Zhou
Zhijie Yan
Shiliang Zhang
LLMAG
VLM
AuLLM
LM&MA
31
79
0
07 Oct 2023
LMCodec: A Low Bitrate Speech Codec With Causal Transformer Models
Teerapat Jenrungrot
Michael Chinen
W. Kleijn
Jan Skoglund
Zalan Borsos
Neil Zeghidour
Marco Tagliasacchi
52
19
0
23 Mar 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
43
637
0
05 Jan 2023