
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion

4 July 2021

Xin Jiang

Papers citing "EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion"

31 / 31 papers shown

Speak, Edit, Repeat: High-Fidelity Voice Editing and Zero-Shot TTS with Cross-Attentive Mamba

Baher Mohammad

Magauiya Zhussip

Stamatios Lefkimmiatis

Mamba

148

06 Oct 2025

Instance-Specific Test-Time Training for Speech Editing in the Wild

202

16 Jun 2025

PartialEdit: Identifying Partial Deepfakes in the Era of Neural Speech Editing

139

03 Jun 2025

SeamlessEdit: Background Noise Aware Zero-Shot Speech Editing with in-Context Enhancement

Kuan-Yu Chen

Jeng-Lin Li

Jian-Jiun Ding

300

20 May 2025

SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors

211

20 Mar 2025

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis

376

03 Jan 2025

DiffEditor: Enhancing Speech Editing with Semantic Enrichment and Acoustic Consistency

Yang Chen

Yuhang Jia

Shiwan Zhao

Ziyue Jiang

Haoran Li

Jiarong Kang

Yong Qin

141

19 Sep 2024

SongCreator: Lyrics-based Universal Song Generation

Neural Information Processing Systems (NeurIPS), 2024

Shun Lei

Zhiyong Wu

Helen Meng

285

09 Sep 2024

Automatic Voice Identification after Speech Resynthesis using PPG

The Speaker and Language Recognition Workshop (Odyssey), 2024

186

05 Aug 2024

Speech Editing -- a Summary

Tobias Kässmann

Yining Liu

Danni Liu

149

24 Jul 2024

Autoregressive Diffusion Transformer for Text-to-Speech Synthesis

Zhijun Liu

Haizhou Li

184

08 Jun 2024

VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild

Puyuan Peng

Po-Yao (Bernie) Huang

Daniel Li

Abdelrahman Mohamed

David Harwath

439

146

25 Mar 2024

AttentionStitch: How Attention Solves the Speech Editing Problem

Antonios Alexos

Pierre Baldi

206

05 Mar 2024

Fine-Grained Quantitative Emotion Editing for Speech Generation

Sho Inoue

Kun Zhou

Shuai Wang

Haizhou Li

214

04 Mar 2024

uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Bhiksha Raj

Dong Yu

DiffM

159

02 Oct 2023

FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency

Interspeech, 2023

369

21 Sep 2023

Cross-Utterance Conditioned VAE for Speech Generation

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Guangzhi Sun

...

Wei Pan

192

08 Sep 2023

Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation

Chenpeng Du

Xie Chen

140

14 Jun 2023

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

...

Rongjie Huang

Chunfeng Wang

Xiang Yin

Zejun Ma

Zhou Zhao

DiffM

256

06 Jun 2023

FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Rongjie Huang

Zhou Zhao

153

23 May 2023

DiffVoice: Text-to-Speech with Latent Diffusion

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Zhijun Liu

Yiwei Guo

K. Yu

DiffM

171

23 Apr 2023

Emotion Selectable End-to-End Text-based Speech Editing

Artificial Intelligence (AI), 2022

Tao Wang

Jiangyan Yi

169

20 Dec 2022

MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy

Interspeech, 2022

140

11 Nov 2022

ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech

...

238

07 Nov 2022

Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders

142

28 Oct 2022

The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Xin Wang

304

11 Apr 2022

A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

International Conference on Machine Learning (ICML), 2022

249

18 Mar 2022

CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Tao Wang

Jiangyan Yi

123

21 Feb 2022

SpeechPainter: Text-conditioned Speech Inpainting

Interspeech, 2022

Zalan Borsos

Matthew Sharifi

Marco Tagliasacchi

174

15 Feb 2022

Environment Aware Text-to-Speech Synthesis

Interspeech, 2021

Daxin Tan

Guangyan Zhang

Tan Lee

209

08 Oct 2021

EdiTTS: Score-based Editing for Controllable Text-to-Speech

Jaesung Tae

Hyeongju Kim

Taesu Kim

DiffM

395

06 Oct 2021