Title
Synthesizing Speech from Intracranial Depth Electrodes using an Encoder-Decoder Framework Jonas Köhler Maarten C. Ottenhoff Sophocles Goulis Miguel Angrick A. Colon Louis Wagner S. Tousseyn P. Kubben Christian Herff 52 28 0 02 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses Shengyuan Xu Wenxiao Zhao Jing Guo 63 12 0 01 Nov 2021
VRAIN-UPV MLLP's system for the Blizzard Challenge 2021 A. P. D. Martos Albert Sanchis Alfons Juan-Císcar 114 6 0 29 Oct 2021
TorchAudio: Building Blocks for Audio and Speech Processing Yao-Yuan Yang Moto Hira Zhaoheng Ni Anjali Chourdia Artyom Astafurov ... Mehrzad Samadi Shinji Watanabe Soumith Chintala Vincent Quenneville-Bélair Yangyang Shi 106 170 0 28 Oct 2021
Assessing Evaluation Metrics for Speech-to-Speech Translation Elizabeth Salesky Julian Mäder Severin Klinger 74 15 0 26 Oct 2021
Beyond $L_p$ clipping: Equalization-based Psychoacoustic Attacks against ASRs H. Abdullah Muhammad Sajidur Rahman Christian Peeters Cassidy Gibson Washington Garcia Vincent Bindschaedler T. Shrimpton Patrick Traynor AAML 48 10 0 25 Oct 2021
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021 Yanqing Liu Rui Shao G. Wang Kuan Chen Bohan Li Pong C. Yuen Jinzhu Li Lei He Sheng Zhao 91 55 0 25 Oct 2021
Discrete Acoustic Space for an Efficient Sampling in Neural Text-To-Speech Mu Li Jonas Rohnke Antonio Bonafonte Mateusz Lajszczak Trevor Wood DRL 100 2 0 24 Oct 2021
Synt++: Utilizing Imperfect Synthetic Data to Improve Speech Recognition Ting-Yao Hu Mohammadreza Armandpour A. Shrivastava Jen-Hao Rick Chang H. Koppula Oncel Tuzel SyDa 87 42 0 21 Oct 2021
Speech Pattern based Black-box Model Watermarking for Automatic Speech Recognition Haozhe Chen Weiming Zhang Kunlin Liu Kejiang Chen Han Fang Nenghai Yu 37 4 0 19 Oct 2021
Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation Fengyu Yang Jian Luan Yujun Wang 137 5 0 19 Oct 2021
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge Mutian He Jingzhou Yang Lei He Frank Soong 86 1 0 19 Oct 2021
Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor Anchit Gupta Faizan Farooq Khan Rudrabha Mukhopadhyay Vinay P. Namboodiri C. V. Jawahar CVBM 78 6 0 16 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts Chenxu Hu Qiao Tian Tingle Li Yuping Wang Yuxuan Wang Hang Zhao DiffM VGen 99 43 0 15 Oct 2021
From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation Danni Liu Changhan Wang Hongyu Gong Xutai Ma Yun Tang J. Pino 98 4 0 15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research Tomoki Hayashi Ryuichi Yamamoto Takenori Yoshimura Peter Wu Jiatong Shi Takaaki Saeki Yooncheol Ju Yusuke Yasuda Shinnosuke Takamichi Shinji Watanabe VLM 85 63 0 15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation Rongjie Huang Chenye Cui Feiyang Chen Yi Ren Jinglin Liu Zhou Zhao Baoxing Huai N. Yuan GAN 203 63 0 14 Oct 2021
FedSpeech: Federated Text-to-Speech with Continual Learning Ziyue Jiang Yi Ren Ming Lei Zhou Zhao FedML 166 28 0 14 Oct 2021
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data Haitong Zhang Yue Lin 56 0 0 14 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing Junyi Ao Rui Wang Long Zhou Chengyi Wang Shuo Ren ... Yu Zhang Zhihua Wei Yao Qian Jinyu Li Furu Wei 168 203 0 14 Oct 2021
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech Haoyue Zhan Xinyuan Yu Haitong Zhang Yang Zhang Yue Lin 50 5 0 14 Oct 2021
Revisiting IPA-based Cross-lingual Text-to-speech Haitong Zhang Haoyue Zhan Yang Zhang Xinyuan Yu Yue Lin 61 7 0 14 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis Soonbeom Choi Juhan Nam 67 14 0 13 Oct 2021
Fine-grained style control in Transformer-based Text-to-speech Synthesis Li-Wei Chen Alexander I. Rudnicky 169 31 0 12 Oct 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations Wen-Chin Huang Shu-Wen Yang Tomoki Hayashi Hung-yi Lee Shinji Watanabe Tomoki Toda 71 40 0 12 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning Paarth Neekhara Jason Chun Lok Li Boris Ginsburg 144 15 0 12 Oct 2021
Complex Network-Based Approach for Feature Extraction and Classification of Musical Genres M. Pimenta-Zanon G. Bressan Fabricio M. Lopes 27 1 0 09 Oct 2021
PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS With Accurate Phoneme Duration Control Yunchao He Jian Luan Yujun Wang 112 1 0 09 Oct 2021
Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis Mu Yang Shaojin Ding Tianlong Chen Tong Wang Zhangyang Wang CLL 73 5 0 09 Oct 2021
Using multiple reference audios and style embedding constraints for speech synthesis Cheng Gong Longbiao Wang Zhenhua Ling Ju Zhang Jianwu Dang 48 5 0 09 Oct 2021
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech Pengfei Wu Junjie Pan Chenchang Xu Junhui Zhang Lin Wu Xiang Yin Zejun Ma 72 16 0 08 Oct 2021
KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms Chien-Feng Liao Jen-Yu Liu Yi-Hsuan Yang 66 5 0 08 Oct 2021
Environment Aware Text-to-Speech Synthesis Daxin Tan Guangyan Zhang Tan Lee 74 4 0 08 Oct 2021
A study on the efficacy of model pre-training in developing neural text-to-speech system Guangyan Zhang Yichong Leng Daxin Tan Ying Qin Kaitao Song Xu Tan Sheng Zhao Tan Lee 58 2 0 08 Oct 2021
Voice Reenactment with F0 and timing constraints and adversarial learning of conversions F. Bous L. Benaroya Nicolas Obin Axel Roebel 52 2 0 07 Oct 2021
Cloning one's voice using very limited data in the wild Dongyang Dai Yuan-Jui Chen Li Chen Ming Tu Lu Liu Rui Xia Qiao Tian Yuping Wang Yuxuan Wang SyDa 61 9 0 07 Oct 2021
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over Junchen Lu Berrak Sisman Rui Liu Mingyang Zhang Haizhou Li DiffM 91 20 0 07 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet Axel Roebel F. Bous 56 2 0 07 Oct 2021
Automated Testing of AI Models Swagatam Haldar Deepak Vijaykeerthy Diptikalyan Saha VLM 44 0 0 07 Oct 2021
Emphasis control for parallel neural TTS Shreyas Seshadri T. Raitio D. Castellani Jiangchuan Li 120 11 0 06 Oct 2021
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS T. Raitio Jiangchuan Li Shreyas Seshadri 78 23 0 06 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models Jen-Hao Rick Chang A. Shrivastava H. Koppula Xiaoshuai Zhang Oncel Tuzel DiffM 111 16 0 06 Oct 2021
An Investigation of the Effectiveness of Phase for Audio Classification Shunsuke Hidaka Kohei Wakamiya T. Kaburagi 28 4 0 06 Oct 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks E. Hortal Rodrigo Brechard Alarcia GAN 46 2 0 06 Oct 2021
Decoupling Speaker-Independent Emotions for Voice Conversion Via Source-Filter Networks Zhaojie Luo Shoufeng Lin Rui Liu Jun Baba Yuichiro Yoshikawa H. Ishiguro 47 9 0 04 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis Cheng-I Jeff Lai Erica Cooper Yang Zhang Shiyu Chang Kaizhi Qian ... Yung-Sung Chuang Alexander H. Liu Junichi Yamagishi David D. Cox James R. Glass 69 6 0 04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech Yi Ren Jinglin Liu Zhou Zhao 137 79 0 30 Sep 2021
VoiceFixer: Toward General Speech Restoration with Neural Vocoder Haohe Liu Qiuqiang Kong Qiao Tian Yan Zhao DeLiang Wang Chuanzeng Huang Yuxuan Wang 87 58 0 28 Sep 2021
Nana-HDR: A Non-attentive Non-autoregressive Hybrid Model for TTS Shilu Lin Wenchao Su Li Meng Fenglong Xie Xinhui Li Li Lu 131 4 0 28 Sep 2021
Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification Bidisha Sharma Maulik C. Madhavi Xuehao Zhou Haizhou Li 54 2 0 28 Sep 2021