Almost Unsupervised Text to Speech and Automatic Speech Recognition

Almost Unsupervised Text to Speech and Automatic Speech Recognition

13 May 2019

Xu Tan

Zhou Zhao

Papers citing "Almost Unsupervised Text to Speech and Automatic Speech Recognition"

16 / 16 papers shown

Title
A Non-autoregressive Model for Joint STT and TTS Vishal Sunder Brian Kingsbury G. Saon Samuel Thomas Slava Shechtman Hagai Aronowitz Hagai Aronowitz Eric Fosler-Lussier Luis A. Lastras 61 0 0 15 Jan 2025
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data Takaaki Saeki Gary Wang Nobuyuki Morioka Isaac Elias Kyle Kastner ... Andrew Rosenberg Bhuvana Ramabhadran Heiga Zen Francoise Beaufays Hadar Shemtov 38 13 0 29 Feb 2024
A Deliberation-based Joint Acoustic and Text Decoder S. Mavandadi Tara N. Sainath Ke Hu Zelin Wu 18 7 0 23 Mar 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining Takaaki Saeki Soumi Maiti Xinjian Li Shinji Watanabe Shinnosuke Takamichi Hiroshi Saruwatari 32 17 0 30 Jan 2023
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline Yifan Hu Pengkai Yin Rui Liu F. Bao Guanglai Gao 11 5 0 22 Sep 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition Ye Du Jie M. Zhang Qiu-shi Zhu Lirong Dai Ming Wu Xin Fang Zhouwang Yang 24 2 0 05 Apr 2022
The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at IWSLT 2021 Dan Liu Mengge Du Xiaoxi Li Yuchen Hu Lirong Dai 16 20 0 01 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 352 0 29 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model Chenye Cui Yi Ren Jinglin Liu Feiyang Chen Rongjie Huang Ming Lei Zhou Zhao 21 34 0 17 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 18 24 0 20 Apr 2021
Replacing Human Audio with Synthetic Audio for On-device Unspoken Punctuation Prediction Daria Soboleva Ondrej Skopek Márius vSajgalík Victor Cuarbune Felix Weissenberger ... B. Prisacari Daniel Valcarce Justin Lu Rohit Prabhavalkar Balint Miklos 26 9 0 20 Oct 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition Jin Xu Xu Tan Yi Ren Tao Qin Jian Li Sheng Zhao Tie-Yan Liu VLM 16 90 0 09 Aug 2020
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning Alexander H. Liu Tao Tu Hung-yi Lee Lin-Shan Lee SSL 19 50 0 28 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit Tomoki Hayashi Ryuichi Yamamoto Katsuki Inoue Takenori Yoshimura Shinji Watanabe T. Toda K. Takeda Yu Zhang Xu Tan VLM 16 201 0 24 Oct 2019
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Z. Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 207 819 0 12 Jun 2018
Listening while Speaking: Speech Chain by Deep Learning Andros Tjandra S. Sakti Satoshi Nakamura AuLLM 118 165 0 16 Jul 2017