JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis

28 October 2017

Hiroshi Saruwatari

Papers citing "JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis"

24 / 24 papers shown

Title
Less is More for Synthetic Speech Detection in the Wild Ashi Garg Zexin Cai Henry Li Xinyuan Leibny Paola García-Perera Kevin Duh Sanjeev Khudanpur Matthew Wiesner Nicholas Andrews 74 0 0 17 Feb 2025
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation Ji-Hoon Kim Hong-Sun Yang Yoon-Cheol Ju Il-Hwan Kim Byeong-Yeol Kim Joon Son Chung BDL 54 0 0 31 Dec 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection Lam Pham Phat Lam Dat Tran Hieu Tang Tin Nguyen Alexander Schindler Canh Vu Alexander Polonsky Canh Vu 56 3 0 23 Sep 2024
Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing Hye-jin Shim Jee-weon Jung Tomi Kinnunen 21 13 0 31 May 2023
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis Seong-Hyun Park Bohyung Kim Tae-Hyun Oh 37 1 0 26 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks Yifan Peng Kwangyoun Kim Felix Wu Brian Yan Siddhant Arora William Chen Jiyang Tang Suwon Shon Prashant Sridhar Shinji Watanabe 27 17 0 18 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models Takanori Ashihara Takafumi Moriya Kohei Matsuura Tomohiro Tanaka 25 3 0 09 May 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model Rui Xue Yanqing Liu Lei He Xuejiao Tan Linquan Liu Ed Lin Sheng Zhao 34 7 0 06 Mar 2023
Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining Karol Nowakowski M. Ptaszynski Kyoko Murasaki Jagna Nieuwazny 20 23 0 18 Jan 2023
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language Yusuke Yasuda T. Toda 33 8 0 16 Dec 2022
Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection Kentaro Seki Shinnosuke Takamichi Takaaki Saeki Hiroshi Saruwatari 23 6 0 26 Oct 2022
Two-stage training method for Japanese electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion D. Ma Lester Phillip Violeta Kazuhiro Kobayashi T. Toda 21 6 0 19 Oct 2022
SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection Piotr Kawa Marcin Plata P. Syga 34 14 0 12 Oct 2022
Controllable Data Generation by Deep Learning: A Review Shiyu Wang Yuanqi Du Xiaojie Guo Bo Pan Zhaohui Qin Liang Zhao 31 28 0 19 Jul 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection Piotr Kawa Marcin Plata P. Syga AAML 41 23 0 27 Jun 2022
vTTS: visual-text to speech Yoshifumi Nakano Takaaki Saeki Shinnosuke Takamichi Katsuhito Sudoh Hiroshi Saruwatari 13 4 0 28 Mar 2022
WaveFake: A Data Set to Facilitate Audio Deepfake Detection Joel Frank Lea Schonherr DiffM 129 123 0 04 Nov 2021
ESPnet2-TTS: Extending the Edge of TTS Research Tomoki Hayashi Ryuichi Yamamoto Takenori Yoshimura Peter Wu Jiatong Shi Takaaki Saeki Yooncheol Ju Yusuke Yasuda Shinnosuke Takamichi Shinji Watanabe VLM 50 60 0 15 Oct 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 352 0 29 Jun 2021
DiscreTalk: Text-to-Speech as a Machine Translation Problem Tomoki Hayashi Shinji Watanabe 24 32 0 12 May 2020
Phase reconstruction based on recurrent phase unwrapping with deep neural networks Yoshiki Masuyama Kohei Yatabe Yuma Koizumi Yasuhiro Oikawa N. Harada 22 21 0 14 Feb 2020
A Dataset for measuring reading levels in India at scale Dolly Agarwal J. Gupchup Nishant Baghel 16 1 0 27 Nov 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit Tomoki Hayashi Ryuichi Yamamoto Katsuki Inoue Takenori Yoshimura Shinji Watanabe T. Toda K. Takeda Yu Zhang Xu Tan VLM 29 201 0 24 Oct 2019
A Comparative Study on Transformer vs RNN in Speech Applications Shigeki Karita Nanxin Chen Tomoki Hayashi Takaaki Hori Hirofumi Inaguma ... Ryuichi Yamamoto Xiao-fei Wang Shinji Watanabe Takenori Yoshimura Wangyou Zhang 25 716 0 13 Sep 2019