Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1711.00354
Cited By
JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis
28 October 2017
Ryosuke Sonobe
Shinnosuke Takamichi
Hiroshi Saruwatari
3DV
Re-assign community
ArXiv
PDF
HTML
Papers citing
"JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis"
24 / 24 papers shown
Title
Less is More for Synthetic Speech Detection in the Wild
Ashi Garg
Zexin Cai
Henry Li Xinyuan
Leibny Paola García-Perera
Kevin Duh
Sanjeev Khudanpur
Matthew Wiesner
Nicholas Andrews
74
0
0
17 Feb 2025
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation
Ji-Hoon Kim
Hong-Sun Yang
Yoon-Cheol Ju
Il-Hwan Kim
Byeong-Yeol Kim
Joon Son Chung
BDL
54
0
0
31 Dec 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
Canh Vu
53
3
0
23 Sep 2024
Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing
Hye-jin Shim
Jee-weon Jung
Tomi Kinnunen
21
13
0
31 May 2023
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis
Seong-Hyun Park
Bohyung Kim
Tae-Hyun Oh
37
1
0
26 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
27
17
0
18 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
25
3
0
09 May 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
34
7
0
06 Mar 2023
Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining
Karol Nowakowski
M. Ptaszynski
Kyoko Murasaki
Jagna Nieuwazny
17
23
0
18 Jan 2023
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language
Yusuke Yasuda
T. Toda
30
8
0
16 Dec 2022
Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection
Kentaro Seki
Shinnosuke Takamichi
Takaaki Saeki
Hiroshi Saruwatari
23
6
0
26 Oct 2022
Two-stage training method for Japanese electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion
D. Ma
Lester Phillip Violeta
Kazuhiro Kobayashi
T. Toda
21
6
0
19 Oct 2022
SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
31
14
0
12 Oct 2022
Controllable Data Generation by Deep Learning: A Review
Shiyu Wang
Yuanqi Du
Xiaojie Guo
Bo Pan
Zhaohui Qin
Liang Zhao
31
28
0
19 Jul 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
41
23
0
27 Jun 2022
vTTS: visual-text to speech
Yoshifumi Nakano
Takaaki Saeki
Shinnosuke Takamichi
Katsuhito Sudoh
Hiroshi Saruwatari
13
4
0
28 Mar 2022
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schonherr
DiffM
129
123
0
04 Nov 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
50
60
0
15 Oct 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
21
32
0
12 May 2020
Phase reconstruction based on recurrent phase unwrapping with deep neural networks
Yoshiki Masuyama
Kohei Yatabe
Yuma Koizumi
Yasuhiro Oikawa
N. Harada
22
21
0
14 Feb 2020
A Dataset for measuring reading levels in India at scale
Dolly Agarwal
J. Gupchup
Nishant Baghel
14
1
0
27 Nov 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
T. Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
26
201
0
24 Oct 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
H. Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
23
716
0
13 Sep 2019
1