
The VoiceMOS Challenge 2022

21 March 2022

Papers citing "The VoiceMOS Challenge 2022"

50 / 57 papers shown

Title
TTSOps: A Closed-Loop Corpus Optimization Framework for Training Multi-Speaker TTS Models from Dark Data Kentaro Seki Shinnosuke Takamichi Takaaki Saeki Hiroshi Saruwatari 31 0 0 18 Jun 2025
Improving Speech Enhancement with Multi-Metric Supervision from Learned Quality Assessment Wei Wang Wangyou Zhang Chenda Li Jiatong Shi Shinji Watanabe Yanmin Qian 5 0 0 13 Jun 2025
SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction Saurabh Agrawal Raj Gohil Gopal Kumar Agrawal Vikram C M Kushal Verma 23 0 0 02 Jun 2025
Uni-VERSA: Versatile Speech Assessment with a Unified Network Jiatong Shi Hye-jin Shim Shinji Watanabe 31 1 0 27 May 2025
SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit Wen-Chin Huang Erica Cooper Tomoki Toda 85 1 0 21 May 2025
SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features Yu-Fei Shi Yang Ai Ye-Xin Lu Hui-Peng Du Zhen-Hua Ling 76 1 0 18 Nov 2024
Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion Yu-Fei Shi Yang Ai Ye-Xin Lu Hui-Peng Du Zhen-Hua Ling 61 0 0 17 Nov 2024
MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models Wen-Chin Huang Erica Cooper Tomoki Toda 73 10 0 06 Nov 2024
SCOREQ: Speech Quality Assessment with Contrastive Regression Alessandro Ragano Jan Skoglund Andrew Hines 126 13 0 09 Oct 2024
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation Siyin Wang Wenyi Yu Yudong Yang Changli Tang Yixuan Li ... Jun Zhang Guangzhi Sun Lu Lu Yuxuan Wang Chao Zhang AuLLM LM&MA 133 8 0 25 Sep 2024
The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech Kaito Baba Wataru Nakata Yuki Saito Hiroshi Saruwatari VLM 104 17 0 14 Sep 2024
The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction Wen-Chin Huang Szu-Wei Fu Erica Cooper Ryandhimas E. Zezario Tomoki Toda Hsin-Min Wang Junichi Yamagishi Yu Tsao 81 12 0 11 Sep 2024
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Yuto Kondo DiffM 67 2 0 03 Sep 2024
TTSDS -- Text-to-Speech Distribution Score Christoph Minixhofer Ondřej Klejch Peter Bell 78 0 0 17 Jul 2024
GLOBE: A High-quality English Corpus with Global Accents for Zero-shot Speaker Adaptive Text-to-Speech Wenbin Wang Yang Song Sanjay Jha 104 10 0 21 Jun 2024
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction Yuxun Tang Jiatong Shi Yuning Wu Qin Jin 77 11 0 16 Jun 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units Xuankai Chang Jiatong Shi Jinchuan Tian Yuning Wu Yuxun Tang Yihan Wu Shinji Watanabe Yossi Adi Xie Chen Qin Jin 108 20 0 11 Jun 2024
USAT: A Universal Speaker-Adaptive Text-to-Speech Approach Wenbin Wang Yang Song Sanjay Jha 75 12 0 28 Apr 2024
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka 50 0 0 25 Mar 2024
A Collection of Pragmatic-Similarity Judgments over Spoken Dialog Utterances Nigel G. Ward Divette Marco 54 5 0 21 Mar 2024
Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment Yusuke Yasuda Tomoki Toda 61 1 0 10 Mar 2024
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech Szu-Wei Fu Kuo-Hsuan Hung Yu Tsao Yu-Chiang Frank Wang SSL 71 13 0 26 Feb 2024
PAM: Prompting Audio-Language Models for Audio Quality Assessment Soham Deshmukh Dareen Alharthi Benjamin Elizalde Hannes Gamper Mahmoud Al Ismail Rita Singh Bhiksha Raj Huaming Wang 96 13 0 01 Feb 2024
SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics Takaaki Saeki Soumi Maiti Shinnosuke Takamichi Shinji Watanabe Hiroshi Saruwatari 87 27 0 30 Jan 2024
Multi-objective Non-intrusive Hearing-aid Speech Assessment Model Hsin-Tien Chiang Szu-Wei Fu Hsin-Min Wang Yu Tsao John H. L. Hansen 66 4 0 15 Nov 2023
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition Nick Rossenbach Benedikt Hilmes Ralf Schlüter 54 3 0 12 Oct 2023
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting Hemant Yadav Erica Cooper Junichi Yamagishi Sunayana Sitaram R. Shah 66 0 0 08 Oct 2023
Diversity-based core-set selection for text-to-speech with linguistic and acoustic features Kentaro Seki Shinnosuke Takamichi Takaaki Saeki Hiroshi Saruwatari 71 3 0 15 Sep 2023
Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion Wen-Chin Huang Tomoki Toda CVBM 94 5 0 05 Sep 2023
RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting Haibo Wang Shiwan Zhao Xiguang Zheng Yong Qin 73 13 0 31 Aug 2023
Sparks of Large Audio Models: A Survey and Outlook S. Latif Moazzam Shoukat Fahad Shamshad Muhammad Usama Yi Ren ... Wenwu Wang Xulong Zhang Roberto Togneri Min Zhang Björn W. Schuller LM&MA AuLLM 183 39 0 24 Aug 2023
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model Ryandhimas E. Zezario B. Bai C. Fuh Hsin-Min Wang Yu Tsao 51 4 0 18 Aug 2023
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis Siyang Wang G. Henter Joakim Gustafson Éva Székely 73 6 0 11 Jul 2023
Disentanglement in a GAN for Unconditional Speech Synthesis Matthew Baas Herman Kamper DiffM 70 4 0 04 Jul 2023
Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection P. Do Matt Coler J. Dijkstra E. Klabbers 29 3 0 21 Jun 2023
The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech P. Do Matt Coler J. Dijkstra E. Klabbers OffRL 51 0 0 01 Jun 2023
Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages P. Do Matt Coler J. Dijkstra E. Klabbers 45 4 0 30 May 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module Ondřej Plátek Ondřej Dušek 56 2 0 17 Jan 2023
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation Simbarashe Nyatsanga Taras Kucherenko Chaitanya Ahuja G. Henter Michael Neff SLR 105 93 0 13 Jan 2023
SpeechLMScore: Evaluating speech generation using speech language model Soumi Maiti Yifan Peng Takaaki Saeki Shinji Watanabe ALM 75 32 0 08 Dec 2022
Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features Alexandra Vioni Georgia Maniati Nikolaos Ellinas June Sig Sung Inchul Hwang Aimilios Chalamandaris Pirros Tsiakoulis 95 5 0 01 Nov 2022
Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection Kentaro Seki Shinnosuke Takamichi Takaaki Saeki Hiroshi Saruwatari 95 7 0 26 Oct 2022
SQuId: Measuring Speech Naturalness in Many Languages Thibault Sellam Ankur Bapna Joshua Camp Diana Mackinnon Ankur P. Parikh Jason Riesa 74 18 0 12 Oct 2022
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models Matthew Baas Herman Kamper DiffM 86 8 0 11 Oct 2022
Predicting pairwise preferences between TTS audio stimuli using parallel ratings data and anti-symmetric twin neural networks Cassia Valentini-Botinhao M. Ribeiro O. Watts Korin Richmond G. Henter 32 1 0 22 Sep 2022
Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset Michael Chinen Jan Skoglund Chandan K. A. Reddy Alessandro Ragano Andrew Hines 25 9 0 14 Sep 2022
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS Kyle Kastner Aaron Courville 42 0 0 30 Jun 2022
Comparison of Speech Representations for the MOS Prediction System A. Kunikoshi Jaebok Kim Won-Suk Jun K. Sjölander 25 1 0 28 Jun 2022
Speech Quality Assessment through MOS using Non-Matching References Pranay Manocha Anurag Kumar 139 28 0 24 Jun 2022
Improving Self-Supervised Learning-based MOS Prediction Networks Bálint Gyires-Tóth Csaba Zainkó SSL 38 1 0 23 Apr 2022