The VoiceMOS Challenge 2022

21 March 2022
Wen-Chin Huang
Erica Cooper
Yu Tsao
Hsin-Min Wang
Tomoki Toda
Junichi Yamagishi

Papers citing "The VoiceMOS Challenge 2022"

50 / 57 papers shown
TTSOps: A Closed-Loop Corpus Optimization Framework for Training Multi-Speaker TTS Models from Dark Data
Kentaro Seki
Shinnosuke Takamichi
Takaaki Saeki
Hiroshi Saruwatari
31
0
0
18 Jun 2025
Improving Speech Enhancement with Multi-Metric Supervision from Learned Quality Assessment
Wei Wang
Wangyou Zhang
Chenda Li
Jiatong Shi
Shinji Watanabe
Yanmin Qian
5
0
0
13 Jun 2025
SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction
Saurabh Agrawal
Raj Gohil
Gopal Kumar Agrawal
Vikram C M
Kushal Verma
23
0
0
02 Jun 2025
Uni-VERSA: Versatile Speech Assessment with a Unified Network
Jiatong Shi
Hye-jin Shim
Shinji Watanabe
31
1
0
27 May 2025
SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit
Wen-Chin Huang
Erica Cooper
Tomoki Toda
85
1
0
21 May 2025
SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features
Yu-Fei Shi
Yang Ai
Ye-Xin Lu
Hui-Peng Du
Zhen-Hua Ling
76
1
0
18 Nov 2024
Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion
Yu-Fei Shi
Yang Ai
Ye-Xin Lu
Hui-Peng Du
Zhen-Hua Ling
61
0
0
17 Nov 2024
MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models
Wen-Chin Huang
Erica Cooper
Tomoki Toda
73
10
0
06 Nov 2024
SCOREQ: Speech Quality Assessment with Contrastive Regression
Alessandro Ragano
Jan Skoglund
Andrew Hines
126
13
0
09 Oct 2024
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation
Siyin Wang
Wenyi Yu
Yudong Yang
Changli Tang
Yixuan Li
...
Jun Zhang
Guangzhi Sun
Lu Lu
Yuxuan Wang
Chao Zhang
AuLLM, LM&MA
133
8
0
25 Sep 2024
The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech
Kaito Baba
Wataru Nakata
Yuki Saito
Hiroshi Saruwatari
VLM
104
17
0
14 Sep 2024
The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
Wen-Chin Huang
Szu-Wei Fu
Erica Cooper
Ryandhimas E. Zezario
Tomoki Toda
Hsin-Min Wang
Junichi Yamagishi
Yu Tsao
81
12
0
11 Sep 2024
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Yuto Kondo
DiffM
67
2
0
03 Sep 2024
TTSDS -- Text-to-Speech Distribution Score
Christoph Minixhofer
Ondřej Klejch
Peter Bell
78
0
0
17 Jul 2024
GLOBE: A High-quality English Corpus with Global Accents for Zero-shot Speaker Adaptive Text-to-Speech
Wenbin Wang
Yang Song
Sanjay Jha
104
10
0
21 Jun 2024
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
Yuxun Tang
Jiatong Shi
Yuning Wu
Qin Jin
77
11
0
16 Jun 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang
Jiatong Shi
Jinchuan Tian
Yuning Wu
Yuxun Tang
Yihan Wu
Shinji Watanabe
Yossi Adi
Xie Chen
Qin Jin
108
20
0
11 Jun 2024
USAT: A Universal Speaker-Adaptive Text-to-Speech Approach
Wenbin Wang
Yang Song
Sanjay Jha
75
12
0
28 Apr 2024
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
50
0
0
25 Mar 2024
A Collection of Pragmatic-Similarity Judgments over Spoken Dialog Utterances
Nigel G. Ward
Divette Marco
54
5
0
21 Mar 2024
Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment
Yusuke Yasuda
Tomoki Toda
61
1
0
10 Mar 2024
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
Szu-Wei Fu
Kuo-Hsuan Hung
Yu Tsao
Yu-Chiang Frank Wang
SSL
71
13
0
26 Feb 2024
PAM: Prompting Audio-Language Models for Audio Quality Assessment
Soham Deshmukh
Dareen Alharthi
Benjamin Elizalde
Hannes Gamper
Mahmoud Al Ismail
Rita Singh
Bhiksha Raj
Huaming Wang
96
13
0
01 Feb 2024
SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics
Takaaki Saeki
Soumi Maiti
Shinnosuke Takamichi
Shinji Watanabe
Hiroshi Saruwatari
87
27
0
30 Jan 2024
Multi-objective Non-intrusive Hearing-aid Speech Assessment Model
Hsin-Tien Chiang
Szu-Wei Fu
Hsin-Min Wang
Yu Tsao
John H. L. Hansen
66
4
0
15 Nov 2023
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Nick Rossenbach
Benedikt Hilmes
Ralf Schluter
54
3
0
12 Oct 2023
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting
Hemant Yadav
Erica Cooper
Junichi Yamagishi
Sunayana Sitaram
R. Shah
66
0
0
08 Oct 2023
Diversity-based core-set selection for text-to-speech with linguistic and acoustic features
Kentaro Seki
Shinnosuke Takamichi
Takaaki Saeki
Hiroshi Saruwatari
71
3
0
15 Sep 2023
Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion
Wen-Chin Huang
Tomoki Toda
CVBM
94
5
0
05 Sep 2023
RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting
Haibo Wang
Shiwan Zhao
Xiguang Zheng
Yong Qin
73
13
0
31 Aug 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Min Zhang
Björn W. Schuller
LM&MA, AuLLM
183
39
0
24 Aug 2023
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
Ryandhimas E. Zezario
B. Bai
C. Fuh
Hsin-Min Wang
Yu Tsao
51
4
0
18 Aug 2023
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis
Siyang Wang
G. Henter
Joakim Gustafson
Éva Székely
73
6
0
11 Jul 2023
Disentanglement in a GAN for Unconditional Speech Synthesis
Matthew Baas
Herman Kamper
DiffM
70
4
0
04 Jul 2023
Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection
P. Do
Matt Coler
J. Dijkstra
E. Klabbers
29
3
0
21 Jun 2023
The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech
P. Do
Matt Coler
J. Dijkstra
E. Klabbers
OffRL
51
0
0
01 Jun 2023
Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages
P. Do
Matt Coler
J. Dijkstra
E. Klabbers
45
4
0
30 May 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Ondřej Plátek
Ondrej Dusek
56
2
0
17 Jan 2023
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Simbarashe Nyatsanga
Taras Kucherenko
Chaitanya Ahuja
G. Henter
Michael Neff
SLR
105
93
0
13 Jan 2023
SpeechLMScore: Evaluating speech generation using speech language model
Soumi Maiti
Yifan Peng
Takaaki Saeki
Shinji Watanabe
ALM
75
32
0
08 Dec 2022
Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features
Alexandra Vioni
Georgia Maniati
Nikolaos Ellinas
June Sig Sung
Inchul Hwang
Aimilios Chalamandaris
Pirros Tsiakoulis
95
5
0
01 Nov 2022
Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection
Kentaro Seki
Shinnosuke Takamichi
Takaaki Saeki
Hiroshi Saruwatari
95
7
0
26 Oct 2022
SQuId: Measuring Speech Naturalness in Many Languages
Thibault Sellam
Ankur Bapna
Joshua Camp
Diana Mackinnon
Ankur P. Parikh
Jason Riesa
74
18
0
12 Oct 2022
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models
Matthew Baas
Herman Kamper
DiffM
86
8
0
11 Oct 2022
Predicting pairwise preferences between TTS audio stimuli using parallel ratings data and anti-symmetric twin neural networks
Cassia Valentini-Botinhao
M. Ribeiro
O. Watts
Korin Richmond
G. Henter
32
1
0
22 Sep 2022
Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset
Michael Chinen
Jan Skoglund
Chandan K. A. Reddy
Alessandro Ragano
Andrew Hines
25
9
0
14 Sep 2022
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
Kyle Kastner
Aaron Courville
42
0
0
30 Jun 2022
Comparison of Speech Representations for the MOS Prediction System
A. Kunikoshi
Jaebok Kim
Won-Suk Jun
K. Sjölander
25
1
0
28 Jun 2022
Speech Quality Assessment through MOS using Non-Matching References
Pranay Manocha
Anurag Kumar
139
28
0
24 Jun 2022
Improving Self-Supervised Learning-based MOS Prediction Networks
Bálint Gyires-Tóth
Csaba Zainkó
SSL
38
1
0
23 Apr 2022