1904.08352
MOSNet: Deep Learning based Objective Assessment for Voice Conversion
17 April 2019
Chen-Chou Lo
Szu-Wei Fu
Wen-Chin Huang
Xin Wang
Junichi Yamagishi
Yu Tsao
H. Wang
Papers citing
"MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
50 / 140 papers shown
Title
BWSNet: Automatic Perceptual Assessment of Audio Signals
Clément Le Moine Veillon
Victor Rosi
Pablo Arias Sarah
Léane Salais
Nicolas Obin
41
0
0
05 Sep 2023
Timbre-reserved Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Li Zhang
Pengcheng Guo
Linfu Xie
AAML
79
4
0
02 Sep 2023
RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting
Haibo Wang
Shiwan Zhao
Xiguang Zheng
Yong Qin
77
13
0
31 Aug 2023
The Effect of Spoken Language on Speech Enhancement using Self-Supervised Speech Representation Loss Functions
George Close
Thomas Hain
Stefan Goetze
67
8
0
27 Jul 2023
Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation
Zhonghua Liu
Shijun Wang
Ning Chen
DRL
65
2
0
21 Jun 2023
Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection
P. Do
Matt Coler
J. Dijkstra
E. Klabbers
37
3
0
21 Jun 2023
MOSPC: MOS Prediction Based on Pairwise Comparison
Kexin Wang
Yunlong Zhao
Qianqian Dong
Tom Ko
Mingxuan Wang
56
6
0
18 Jun 2023
Evaluation of Speech Representations for MOS prediction
F. S. Oliveira
Edresson Casanova
Arnaldo Cândido Júnior
L. Gris
A. S. Soares
A. R. G. Filho
56
4
0
16 Jun 2023
Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation
Massa Baali
Ibrahim Almakky
Shady Shehata
Fakhri Karray
69
3
0
07 Jun 2023
ALO-VC: Any-to-any Low-latency One-shot Voice Conversion
Bo Wang
Damien Ronssin
Milos Cernak
BDL
84
3
0
01 Jun 2023
Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages
P. Do
Matt Coler
J. Dijkstra
E. Klabbers
45
4
0
30 May 2023
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Ziqian Wang
Pengcheng Guo
Linfu Xie
AAML
64
1
0
30 May 2023
Translatotron 3: Speech to Speech Translation with Monolingual Data
Eliya Nachmani
Alon Levkovitch
Yi-Yang Ding
Chulayutsh Asawaroengchai
Heiga Zen
Michelle Tadmor Ramanovich
91
15
0
27 May 2023
A multimodal dynamical variational autoencoder for audiovisual speech representation learning
Samir Sadok
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
Renaud Séguier
102
13
0
05 May 2023
An investigation into the adaptability of a diffusion-based TTS model
Haolin Chen
Philip N. Garner
DiffM
66
1
0
03 Mar 2023
Visual Realism Assessment for Face-swap Videos
Xianyun Sun
Bei Dong
Caiyong Wang
Bo Peng
Jing Dong
CVBM
39
3
0
02 Feb 2023
Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study
Massa Baali
Tomoki Hayashi
Hamdy Mubarak
Soumi Maiti
Shinji Watanabe
W. El-Hajj
Ahmed M. Ali
47
11
0
22 Jan 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Ondřej Plátek
Ondrej Dusek
56
2
0
17 Jan 2023
SpeechLMScore: Evaluating speech generation using speech language model
Soumi Maiti
Yifan Peng
Takaaki Saeki
Shinji Watanabe
ALM
78
32
0
08 Dec 2022
On the robustness of non-intrusive speech quality model by adversarial examples
Hsin-Yi Lin
Huan-Hsin Tseng
Yu Tsao
AAML
59
3
0
11 Nov 2022
CCATMos: Convolutional Context-aware Transformer Network for Non-intrusive Speech Quality Assessment
Yuchen Liu
Li-Chia Yang
Alex Pawlicki
Marko Stamenovic
48
5
0
04 Nov 2022
Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features
Alexandra Vioni
Georgia Maniati
Nikolaos Ellinas
June Sig Sung
Inchul Hwang
Aimilios Chalamandaris
Pirros Tsiakoulis
95
5
0
01 Nov 2022
RedPen: Region- and Reason-Annotated Dataset of Unnatural Speech
Kyumin Park
Keon Lee
Daeyoung Kim
Dongyeop Kang
52
0
0
26 Oct 2022
Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar
Aolan Sun
Xulong Zhang
Tiandong Ling
Jianzong Wang
Ning Cheng
Jing Xiao
52
4
0
13 Oct 2022
Can we use Common Voice to train a Multi-Speaker TTS system?
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
83
10
0
12 Oct 2022
SQuId: Measuring Speech Naturalness in Many Languages
Thibault Sellam
Ankur Bapna
Joshua Camp
Diana Mackinnon
Ankur P. Parikh
Jason Riesa
83
18
0
12 Oct 2022
ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild
Xuechen Liu
Xin Wang
Md. Sahidullah
J. Patino
Héctor Delgado
...
Massimiliano Todisco
Junichi Yamagishi
Nicholas W. D. Evans
A. Nautsch
Kong Aik Lee
116
194
0
05 Oct 2022
Predicting pairwise preferences between TTS audio stimuli using parallel ratings data and anti-symmetric twin neural networks
Cassia Valentini-Botinhao
M. Ribeiro
O. Watts
Korin Richmond
G. Henter
32
2
0
22 Sep 2022
Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset
Michael Chinen
Jan Skoglund
Chandan K. A. Reddy
Alessandro Ragano
Andrew Hines
32
9
0
14 Sep 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review
Enes Altuncu
V. N. Franqueira
Shujun Li
145
13
0
21 Aug 2022
Enhancing Audio Perception of Music By AI Picked Room Acoustics
Prateek Verma
J. Berger
52
0
0
16 Aug 2022
Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion
Jianchun Ma
Zhedong Zheng
Hao Fei
Feng Zheng
Tat-Seng Chua
Yi Yang
GAN
61
0
0
13 Jul 2022
Automatic Evaluation of Speaker Similarity
Kamil Deja
Ariadna Sánchez
Julian Roth
Marius Cotescu
48
6
0
01 Jul 2022
Comparison of Speech Representations for the MOS Prediction System
A. Kunikoshi
Jaebok Kim
Won-Suk Jun
K. Sjölander
30
1
0
28 Jun 2022
Audio Similarity is Unreliable as a Proxy for Audio Quality
Pranay Manocha
Zeyu Jin
Adam Finkelstein
62
10
0
27 Jun 2022
Wideband Audio Waveform Evaluation Networks: Efficient, Accurate Estimation of Speech Qualities
Andrew A. Catellier
S. Voran
58
3
0
27 Jun 2022
Speech Quality Assessment through MOS using Non-Matching References
Pranay Manocha
Anurag Kumar
139
28
0
24 Jun 2022
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning
Rui Liu
Berrak Sisman
Björn Schuller
Guanglai Gao
Haizhou Li
92
11
0
15 Jun 2022
Few-Shot Audio-Visual Learning of Environment Acoustics
Sagnik Majumder
Changan Chen
Ziad Al-Halah
Kristen Grauman
86
55
0
08 Jun 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
141
221
0
09 May 2022
Improving Self-Supervised Learning-based MOS Prediction Networks
Bálint Gyires-Tóth
Csaba Zainkó
SSL
38
1
0
23 Apr 2022
LibriS2S: A German-English Speech-to-Speech Translation Corpus
Pedro Jeuris
Jan Niehues
AuLLM
29
3
0
22 Apr 2022
Fusion of Self-supervised Learned Models for MOS Prediction
Zhengdong Yang
Wangjin Zhou
Chenhui Chu
Sheng Li
Raj Dabre
Raphaël Rubino
Yi Zhao
63
29
0
11 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
56
0
0
08 Apr 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion
Weida Liang
Lantian Li
Wenqiang Du
Dong Wang
122
0
0
08 Apr 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
73
21
0
07 Apr 2022
SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis
Georgia Maniati
Alexandra Vioni
Nikolaos Ellinas
Karolos Nikitaras
Konstantinos Klapsas
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
75
28
0
06 Apr 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022
Takaaki Saeki
Detai Xin
Wataru Nakata
Tomoki Koriyama
Shinnosuke Takamichi
Hiroshi Saruwatari
133
217
0
05 Apr 2022
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE
Ziang Long
Yunling Zheng
Meng Yu
Jack Xin
DRL
63
5
0
30 Mar 2022
ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications
Gaoxiong Yi
Wei Xiao
Yiming Xiao
Babak Naderi
Sebastian Möller
...
Z. Zhang
Donald Williamson
Fei Chen
Fuzheng Yang
Shidong Shang
90
49
0
30 Mar 2022