
arXiv:2006.10296 (v2)

Boosting Objective Scores of Speech Enhancement Model through MetricGAN Post-Processing

18 June 2020
Szu-Wei Fu
Chien-Feng Liao
Tsun-An Hsieh
Kuo-Hsuan Hung
Syu-Siang Wang
Cheng Yu
Heng-Cheng Kuo
Ryandhimas E. Zezario
You-Jin Li
Shang-Yi Chuang
Yen-Ju Lu
Yu Tsao
Abstract

The Transformer architecture has outperformed recurrent neural networks in many natural language processing applications. This study therefore applies a modified Transformer to the speech enhancement task. Specifically, because positional encoding may not be necessary for this task, it is replaced by convolutional layers. To further improve the PESQ scores of the enhanced speech, the Transformer pre-trained with an L_1 loss is fine-tuned using the MetricGAN framework. The proposed MetricGAN can be treated as a general post-processing module that further boosts the objective scores of interest. The experiments are conducted on the data sets provided by the organizer of the Deep Noise Suppression (DNS) challenge. Experimental results demonstrate that the proposed system outperforms the challenge baseline by a large margin in both subjective and objective evaluations.
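
The abstract does not spell out the fine-tuning objective, so the following is only a minimal sketch of the least-squares losses commonly associated with the MetricGAN framework, not the authors' implementation. It assumes the discriminator D is trained to predict a metric score normalized to [0, 1], and that the generator (here, the enhancement model) is pushed toward a target score of 1. The function names, the PESQ range of -0.5 to 4.5, and the plain-Python list inputs are all assumptions made for illustration.

```python
from typing import Sequence

# Nominal PESQ range; assumed here for normalization purposes.
PESQ_MIN, PESQ_MAX = -0.5, 4.5


def normalize_pesq(pesq: float) -> float:
    """Map a raw PESQ score to [0, 1] so it can act as a regression target."""
    clipped = min(max(pesq, PESQ_MIN), PESQ_MAX)
    return (clipped - PESQ_MIN) / (PESQ_MAX - PESQ_MIN)


def discriminator_loss(
    d_clean: Sequence[float],
    d_enhanced: Sequence[float],
    q_enhanced: Sequence[float],
) -> float:
    """Mean squared error that teaches D to imitate the metric.

    d_clean:    D's outputs for (clean, clean) pairs; target is 1.
    d_enhanced: D's outputs for (enhanced, clean) pairs.
    q_enhanced: normalized true metric scores of the enhanced speech.
    """
    total = sum(
        (dc - 1.0) ** 2 + (de - q) ** 2
        for dc, de, q in zip(d_clean, d_enhanced, q_enhanced)
    )
    return total / len(d_clean)


def generator_loss(d_enhanced: Sequence[float], target: float = 1.0) -> float:
    """Mean squared error that pushes D's predicted score toward the target."""
    return sum((de - target) ** 2 for de in d_enhanced) / len(d_enhanced)


if __name__ == "__main__":
    # Hypothetical discriminator outputs for a batch of two utterances.
    q = [normalize_pesq(2.0), normalize_pesq(3.0)]
    print(discriminator_loss([0.9, 1.0], [0.4, 0.6], q))
    print(generator_loss([0.4, 0.6]))
```

In this sketch the pre-trained enhancement model would play the role of the generator, which is what lets the adversarial stage serve as a post-processing step on top of an already trained network. In practice the discriminator outputs would come from a neural network and the losses would be computed on tensors with automatic differentiation.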
