How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR

18 January 2022

Papers citing "How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR"

Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications Marcus Yu Zhe Wee Justin Juin Hng Wong Lynus Lim Joe Yu Wei Tan Prannaya Gupta Dillion Lim En Hao Tew Aloysius Keng Siew Han Yong Zhi Lim 57 0 0 27 Feb 2025
CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR Nian Shao Rui Zhou Pengyu Wang Xian Li Ying Fang Yujie Yang Xiaofei Li 34 0 0 27 Feb 2025
Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module Zhongjian Cui Chenrui Cui Tianrui Wang Mengnan He Hao Shi Meng Ge Caixia Gong Longbiao Wang J. Dang 31 0 0 05 Jan 2025
Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance M. Milling Shuo Liu Andreas Triantafyllopoulos Ilhan Aslan Björn W. Schuller 21 2 0 12 Aug 2024
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors J. Hauret Malo Olivier Thomas Joubaud C. Langrenne Sarah Poirée V. Zimpfer Éric Bavu 75 1 0 16 Jul 2024
Thunder: Unified Regression-Diffusion Speech Enhancement with a Single Reverse Step using Brownian Bridge Thanapat Trachu Chawan Piansaddhayanon E. Chuangsuwanich 29 2 0 10 Jun 2024
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS Xiaofei Wang Sefik Emre Eskimez Manthan Thakker Hemin Yang Zirun Zhu ... Yufei Xia Jinzhu Li Sheng Zhao Jinyu Li Naoyuki Kanda 40 3 0 09 Jun 2024
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement Wangyou Zhang Kohei Saijo Jee-weon Jung Chenda Li Shinji Watanabe Yanmin Qian 30 4 0 06 Jun 2024
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance Tsubasa Ochiai Kazuma Iwamoto Marc Delcroix Rintaro Ikeshita Hiroshi Sato Shoko Araki Shigeru Katagiri 22 2 0 23 Apr 2024
Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR Yufeng Yang Ashutosh Pandey DeLiang Wang 44 4 0 11 Mar 2024
Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models Rita Frieske Bertram E. Shi 21 6 0 03 Jan 2024
FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for Distortion-Invariant Robust Speech Recognition Dongning Yang Wei Wang Yanmin Qian 11 3 0 29 Nov 2023
Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection Cunhang Fan Mingming Ding Jianhua Tao Ruibo Fu Jiangyan Yi Zhengqi Wen Zhao Lv 37 4 0 13 Oct 2023
Toward Universal Speech Enhancement for Diverse Input Conditions Wangyou Zhang Kohei Saijo Zhong-Qiu Wang Shinji Watanabe Yanmin Qian VLM 22 18 0 29 Sep 2023
Continuous Modeling of the Denoising Process for Speech Enhancement Based on Deep Learning Zilu Guo Jun Du Chin-Hui Lee 17 0 0 17 Sep 2023
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction Shilong Wu Chenxi Wang Hang Chen Yusheng Dai Chenyue Zhang ... Sabato Marco Siniscalchi O. Scharenborg Zhong-Qiu Wang Jia Pan Jianqing Gao 20 9 0 15 Sep 2023
NAaLoss: Rethinking the objective of speech enhancement Kuan-Hsun Ho En Yu J. Hung Berlin Chen 16 2 0 24 Aug 2023
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition Hakan Erdogan Scott Wisdom Xuankai Chang Zalan Borsos Marco Tagliasacchi Neil Zeghidour J. Hershey 11 9 0 21 Aug 2023
Statistical Beamformer Exploiting Non-stationarity and Sparsity with Spatially Constrained ICA for Robust Speech Recognition U.H Shin Hyung-Min Park 11 2 0 13 Jun 2023
Speech and Noise Dual-Stream Spectrogram Refine Network with Speech Distortion Loss for Robust Speech Recognition Haoyu Lu Nan Li Tongtong Song Longbiao Wang J. Dang Xiaobao Wang Shiliang Zhang NoLa 15 3 0 29 May 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss Hiroshi Sato Ryo Masumura Tsubasa Ochiai Marc Delcroix Takafumi Moriya ... Kentaro Shinayama Saki Mizuno Mana Ihori Tomohiro Tanaka Nobukatsu Hojo 29 5 0 24 May 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR Yuchen Hu Cheng Chen Qiu-shi Zhu E. Chng 18 15 0 11 Apr 2023
Time-domain Speech Enhancement Assisted by Multi-resolution Frequency Encoder and Decoder Hao Shi Masato Mimura Longbiao Wang J. Dang Tatsuya Kawahara 28 13 0 26 Mar 2023
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings Christoph Boeddeker Aswin Shanmugam Subramanian G. Wichern Reinhold Haeb-Umbach Jonathan Le Roux 29 23 0 07 Mar 2023
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks Darius Petermann G. Wichern Aswin Shanmugam Subramanian Zhong-Qiu Wang Jonathan Le Roux 19 10 0 14 Dec 2022
A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training Yang Xiang Jesper Lisby Højvang M. Rasmussen M. G. Christensen DRL 21 5 0 16 Nov 2022
Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts Xiaofei Wang Zhuo Chen Yu Shi Jian Wu Naoyuki Kanda Takuya Yoshioka MoE 19 1 0 11 Nov 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR Qiu-shi Zhu Jie M. Zhang Zitian Zhang Lirong Dai 35 15 0 26 May 2022
Mask scalar prediction for improving robust automatic speech recognition A. Narayanan James Walker S. Panchapagesan N. Howard Yuma Koizumi 11 4 0 26 Apr 2022
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation Xuankai Chang Takashi Maekaku Yuya Fujita Shinji Watanabe VLM 41 45 0 01 Apr 2022
Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge Yen-Ju Lu Samuele Cornell Xuankai Chang Wangyou Zhang Chenda Li Zhaoheng Ni Zhong-Qiu Wang Shinji Watanabe 11 28 0 24 Feb 2022
Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition Hiroshi Sato Tsubasa Ochiai Marc Delcroix K. Kinoshita Naoyuki Kamo Takafumi Moriya 25 26 0 11 Jan 2022
SNRi Target Training for Joint Speech Enhancement and Recognition Yuma Koizumi Shigeki Karita A. Narayanan S. Panchapagesan M. Bacchiani 25 14 0 01 Nov 2021