Rethinking Evaluation in ASR: Are Our Models Robust Enough?

22 October 2020
Tatiana Likhomanenko
Qiantong Xu
Vineel Pratap
Paden Tomasello
Jacob Kahn
Gilad Avidov
R. Collobert
Gabriel Synnaeve

Papers citing "Rethinking Evaluation in ASR: Are Our Models Robust Enough?"

50 / 63 papers shown
Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning
Mahmoud Salhab
Marwan Elghitany
Shameed Sait
Syed Sibghat Ullah
Mohammad Abusheikh
Hasan Abusheikh
44
0
0
16 Apr 2025
ValSub: Subsampling Validation Data to Mitigate Forgetting during ASR Personalization
Haaris Mehmood
Karthikeyan P. Saravanan
Pablo Peso Parada
David Tuckey
Mete Ozay
Gil Ho Lee
Jungin Lee
Seokyeong Jung
52
0
0
12 Mar 2025
Weighted Cross-entropy for Low-Resource Languages in Multilingual Speech Recognition
Andrés Piñeiro-Martín
C. García-Mateo
Laura Docío-Fernández
María del Carmen López-Pérez
Georg Rehm
32
3
0
25 Sep 2024
Revisiting Acoustic Features for Robust ASR
Muhammad Ahmed Shah
Bhiksha Raj
AAML
16
0
0
24 Sep 2024
Self-Train Before You Transcribe
Robert Flynn
Anton Ragni
29
0
0
17 Jun 2024
Test-Time Training for Depression Detection
Sri Harsha Dumpala
Chandramouli Shama Sastry
Rudolf Uher
Sageev Oore
47
0
0
07 Apr 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
Muhammad A. Shah
David Solans Noguero
Mikko A. Heikkilä
Nicolas Kourtellis
27
5
0
08 Mar 2024
Masked Audio Generation using a Single Non-Autoregressive Transformer
Alon Ziv
Itai Gat
Gaël Le Lan
Tal Remez
Felix Kreuk
Alexandre Défossez
Jade Copet
Gabriel Synnaeve
Yossi Adi
54
36
0
09 Jan 2024
D4AM: A General Denoising Framework for Downstream Acoustic Models
H. Wang
Yu Tsao
Hsin-Min Wang
Chu-Song Chen
13
4
0
28 Nov 2023
How Much Context Does My Attention-Based ASR System Need?
Robert Flynn
Anton Ragni
32
1
0
24 Oct 2023
Multi-stage Large Language Model Correction for Speech Recognition
Jie Pu
Thai-Son Nguyen
Sebastian Stüker
LRM
27
6
0
17 Oct 2023
Federated Learning with Differential Privacy for End-to-End Speech Recognition
Martin Pelikan
Sheikh Shams Azam
Vitaly Feldman
Jan Honza Silovsky
Kunal Talwar
Tatiana Likhomanenko
37
8
0
29 Sep 2023
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition
Andrew Rouditchenko
R. Collobert
Tatiana Likhomanenko
VLM
27
3
0
29 Sep 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
E. Chng
29
42
0
27 Sep 2023
Test-Time Training for Speech
Sri Harsha Dumpala
Chandramouli Shama Sastry
Sageev Oore
39
1
0
19 Sep 2023
Some voices are too common: Building fair speech recognition systems using the Common Voice dataset
Lucas Maison
Yannick Esteve
26
3
0
01 Jun 2023
Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili
C. Jacobs
Nathanaël Carraz Rakotonirina
E. Chimoto
Bruce A. Bassett
Herman Kamper
17
4
0
01 Jun 2023
Scaling Speech Technology to 1,000+ Languages
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
77
300
0
22 May 2023
Improving Accented Speech Recognition with Multi-Domain Training
Lucas Maison
Yannick Esteve
18
7
0
14 Mar 2023
Stabilizing Transformer Training by Preventing Attention Entropy Collapse
Shuangfei Zhai
Tatiana Likhomanenko
Etai Littwin
Dan Busbridge
Jason Ramapuram
Yizhe Zhang
Jiatao Gu
J. Susskind
AAML
43
64
0
11 Mar 2023
A Comparison of Speech Data Augmentation Methods Using S3PRL Toolkit
Mina Huh
Ruchira Ray
Corey Karnei
21
3
0
27 Feb 2023
Speech Corpora Divergence Based Unsupervised Data Selection for ASR
Changfeng Gao
Gaofeng Cheng
Pengyuan Zhang
Yonghong Yan
11
0
0
26 Feb 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
25
7
0
31 Dec 2022
Pushing the performances of ASR models on English and Spanish accents
Pooja Chitkara
M. Rivière
Jade Copet
Frank Zhang
Yatharth Saraf
18
0
0
22 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
49
3,290
0
06 Dec 2022
Handling and extracting key entities from customer conversations using Speech recognition and Named Entity recognition
Sharvi Endait
Ruturaj Ghatage
DD Kadam
10
2
0
28 Nov 2022
Continuous Soft Pseudo-Labeling in ASR
Tatiana Likhomanenko
R. Collobert
Navdeep Jaitly
Samy Bengio
VLM
22
3
0
11 Nov 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
30
27
0
24 Oct 2022
Investigating self-supervised, weakly supervised and fully supervised training approaches for multi-domain automatic speech recognition: a study on Bangladeshi Bangla
Ahnaf Mozib Samin
M. Kobir
Md. Mushtaq Shahriyar Rafee
M. F. Ahmed
Mehedi Hasan
Partha Ghosh
Shafkat Kibria
M. S. Rahman
SSL
18
0
0
24 Oct 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
21
3
0
29 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
24
41
0
14 Jul 2022
STOP: A dataset for Spoken Task Oriented Semantic Parsing
Paden Tomasello
Akshat Shrivastava
Daniel Lazar
Po-Chun Hsu
Duc Le
...
Robin Algayres
Tu Nguyen
Emmanuel Dupoux
Luke Zettlemoyer
Abdel-rahman Mohamed
17
35
0
29 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
Sehoon Kim
A. Gholami
Albert Eaton Shaw
Nicholas Lee
K. Mangalam
Jitendra Malik
Michael W. Mahoney
Kurt Keutzer
26
99
0
02 Jun 2022
ASR in German: A Detailed Error Analysis
John M. Wirth
René Peinl
18
5
0
12 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
25
56
0
06 Apr 2022
Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?
Priyanshi Shah
Harveen Singh Chadha
Anirudh Gupta
Ankur Dhuriya
Neeraj Chhimwal
Rishabh Gaur
Vivek Raghavan
15
1
0
30 Mar 2022
Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
TTA
VLM
13
9
0
27 Mar 2022
Ask2Mask: Guided Data Selection for Masked Speech Modeling
M. Baskar
Andrew Rosenberg
Bhuvana Ramabhadran
Yu Zhang
Pedro J. Moreno
20
7
0
24 Feb 2022
Korean Tokenization for Beam Search Rescoring in Speech Recognition
Kyuhong Shim
Hyewon Bae
Wonyong Sung
19
0
0
22 Feb 2022
Flashlight: Enabling Innovation in Tools for Machine Learning
Jacob Kahn
Vineel Pratap
Tatiana Likhomanenko
Qiantong Xu
Awni Y. Hannun
...
Gilad Avidov
Benoit Steiner
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
19
28
0
29 Jan 2022
Star Temporal Classification: Sequence Classification with Partially Labeled Data
Vineel Pratap
Awni Y. Hannun
Gabriel Synnaeve
R. Collobert
15
8
0
28 Jan 2022
Are E2E ASR models ready for an industrial usage?
Valentin Vielzeuf
G. Antipov
20
8
0
09 Dec 2021
Human-Machine Interaction Speech Corpus from the ROBIN project
V. Pais
Radu Ion
Andrei-Marius Avram
Elena Irimia
V. Mititelu
Maria Mitrofan
7
6
0
22 Nov 2021
Scaling ASR Improves Zero and Few Shot Learning
Alex Xiao
Weiyi Zheng
Gil Keren
Duc Le
Frank Zhang
Christian Fuegen
Ozlem Kalinli
Yatharth Saraf
Abdel-rahman Mohamed
11
21
0
10 Nov 2021
Pseudo-Labeling for Massively Multilingual Speech Recognition
Loren Lugosch
Tatiana Likhomanenko
Gabriel Synnaeve
R. Collobert
VLM
13
29
0
30 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
95
1,704
0
26 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
51
94
0
20 Oct 2021
ASR4REAL: An extended benchmark for speech models
M. Rivière
Jade Copet
Gabriel Synnaeve
AuLLM
39
15
0
16 Oct 2021