ResearchTrend.AI
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network

5 April 2021
William Chan
Daniel S. Park
Chris A. Lee
Yu Zhang
Quoc V. Le
Mohammad Norouzi
    AI4TS

Papers citing "SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network"

22 / 22 papers shown
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
32
2
0
28 Mar 2024
RADIA -- Radio Advertisement Detection with Intelligent Analytics
Jorge Álvarez
J. C. Armenteros
Camilo Torrón
Miguel Ortega-Martín
Alfonso Ardoiz
...
Íñigo Galdeano
Ignacio Garrido
Adrián Alonso
Fernando Bayón
Oleg Vorontsov
26
0
0
06 Mar 2024
Adaptation of Whisper models to child speech recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Peter Corcoran
H. Cucu
11
30
0
24 Jul 2023
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes
Xilin Jiang
Yinghao Aaron Li
N. Mesgarani
CLL
19
1
0
29 May 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
23
2
0
22 Mar 2023
Robust Knowledge Distillation from RNN-T Models With Noisy Training Labels Using Full-Sum Loss
Mohammad Zeineldeen
Kartik Audhkhasi
M. Baskar
Bhuvana Ramabhadran
18
2
0
10 Mar 2023
Efficient Domain Adaptation for Speech Foundation Models
Bo-wen Li
DongSeon Hwang
Zhouyuan Huo
Junwen Bai
Guru Prakash
...
K. Sim
Yu Zhang
Wei Han
Trevor Strohman
F. Beaufays
AI4CE
28
23
0
03 Feb 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
18
7
0
31 Dec 2022
Progressive Multi-Scale Self-Supervised Learning for Speech Recognition
Genshun Wan
Tan Liu
Hang Chen
Jia-Yu Pan
Cong Liu
Z. Ye
SSL
10
0
0
07 Dec 2022
Time-Domain Speech Enhancement for Robust Automatic Speech Recognition
Yufeng Yang
Ashutosh Pandey
DeLiang Wang
16
8
0
24 Oct 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
14
41
0
14 Jul 2022
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition
Zhao You
Shulin Feng
Dan Su
Dong Yu
6
8
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
14
56
0
06 Apr 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
73
1,694
0
26 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
51
94
0
20 Oct 2021
Continual learning using lattice-free MMI for speech recognition
Hossein Hadian
Arsenii Gorin
CLL
13
1
0
13 Oct 2021
Spell my name: keyword boosted speech recognition
Namkyu Jung
Geon-min Kim
Joon Son Chung
38
13
0
06 Oct 2021
Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development
Mingkuan Liu
Chi Zhang
Hua Xing
C. Feng
Mon-Chu Chen
Judith Bishop
Grace Ngapo
13
3
0
01 Sep 2021
A deep convolutional neural network that is invariant to time rescaling
Brandon G. Jacques
Zoran Tiganj
Aakash Sarkar
Marc W Howard
P. Sederberg
AI4TS
16
7
0
09 Jul 2021
Transformer Language Models with LSTM-based Cross-utterance Information Representation
G. Sun
C. Zhang
P. Woodland
76
32
0
12 Feb 2021
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
James Qin
Daniel S. Park
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Quoc V. Le
Yonghui Wu
VLM
SSL
139
308
0
20 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,453
0
23 Jan 2020