Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages


2 March 2023
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
Zhehuai Chen
Nanxin Chen
Bo-wen Li
Vera Axelrod
Gary Wang
Zhong Meng
Ke Hu
Andrew Rosenberg
Rohit Prabhavalkar
Daniel S. Park
Parisa Haghani
Jason Riesa
Ginger Perng
H. Soltau
Trevor Strohman
Bhuvana Ramabhadran
Tara N. Sainath
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
    VLM

Papers citing "Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages"

9 / 9 papers shown
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements
Sandipan Dhar
N. D. Jana
Swagatam Das
28
133
0
27 Apr 2025
Kimi-Audio Technical Report
KimiTeam
Ding Ding
Zeqian Ju
Yichong Leng
S. Liu
...
Z. Yang
Aoxiong Yin
Ruibin Yuan
Y. Zhang
Zaida Zhou
AuLLM
VLM
87
0
0
25 Apr 2025
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
Zi-Hua Zhang
Long Zhou
Junyi Ao
Shujie Liu
Lirong Dai
Jinyu Li
Furu Wei
40
47
0
07 Oct 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
50
183
0
25 May 2022
Input Length Matters: Improving RNN-T and MWER Training for Long-form Telephony Speech Recognition
Zhiyun Lu
Yanwei Pan
Thibault Doutre
Parisa Haghani
Liangliang Cao
Rohit Prabhavalkar
C. Zhang
Trevor Strohman
AuLLM
44
13
0
08 Oct 2021
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech
Katrin Tomanek
Vicky Zayats
Dirk Padfield
K. Vaillancourt
Fadi Biadsy
34
43
0
14 Sep 2021
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
37
157
0
21 Oct 2020
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
James Qin
Daniel S. Park
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Quoc V. Le
Yonghui Wu
VLM
SSL
107
290
0
20 Oct 2020
Transformer ASR with Contextual Block Processing
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
34
63
0
16 Oct 2019