ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2003.12710
  4. Cited By
A Streaming On-Device End-to-End Model Surpassing Server-Side
  Conventional Model Quality and Latency

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

28 March 2020
Tara N. Sainath
Yanzhang He
Bo-wen Li
A. Narayanan
Ruoming Pang
A. Bruguier
Shuo-yiin Chang
Wei Li
R. Álvarez
Z. Chen
Chung-Cheng Chiu
David Garcia
A. Gruenstein
Ke Hu
Minho Jin
Anjuli Kannan
Qiao Liang
Ian McGraw
Cal Peyser
Rohit Prabhavalkar
Golan Pundak
David Rybach
Yuan Shangguan
Y. Sheth
Trevor Strohman
Mirkó Visontai
Yonghui Wu
Yu Zhang
Ding Zhao
ArXivPDFHTML

Papers citing "A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency"

43 / 43 papers shown
Title
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
91
2
0
09 Jul 2024
Text Injection for Neural Contextual Biasing
Text Injection for Neural Contextual Biasing
Zhong Meng
Zelin Wu
Rohit Prabhavalkar
Cal Peyser
Weiran Wang
Nanxin Chen
Tara N. Sainath
Bhuvana Ramabhadran
25
3
0
05 Jun 2024
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech
  Recognition
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
Tian-Hao Zhang
Dinghao Zhou
Guiping Zhong
Jiaming Zhou
Baoxiang Li
20
3
0
26 Jul 2023
Reducing the gap between streaming and non-streaming Transducer-based
  ASR by adaptive two-stage knowledge distillation
Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation
Haitao Tang
Yu Fu
Lei Sun
Jiabin Xue
Dan Liu
...
Zhiqiang Ma
Minghui Wu
Jia Pan
Genshun Wan
Ming’En Zhao
21
2
0
27 Jun 2023
Text-only Domain Adaptation using Unified Speech-Text Representation in
  Transducer
Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer
Lu Huang
Yangqiu Song
Jun Zhang
Lu Lu
Zejun Ma
29
2
0
07 Jun 2023
External Language Model Integration for Factorized Neural Transducers
External Language Model Integration for Factorized Neural Transducers
Michael Levit
S. Parthasarathy
Cem Aksoylar
Mohammad Sadegh Rasooli
Shuangyu Chang
29
2
0
26 May 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for
  Mandarin Speech Recognition
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
13
0
0
23 Mar 2023
Minimum Latency Training of Sequence Transducers for Streaming
  End-to-End Speech Recognition
Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition
Yusuke Shinohara
Shinji Watanabe
AI4TS
21
9
0
04 Nov 2022
Internal Language Model Estimation based Adaptive Language Model Fusion
  for Domain Adaptation
Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation
Rao Ma
Xiaobo Wu
Jin Qiu
Yanan Qin
Haihua Xu
Peihao Wu
Zejun Ma
29
2
0
02 Nov 2022
Fast and parallel decoding for transducer
Fast and parallel decoding for transducer
Wei Kang
Liyong Guo
Fangjun Kuang
Long Lin
Mingshuang Luo
Zengwei Yao
Xiaoyu Yang
Piotr Żelasko
Daniel Povey
AI4TS
19
15
0
31 Oct 2022
Monotonic segmental attention for automatic speech recognition
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
16
8
0
26 Oct 2022
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale
  ASR
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR
DongSeon Hwang
K. Sim
Yu Zhang
Trevor Strohman
14
10
0
11 Oct 2022
A Universally-Deployable ASR Frontend for Joint Acoustic Echo
  Cancellation, Speech Enhancement, and Voice Separation
A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation
Tom O'Malley
A. Narayanan
Quan Wang
19
5
0
14 Sep 2022
UserLibri: A Dataset for ASR Personalization Using Only Text
UserLibri: A Dataset for ASR Personalization Using Only Text
Theresa Breiner
Swaroop Indra Ramaswamy
Ehsan Variani
Shefali Garg
Rajiv Mathews
K. Sim
Kilol Gupta
Mingqing Chen
Lara McConnaughey
30
16
0
02 Jul 2022
On Comparison of Encoders for Attention based End to End Speech
  Recognition in Standalone and Rescoring Mode
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
30
2
0
26 Jun 2022
A Conformer-based Waveform-domain Neural Acoustic Echo Canceller
  Optimized for ASR Accuracy
A Conformer-based Waveform-domain Neural Acoustic Echo Canceller Optimized for ASR Accuracy
S. Panchapagesan
A. Narayanan
T. Shabestary
Shuai Shao
N. Howard
Alex Park
James Walker
A. Gruenstein
16
3
0
06 May 2022
Mask scalar prediction for improving robust automatic speech recognition
Mask scalar prediction for improving robust automatic speech recognition
A. Narayanan
James Walker
S. Panchapagesan
N. Howard
Yuma Koizumi
11
4
0
26 Apr 2022
Large-Scale Streaming End-to-End Speech Translation with Neural
  Transducers
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
24
26
0
11 Apr 2022
Personal VAD 2.0: Optimizing Personal Voice Activity Detection for
  On-Device Speech Recognition
Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition
Shaojin Ding
R. Rikhye
Qiao Liang
Yanzhang He
Quan Wang
A. Narayanan
Tom O'Malley
Ian McGraw
21
26
0
08 Apr 2022
4-bit Conformer with Native Quantization Aware Training for Speech
  Recognition
4-bit Conformer with Native Quantization Aware Training for Speech Recognition
Shaojin Ding
Phoenix Meadowlark
Yanzhang He
Lukasz Lew
Shivani Agrawal
Oleg Rybakov
MQ
31
32
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
23
93
0
29 Mar 2022
Efficient Softmax Approximation for Deep Neural Networks with Attention
  Mechanism
Efficient Softmax Approximation for Deep Neural Networks with Attention Mechanism
Ihor Vasyltsov
Wooseok Chang
25
12
0
21 Nov 2021
Cross-attention conformer for context modeling in speech enhancement for
  ASR
Cross-attention conformer for context modeling in speech enhancement for ASR
A. Narayanan
Chung-Cheng Chiu
Tom O'Malley
Quan Wang
Yanzhang He
24
14
0
30 Oct 2021
Data-Driven Offline Optimization For Architecting Hardware Accelerators
Data-Driven Offline Optimization For Architecting Hardware Accelerators
Aviral Kumar
Amir Yazdanbakhsh
Milad Hashemi
Kevin Swersky
Sergey Levine
27
36
0
20 Oct 2021
Personalized Automatic Speech Recognition Trained on Small Disordered
  Speech Datasets
Personalized Automatic Speech Recognition Trained on Small Disordered Speech Datasets
Jimmy Tobin
Katrin Tomanek
11
27
0
09 Oct 2021
Enabling On-Device Training of Speech Recognition Models with Federated
  Dropout
Enabling On-Device Training of Speech Recognition Models with Federated Dropout
Dhruv Guliani
Lillian Zhou
Changwan Ryu
Tien-Ju Yang
Harry Zhang
Yong Xiao
F. Beaufays
Giovanni Motta
FedML
30
16
0
07 Oct 2021
Factorized Neural Transducer for Efficient Language Model Adaptation
Factorized Neural Transducer for Efficient Language Model Adaptation
Xie Chen
Zhong Meng
S. Parthasarathy
Jinyu Li
21
39
0
27 Sep 2021
Tied & Reduced RNN-T Decoder
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
32
55
0
15 Sep 2021
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and
  Accented Speech
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech
Katrin Tomanek
Vicky Zayats
Dirk Padfield
K. Vaillancourt
Fadi Biadsy
59
57
0
14 Sep 2021
Learning a Neural Diff for Speech Models
Learning a Neural Diff for Speech Models
J. Macoskey
Grant P. Strimel
Ariya Rastrow
13
2
0
03 Aug 2021
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence
  with Deletion Prediction
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction
David Qiu
Yanzhang He
Qiujia Li
Yu Zhang
Liangliang Cao
Ian McGraw
KELM
22
12
0
26 Apr 2021
HMM-Free Encoder Pre-Training for Streaming RNN Transducer
HMM-Free Encoder Pre-Training for Streaming RNN Transducer
Lu Huang
J. Sun
Yu Tang
Junfeng Hou
Jinkun Chen
Jun Zhang
Zejun Ma
25
3
0
02 Apr 2021
Residual Energy-Based Models for End-to-End Speech Recognition
Residual Energy-Based Models for End-to-End Speech Recognition
Qiujia Li
Yu Zhang
Bo-wen Li
Liangliang Cao
P. Woodland
23
13
0
25 Mar 2021
Learning Word-Level Confidence For Subword End-to-End ASR
Learning Word-Level Confidence For Subword End-to-End ASR
David Qiu
Qiujia Li
Yanzhang He
Yu Zhang
Bo-wen Li
...
Deepti Bhatia
Wei Li
Ke Hu
Tara N. Sainath
Ian McGraw
24
32
0
11 Mar 2021
Echo State Speech Recognition
Echo State Speech Recognition
H. Shrivastava
Ankush Garg
Yuan Cao
Yu Zhang
Tara N. Sainath
42
22
0
18 Feb 2021
Less Is More: Improved RNN-T Decoding Using Limited Label Context and
  Path Merging
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
41
35
0
12 Dec 2020
A Better and Faster End-to-End Model for Streaming ASR
A Better and Faster End-to-End Model for Streaming ASR
Bo-wen Li
Anmol Gulati
Jiahui Yu
Tara N. Sainath
Chung-Cheng Chiu
...
Wei Han
Qiao Liang
Yu Zhang
Trevor Strohman
Yonghui Wu
AuLLM
17
123
0
21 Nov 2020
Improving RNN Transducer Based ASR with Auxiliary Tasks
Improving RNN Transducer Based ASR with Auxiliary Tasks
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig
26
49
0
05 Nov 2020
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech
  Recognition
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
S. Parthasarathy
Eric Sun
Yashesh Gaur
Naoyuki Kanda
Liang Lu
Xie Chen
Rui Zhao
Jinyu Li
Jiawei Liu
AuLLM
19
107
0
03 Nov 2020
Cascaded encoders for unifying streaming and non-streaming ASR
Cascaded encoders for unifying streaming and non-streaming ASR
A. Narayanan
Tara N. Sainath
Ruoming Pang
Jiahui Yu
Chung-Cheng Chiu
Rohit Prabhavalkar
Ehsan Variani
Trevor Strohman
AuLLM
6
85
0
27 Oct 2020
Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable
  End-to-End Speech Recognition
Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition
Wenyong Huang
Wenchao Hu
Y. Yeung
Xiao Chen
9
50
0
13 Aug 2020
Transformer with Bidirectional Decoder for Speech Recognition
Transformer with Bidirectional Decoder for Speech Recognition
Xi Chen
Songyang Zhang
Dandan Song
P. Ouyang
Shouyi Yin
18
13
0
11 Aug 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
48
3,029
0
16 May 2020
1