ResearchTrend.AI
FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization

21 October 2020
Jiahui Yu
Chung-Cheng Chiu
Bo-wen Li
Shuo-yiin Chang
Tara N. Sainath
Yanzhang He
A. Narayanan
Wei Han
Anmol Gulati
Yonghui Wu
Ruoming Pang

Papers citing "FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization"

26 / 26 papers shown
Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR
Junwen Bai
Bo-wen Li
Qiujia Li
Tara N. Sainath
Trevor Strohman
30
3
0
17 Jan 2024
Two-pass Endpoint Detection for Speech Recognition
A. Raju
Aparna Khare
Di He
Ilya Sklyar
Long Chen
...
Zhe Zhang
Colin Vaz
Venkatesh Ravichandran
Roland Maas
Ariya Rastrow
33
0
0
17 Jan 2024
Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition
Vahid Noroozi
Somshubra Majumdar
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
23
10
0
27 Dec 2023
Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition
Hillary Ngai
Rohan Agrawal
Neeraj Gaur
Ronny Huang
Parisa Haghani
P. M. Mengibar
MoMe
36
0
0
17 Oct 2023
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction
Jinchuan Tian
Jianwei Yu
Hangting Chen
Brian Yan
Chao Weng
Dong Yu
Shinji Watanabe
35
1
0
19 Aug 2023
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
Tian-Hao Zhang
Dinghao Zhou
Guiping Zhong
Jiaming Zhou
Baoxiang Li
20
3
0
26 Jul 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
27
4
0
24 Jul 2023
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework
Eliya Segev
Maya Alroy
Ronen Katsir
Noam Wies
Ayana Shenhav
...
D. Zar
Oren Tadmor
Jacob Bitterman
Amnon Shashua
Tal Rosenwein
30
2
0
04 Jul 2023
Streaming Speech-to-Confusion Network Speech Recognition
Denis Filimonov
Prabhat Pandey
Ariya Rastrow
Ankur Gandhe
A. Stolcke
HAI
24
0
0
02 Jun 2023
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
Xingcheng Song
Di Wu
Binbin Zhang
Zhendong Peng
Bo Dang
Fuping Pan
Zhiyong Wu
35
20
0
18 May 2023
Accelerator-Aware Training for Transducer-Based Speech Recognition
Suhaila M. Shakiah
R. Swaminathan
Hieu Duy Nguyen
Raviteja Chinta
Tariq Afzal
Nathan Susanj
Athanasios Mouchtaris
Grant P. Strimel
Ariya Rastrow
19
1
0
12 May 2023
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
27
17
0
13 Apr 2023
Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition
Yusuke Shinohara
Shinji Watanabe
AI4TS
21
9
0
04 Nov 2022
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
P. Swietojanski
Stefan Braun
Dogan Can
Thiago Fraga da Silva
Arnab Ghoshal
...
Henry Mason
Erik McDermott
Honza Silovsky
R. Travadi
Xiaodan Zhuang
32
13
0
02 Nov 2022
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems
Shaan Bijwadia
Shuo-yiin Chang
Bo-wen Li
Tara N. Sainath
Chaoyang Zhang
Yanzhang He
33
7
0
01 Nov 2022
TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty
Xingcheng Song
Di Wu
Zhiyong Wu
Binbin Zhang
Yuekai Zhang
Zhendong Peng
Wenpeng Li
Fuping Pan
Changbao Zhu
26
8
0
01 Nov 2022
Delay-penalized transducer for low-latency streaming ASR
Wei Kang
Zengwei Yao
Fangjun Kuang
Liyong Guo
Xiaoyu Yang
Long Lin
Piotr Żelasko
Daniel Povey
13
6
0
31 Oct 2022
JOIST: A Joint Speech and Text Streaming Model For ASR
Tara N. Sainath
Rohit Prabhavalkar
Ankur Bapna
Yu Zhang
Zhouyuan Huo
Zhehuai Chen
Bo-wen Li
Weiran Wang
Trevor Strohman
RALM
AuLLM
48
35
0
13 Oct 2022
Turn-Taking Prediction for Natural Conversational Speech
Shuo-yiin Chang
Bo-wen Li
Tara N. Sainath
Chaoyang Zhang
Trevor Strohman
Qiao Liang
Yanzhang He
35
17
0
29 Aug 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
27
15
0
29 Mar 2022
VADOI: Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition
Jinhan Wang
Xiaosu Tong
Jinxi Guo
Di He
Roland Maas
11
5
0
22 Feb 2022
Are E2E ASR models ready for an industrial usage?
Valentin Vielzeuf
G. Antipov
18
8
0
09 Dec 2021
Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution
Yangyang Shi
Chunyang Wu
Dilin Wang
Alex Xiao
Jay Mahadeokar
...
Ke Li
Yuan Shangguan
Varun K. Nagaraja
Ozlem Kalinli
M. Seltzer
33
15
0
07 Oct 2021
Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers
Chiori Hori
Takaaki Hori
Jonathan Le Roux
22
4
0
04 Aug 2021
A Better and Faster End-to-End Model for Streaming ASR
Bo-wen Li
Anmol Gulati
Jiahui Yu
Tara N. Sainath
Chung-Cheng Chiu
...
Wei Han
Qiao Liang
Yu Zhang
Trevor Strohman
Yonghui Wu
AuLLM
17
123
0
21 Nov 2020
Cascaded encoders for unifying streaming and non-streaming ASR
A. Narayanan
Tara N. Sainath
Ruoming Pang
Jiahui Yu
Chung-Cheng Chiu
Rohit Prabhavalkar
Ehsan Variani
Trevor Strohman
AuLLM
6
85
0
27 Oct 2020