Two-Pass End-to-End Speech Recognition


29 August 2019
Tara N. Sainath
Ruoming Pang
David Rybach
Yanzhang He
Rohit Prabhavalkar
Wei Li
Mirkó Visontai
Qiao Liang
Trevor Strohman
Yonghui Wu
Ian McGraw
Chung-Cheng Chiu

Papers citing "Two-Pass End-to-End Speech Recognition"

50 / 98 papers shown
Title
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
Kai-Tuo Xu
Feng-Long Xie
Xu Tang
Yao Hu
69
4
0
24 Jan 2025
Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
AuLLM
40
0
0
08 Jan 2025
Speech Recognition Rescoring with Large Speech-Text Foundation Models
Prashanth Gurunath Shivakumar
J. Kolehmainen
Aditya Gourav
Yi Gu
Ankur Gandhe
Ariya Rastrow
I. Bulyko
AuLLM
26
0
0
25 Sep 2024
FireRedTTS: A Foundation Text-To-Speech Framework for Industry-Level Generative Speech Applications
Hao-Han Guo
Kun Liu
Fei-Yu Shen
Yi-Chen Wu
Xu Tang
Kun Xie
Kai-Tuo Xu
42
20
0
05 Sep 2024
Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction
Yuka Ko
Sheng Li
Chao-Han Huck Yang
Tatsuya Kawahara
AuLLM
31
3
0
29 Aug 2024
Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models
Bolaji Yusuf
M. Baskar
Andrew Rosenberg
Bhuvana Ramabhadran
37
1
0
05 Jul 2024
Decoder-only Architecture for Streaming End-to-end Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
RALM
AuLLM
36
6
0
23 Jun 2024
Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
Tianzi Wang
Xurong Xie
Zhaoqing Li
Shoukang Hu
Zengrui Jin
...
Shujie Hu
Mengzhe Geng
Guinan Li
Helen Meng
Xunying Liu
34
0
0
14 Jun 2024
Transformer-based Model for ASR N-Best Rescoring and Rewriting
Iwen E. Kang
Christophe Van Gysel
Man-Hung Siu
34
2
0
12 Jun 2024
Joint Beam Search Integrating CTC, Attention, and Transducer Decoders
Yui Sudo
Muhammad Shakeel
Yosuke Fukumoto
Brian Yan
Jiatong Shi
Yifan Peng
Shinji Watanabe
19
0
0
05 Jun 2024
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition
Chan-Jan Hsu
Yi-Chang Chen
Feng-Ting Liao
Pei-Chen Ho
Yu-Hsiang Wang
Po-Chun Hsu
Da-shan Shiu
29
2
0
23 May 2024
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
Chen Chen
Ruizhe Li
Yuchen Hu
Sabato Marco Siniscalchi
Pin-Yu Chen
Ensiong Chng
Chao-Han Huck Yang
28
19
0
08 Feb 2024
Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search
Yui Sudo
Muhammad Shakeel
Yosuke Fukumoto
Yifan Peng
Shinji Watanabe
26
5
0
19 Jan 2024
Two-pass Endpoint Detection for Speech Recognition
A. Raju
Aparna Khare
Di He
Ilya Sklyar
Long Chen
...
Zhe Zhang
Colin Vaz
Venkatesh Ravichandran
Roland Maas
Ariya Rastrow
33
0
0
17 Jan 2024
On the compression of shallow non-causal ASR models using knowledge distillation and tied-and-reduced decoder for low-latency on-device speech recognition
Nagaraj Adiga
Jinhwan Park
Chintigari Shiva Kumar
Shatrughan Singh
Kyungmin Lee
Chanwoo Kim
Dhananjaya N. Gowda
18
1
0
15 Dec 2023
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition
Yu Yu
Chao-Han Huck Yang
J. Kolehmainen
Prashanth Gurunath Shivakumar
Yile Gu
...
Denis Filimonov
Shalini Ghosh
A. Stolcke
Ariya Rastrow
I. Bulyko
31
8
0
26 Sep 2023
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction
Jinchuan Tian
Jianwei Yu
Hangting Chen
Brian Yan
Chao Weng
Dong Yu
Shinji Watanabe
33
1
0
19 Aug 2023
Improving CTC-AED model with integrated-CTC and auxiliary loss regularization
Daobin Zhu
Xiangdong Su
Hongbin Zhang
13
1
0
15 Aug 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
24
4
0
24 Jul 2023
Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding
Suyoun Kim
Akshat Shrivastava
Duc Le
Ju Lin
Ozlem Kalinli
M. Seltzer
AuLLM
25
2
0
22 Jul 2023
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
Zeping Min
Jinbo Wang
AuLLM
27
13
0
13 Jul 2023
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR
Goeric Huybrechts
S. Ronanki
Xilai Li
H. Nosrati
S. Bodapati
Katrin Kirchhoff
18
1
0
13 Jun 2023
Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning
Yuting Yang
Yuke Li
Binbin Du
AI4TS
25
0
0
01 Jun 2023
TAPIR: Learning Adaptive Revision for Incremental Natural Language Understanding with a Two-Pass Model
Patrick Kahardipraja
Brielen Madureira
David Schlangen
CLL
26
9
0
18 May 2023
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
Jason (Jinglun) Cai
Monica Sunkara
Xilai Li
Anshu Bhatia
Xiao Pan
S. Bodapati
26
3
0
11 May 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Yun Tang
Anna Y. Sun
H. Inaguma
Xinyue Chen
Ning Dong
Xutai Ma
Paden Tomasello
J. Pino
40
19
0
04 May 2023
Enhancing multilingual speech recognition in air traffic control by sentence-level language identification
Peng Fan
Dongyue Guo
Jianwei Zhang
Bo Yang
Yi Lin
17
6
0
29 Apr 2023
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
Xilai Li
Goeric Huybrechts
S. Ronanki
Jeffrey J. Farris
S. Bodapati
33
6
0
18 Apr 2023
Cross-utterance ASR Rescoring with Graph-based Label Propagation
Srinath Tankasala
Long Chen
A. Stolcke
A. Raju
Qianli Deng
Chander Chandak
Aparna Khare
Roland Maas
Venkatesh Ravichandran
18
0
0
27 Mar 2023
A Deliberation-based Joint Acoustic and Text Decoder
S. Mavandadi
Tara N. Sainath
Ke Hu
Zelin Wu
21
7
0
23 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
26
148
0
03 Mar 2023
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders
Yui Sudo
Muhammad Shakeel
Brian Yan
Jiatong Shi
Shinji Watanabe
17
10
0
21 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
H. Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
43
51
0
15 Dec 2022
E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model
W. R. Huang
Shuo-yiin Chang
Tara N. Sainath
Yanzhang He
David Rybach
R. David
Rohit Prabhavalkar
Cyril Allauzen
Cal Peyser
Trevor Strohman
35
7
0
28 Nov 2022
A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Yifan Peng
Siddhant Arora
Yosuke Higuchi
Yushi Ueda
Sujay S. Kumar
Karthik Ganesan
Siddharth Dalmia
Xuankai Chang
Shinji Watanabe
19
20
0
10 Nov 2022
BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
51
12
0
02 Nov 2022
Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition
Suyoun Kim
Ke Li
Lucas Kabela
Rongqing Huang
Jiedan Zhu
Ozlem Kalinli
Duc Le
25
8
0
31 Oct 2022
Modular Hybrid Autoregressive Transducer
Zhong Meng
Tongzhou Chen
Rohit Prabhavalkar
Yu Zhang
Gary Wang
...
Bhuvana Ramabhadran
W. R. Huang
Ehsan Variani
Yinghui Huang
Pedro J. Moreno
31
20
0
31 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
54
25
0
29 Oct 2022
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schlüter
Hermann Ney
13
8
0
26 Oct 2022
Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding
Ruchao Fan
Guoli Ye
Yashesh Gaur
Jinyu Li
11
4
0
16 Oct 2022
Scaling Up Deliberation for Multilingual ASR
Ke Hu
Bo-wen Li
Tara N. Sainath
LRM
23
9
0
11 Oct 2022
CTC Alignments Improve Autoregressive Translation
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
44
33
0
11 Oct 2022
Multi-stage Progressive Compression of Conformer Transducer for On-device Speech Recognition
Jash Rathod
Nauman Dawalatabad
Shatrughan Singh
Dhananjaya N. Gowda
17
9
0
01 Oct 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
33
9
0
24 Jul 2022
Two-Pass Low Latency End-to-End Spoken Language Understanding
Siddhant Arora
Siddharth Dalmia
Xuankai Chang
Brian Yan
A. Black
Shinji Watanabe
VLM
22
19
0
14 Jul 2022
Improving Deliberation by Text-Only and Semi-Supervised Training
Ke Hu
Tara N. Sainath
Yanzhang He
Rohit Prabhavalkar
Trevor Strohman
S. Mavandadi
Weiran Wang
26
12
0
29 Jun 2022
Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection
Tianzi Wang
Jiajun Deng
Mengzhe Geng
Zi Ye
Shoukang Hu
Yi Wang
Mingyu Cui
Zengrui Jin
Xunying Liu
Helen M. Meng
17
20
0
23 Jun 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Mingyu Cui
Jiajun Deng
Shoukang Hu
Xurong Xie
Tianzi Wang
Shujie Hu
Mengzhe Geng
Boyang Xue
Xunying Liu
Helen M. Meng
31
9
0
23 Jun 2022
E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR
W. R. Huang
Shuo-yiin Chang
David Rybach
Rohit Prabhavalkar
Tara N. Sainath
Cyril Allauzen
Cal Peyser
Zhiyun Lu
VLM
31
24
0
22 Apr 2022