A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies

14 January 2022
Florian Boyer
Yusuke Shinohara
Takaaki Ishii
H. Inaguma
Shinji Watanabe

Papers citing "A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies"

22 / 22 papers shown
Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding
Takafumi Moriya
Takanori Ashihara
Masato Mimura
Hiroshi Sato
Kohei Matsuura
Ryo Masumura
Taichi Asami
11
0
0
30 Sep 2024
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition
Keyu An
Zerui Li
Zhifu Gao
Shiliang Zhang
25
0
0
26 Sep 2024
LAHAJA: A Robust Multi-accent Benchmark for Evaluating Hindi ASR Systems
Tahir Javed
J. Nawale
Sakshi Joshi
E. George
Kaushal Bhogale
Deovrat Mehendale
Mitesh M. Khapra
AuLLM
38
1
0
21 Aug 2024
Token-Weighted RNN-T for Learning from Flawed Data
Gil Keren
Wei Zhou
Ozlem Kalinli
25
0
0
26 Jun 2024
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
Jintao Jiang
Yingbo Gao
Mohammad Zeineldeen
Zoltán Tüske
21
0
0
23 Feb 2024
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Jintao Jiang
Yingbo Gao
Zoltán Tüske
16
1
0
24 Nov 2023
Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Huaibo Zhao
Yosuke Higuchi
Yusuke Kida
Tetsuji Ogawa
Tetsunori Kobayashi
13
1
0
09 Sep 2023
Decoupled Structure for Improved Adaptability of End-to-End Models
Keqi Deng
P. Woodland
AuLLM
14
2
0
25 Aug 2023
Globally Normalising the Transducer for Streaming Speech Recognition
Rogier van Dalen
19
0
0
20 Jul 2023
A Token-Wise Beam Search Algorithm for RNN-T
Gil Keren
8
1
0
28 Feb 2023
Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition
Yusuke Shinohara
Shinji Watanabe
AI4TS
10
8
0
04 Nov 2022
Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system
Li Li
Dongxing Xu
Haoran Wei
Yanhua Long
8
2
0
03 Nov 2022
Conversation-oriented ASR with multi-look-ahead CBS architecture
Huaibo Zhao
S. Fujie
Tetsuji Ogawa
Jin Sakuma
Yusuke Kida
Tetsunori Kobayashi
6
3
0
02 Nov 2022
BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
41
12
0
02 Nov 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
35
25
0
29 Oct 2022
Foundation Transformers
Hongyu Wang
Shuming Ma
Shaohan Huang
Li Dong
Wenhui Wang
...
Barun Patra
Zhun Liu
Vishrav Chaudhary
Xia Song
Furu Wei
AI4CE
11
27
0
12 Oct 2022
When Is TTS Augmentation Through a Pivot Language Useful?
Nathaniel R. Robinson
Perez Ogayo
Swetha Gangu
David R. Mortensen
Shinji Watanabe
4
9
0
20 Jul 2022
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition
Niko Moritz
Frank Seide
Duc Le
Jay Mahadeokar
Christian Fuegen
10
8
0
19 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
17
8
0
31 Mar 2022
Sequence Transduction with Graph-based Supervision
Niko Moritz
Takaaki Hori
Shinji Watanabe
Jonathan Le Roux
11
5
0
01 Nov 2021
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition
Chao Weng
Chengzhu Yu
Jia Cui
Chunlei Zhang
Dong Yu
59
39
0
28 Nov 2019
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
174
287
0
14 Sep 2019