arXiv: 2201.10240
Improving the fusion of acoustic and text representations in RNN-T

25 January 2022
Chao Zhang
Bo-wen Li
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
    AI4CE

Papers citing "Improving the fusion of acoustic and text representations in RNN-T"

13 / 13 papers shown
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
Tian-Hao Zhang
Dinghao Zhou
Guiping Zhong
Jiaming Zhou
Baoxiang Li
10
3
0
26 Jul 2023
Improving RNN-Transducers with Acoustic LookAhead
Vinit Unni
Ashish R. Mittal
P. Jyothi
Sunita Sarawagi
11
2
0
11 Jul 2023
Multi-View Frequency-Attention Alternative to CNN Frontends for Automatic Speech Recognition
Belen Alastruey
Lukas Drude
Jahn Heymann
Simon Wiesler
12
1
0
12 Jun 2023
Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
C. Zhang
P. Woodland
11
4
0
30 May 2023
Diagonal State Space Augmented Transformers for Speech Recognition
G. Saon
Ankit Gupta
Xiaodong Cui
AI4TS
22
26
0
27 Feb 2023
UML: A Universal Monolingual Output Layer for Multilingual ASR
Chaoyang Zhang
Bo-wen Li
Tara N. Sainath
Trevor Strohman
Shuo-yiin Chang
13
7
0
22 Feb 2023
Contextual-Utterance Training for Automatic Speech Recognition
Alejandro Gomez-Alanis
Lukas Drude
A. Schwarz
R. Swaminathan
Simon Wiesler
11
1
0
27 Oct 2022
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
C. Zhang
P. Woodland
14
14
0
18 May 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
17
26
0
11 Apr 2022
Adaptive Discounting of Implicit Language Models in RNN-Transducers
Vinit Unni
Shreya Khare
Ashish R. Mittal
P. Jyothi
Sunita Sarawagi
Samarth Bharadwaj
12
3
0
21 Feb 2022
Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
Naoyuki Kanda
Yashesh Gaur
S. Parthasarathy
Eric Sun
Liang Lu
Xie Chen
Jinyu Li
Y. Gong
AuLLM
19
52
0
02 Feb 2021
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data
Thibault Doutre
Wei Han
Min Ma
Zhiyun Lu
Chung-Cheng Chiu
Ruoming Pang
A. Narayanan
Ananya Misra
Yu Zhang
Liangliang Cao
52
22
0
22 Oct 2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
49
168
0
21 Oct 2020