Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard

Interspeech (Interspeech), 2020
20 January 2020
Zoltán Tüske
G. Saon
Kartik Audhkhasi
Brian Kingsbury
    BDL

Papers citing "Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard"

50 / 52 papers shown
Self-Improvement for Audio Large Language Model using Unlabeled Speech
S. Wang
Xinyuan Chen
Yao Xu
AuLLM
266
10
0
27 Jul 2025
Iterative Shallow Fusion of Backward Language Model for End-to-End Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
A. Ogawa
Takafumi Moriya
Naoyuki Kamo
Naohiro Tawara
Marc Delcroix
175
3
0
17 Oct 2023
Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers
Automatic Speech Recognition & Understanding (ASRU), 2023
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
212
0
0
11 Oct 2023
On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
349
1
0
25 Sep 2023
Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Mohammad Zeineldeen
Albert Zeyer
Ralf Schluter
Hermann Ney
AuLLM
400
14
0
15 Sep 2023
Competitive and Resource Efficient Factored Hybrid HMM Systems are Simpler Than You Think
Interspeech (Interspeech), 2023
Tina Raissi
Christoph Luscher
Moritz Gunz
Ralf Schluter
Hermann Ney
BDL
189
5
0
15 Jun 2023
End-to-End Speech Recognition: A Survey
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
361
276
0
03 Mar 2023
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Jiajun Deng
Xurong Xie
Tianzi Wang
Mingyu Cui
Boyang Xue
Zengrui Jin
Guinan Li
Shujie Hu
Xunying Liu
197
7
0
15 Feb 2023
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
297
4
0
07 Dec 2022
Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xurong Xie
Xunying Liu
Hui Chen
Hongan Wang
330
1
0
17 Nov 2022
Monotonic segmental attention for automatic speech recognition
Spoken Language Technology Workshop (SLT), 2022
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
159
11
0
26 Oct 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
International Conference on Machine Learning (ICML), 2022
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
311
199
0
06 Jul 2022
Improving the Training Recipe for a Robust Conformer-based Hybrid Model
Interspeech (Interspeech), 2022
Mohammad Zeineldeen
Jingjing Xu
Christoph Luscher
Ralf Schluter
Hermann Ney
218
21
0
26 Jun 2022
Confidence Score Based Conformer Speaker Adaptation for Speech Recognition
Interspeech (Interspeech), 2022
Jiajun Deng
Xurong Xie
Tianzi Wang
Mingyu Cui
Boyang Xue
Zengrui Jin
Mengzhe Geng
Guinan Li
Xunying Liu
Helen M. Meng
232
15
0
24 Jun 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Interspeech (Interspeech), 2022
Junhao Xu
Shoukang Hu
Xunying Liu
Helen M. Meng
MQ
270
5
0
23 Jun 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Interspeech (Interspeech), 2022
Mingyu Cui
Jiajun Deng
Shoukang Hu
Xurong Xie
Tianzi Wang
Shujie Hu
Mengzhe Geng
Boyang Xue
Xunying Liu
Helen M. Meng
197
10
0
23 Jun 2022
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization
Interspeech (Interspeech), 2022
A. Fasoli
Chia-Yu Chen
Mauricio Serrano
Swagath Venkataramani
G. Saon
Xiaodong Cui
Brian Kingsbury
K. Gopalakrishnan
MQ
208
7
0
16 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
232
16
0
07 Jun 2022
Efficient Training of Neural Transducer for Speech Recognition
Interspeech (Interspeech), 2022
Wei Zhou
Wilfried Michel
Ralf Schluter
Hermann Ney
AI4TS
245
28
0
22 Apr 2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Interspeech (Interspeech), 2022
Vishal Sunder
Eric Fosler-Lussier
Samuel Thomas
H. Kuo
Brian Kingsbury
438
8
0
11 Apr 2022
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems
Interspeech (Interspeech), 2022
Takuma Udagawa
Masayuki Suzuki
Gakuto Kurata
N. Itoh
G. Saon
366
30
0
01 Apr 2022
Improving End-to-End Models for Set Prediction in Spoken Language Understanding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
H. Kuo
Zoltán Tüske
Samuel Thomas
Brian Kingsbury
G. Saon
154
0
0
28 Jan 2022
Improving Factored Hybrid HMM Acoustic Modeling without State Tying
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Tina Raissi
Eugen Beck
Ralf Schluter
Hermann Ney
306
5
0
24 Jan 2022
Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Shou-Yong Hu
Xurong Xie
Mingyu Cui
Jiajun Deng
Shansong Liu
Jianwei Yu
Mengzhe Geng
Xunying Liu
Helen Meng
310
28
0
08 Jan 2022
Robust Self-Supervised Audio-Visual Speech Recognition
Interspeech (Interspeech), 2022
Bowen Shi
Wei-Ning Hsu
Abdel-rahman Mohamed
411
123
0
05 Jan 2022
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Junhao Xu
Jianwei Yu
Shoukang Hu
Xunying Liu
Helen Meng
MQ
317
20
0
29 Nov 2021
Conformer-based Hybrid ASR System for Switchboard Dataset
Mohammad Zeineldeen
Jingjing Xu
Christoph Luscher
Wilfried Michel
Alexander Gerstenberger
Ralf Schluter
Hermann Ney
344
27
0
05 Nov 2021
On Language Model Integration for RNN Transducer based Speech Recognition
Wei Zhou
Zuoyun Zheng
Ralf Schluter
Hermann Ney
336
27
0
13 Oct 2021
ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization
Automatic Speech Recognition & Understanding (ASRU), 2021
M. Gaudesi
F. Weninger
D. Sharma
P. Zhan
AAML
208
1
0
23 Sep 2021
Dual-Encoder Architecture with Encoder Selection for Joint Close-Talk and Far-Talk Speech Recognition
F. Weninger
M. Gaudesi
Ralf Leibold
R. Gemello
P. Zhan
176
4
0
17 Sep 2021
4-bit Quantization of LSTM-based Speech Recognition Models
Interspeech (Interspeech), 2021
A. Fasoli
Chia-Yu Chen
Mauricio Serrano
Xiao Sun
Naigang Wang
...
Xiaodong Cui
Brian Kingsbury
Wei Zhang
Zoltán Tüske
K. Gopalakrishnan
MQ
183
24
0
27 Aug 2021
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation
Samuel Cahyawijaya
228
12
0
24 Aug 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers
Interspeech (Interspeech), 2021
Xiaodong Cui
Brian Kingsbury
G. Saon
David Haws
Zoltán Tüske
155
6
0
24 Aug 2021
Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
O. Kimball
229
4
0
14 Jun 2021
On the limit of English conversational speech recognition
Interspeech (Interspeech), 2021
Zoltán Tüske
G. Saon
Brian Kingsbury
299
54
0
03 May 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Interspeech (Interspeech), 2021
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
199
40
0
19 Apr 2021
Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition
Interspeech (Interspeech), 2021
Wei Zhou
Mohammad Zeineldeen
Zuoyun Zheng
Ralf Schluter
Hermann Ney
300
14
0
19 Apr 2021
Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept
Interspeech (Interspeech), 2021
Wei Zhou
Albert Zeyer
André Merboldt
Ralf Schluter
Hermann Ney
257
6
0
13 Apr 2021
Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models
Interspeech (Interspeech), 2021
Mohammad Zeineldeen
Aleksandr Glushko
Wilfried Michel
Albert Zeyer
Ralf Schluter
Hermann Ney
AuLLM
244
45
0
12 Apr 2021
Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures
Automatic Speech Recognition & Understanding (ASRU), 2021
Nick Rossenbach
Mohammad Zeineldeen
Benedikt Hilmes
Ralf Schluter
Hermann Ney
243
12
0
12 Apr 2021
Towards Consistent Hybrid HMM Acoustic Modeling
Tina Raissi
Eugen Beck
Ralf Schluter
Hermann Ney
404
5
0
06 Apr 2021
A study of latent monotonic attention variants
Albert Zeyer
Ralf Schluter
Hermann Ney
307
5
0
30 Mar 2021
Residual Energy-Based Models for End-to-End Speech Recognition
Interspeech (Interspeech), 2021
Qiujia Li
Yu Zhang
Yue Liu
Liangliang Cao
P. Woodland
229
15
0
25 Mar 2021
Advancing RNN Transducer Technology for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
G. Saon
Zoltan Tueske
Daniel Bolaños
Brian Kingsbury
294
104
0
17 Mar 2021
End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Wangyou Zhang
Christoph Boeddeker
Shinji Watanabe
Tomohiro Nakatani
Marc Delcroix
K. Kinoshita
Tsubasa Ochiai
Naoyuki Kamo
Reinhold Haeb-Umbach
Y. Qian
172
37
0
23 Feb 2021
Bayesian Learning for Deep Neural Network Adaptation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Xurong Xie
Xunying Liu
Tan Lee
Lan Wang
BDL
530
27
0
14 Dec 2020
Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Wei Zhou
Simon Berger
Ralf Schluter
Hermann Ney
374
34
0
30 Oct 2020
Super-Human Performance in Online Low-latency Recognition of Conversational Speech
T. Nguyen
S. Stueker
A. Waibel
BDL
434
43
0
07 Oct 2020
End-to-End Spoken Language Understanding Without Full Transcripts
H. Kuo
Zoltán Tüske
Samuel Thomas
Yinghui Huang
Kartik Audhkhasi
Brian Kingsbury
Gakuto Kurata
Zvi Kons
R. Hoory
Luis A. Lastras
AuLLM
232
28
0
30 Sep 2020
Semi-Supervised Learning with Data Augmentation for End-to-End ASR
Interspeech (Interspeech), 2020
F. Weninger
F. Mana
R. Gemello
Jesús Andrés-Ferrer
P. Zhan
265
32
0
27 Jul 2020