RETURNN: The RWTH Extensible Training framework for Universal Recurrent Neural Networks

2 August 2016

Papers citing "RETURNN: The RWTH Extensible Training framework for Universal Recurrent Neural Networks"

A Comparative Analysis on ASR System Combination for Attention, CTC, Factored Hybrid, and Transducer Models

184

13 Aug 2025

Analysis of Domain Shift across ASR Architectures via TTS-Enabled Separation of Target Domain and Acoustic Conditions

Tina Raissi

Nick Rossenbach

Ralf Schluter

159

13 Aug 2025

Analyzing the Importance of Blank for CTC-Based Knowledge Distillation

Benedikt Hilmes

Nick Rossenbach

Ralf Schluter

307

02 Jun 2025

MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder

413

21 Sep 2024

Investigating the Effect of Label Topology and Training Criterion on ASR Performance and Alignment Quality

259

16 Jul 2024

On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition

Automatic Speech Recognition & Understanding (ASRU), 2023

Nick Rossenbach

Benedikt Hilmes

Ralf Schluter

264

12 Oct 2023

End-to-End Training of a Neural HMM with Label and Transition Probabilities

Automatic Speech Recognition & Understanding (ASRU), 2023

290

04 Oct 2023

Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project

Khai-Nguyen Nguyen

265

26 Sep 2023

End-to-End Speech Recognition: A Survey

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

362

276

03 Mar 2023

Efficient Utilization of Large Pre-Trained Models for Low Resource ASR

340

26 Oct 2022

Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech

365

24 Oct 2022

AppTek's Submission to the IWSLT 2022 Isometric Spoken Language Translation Task

International Workshop on Spoken Language Translation (IWSLT), 2022

P. Wilken

E. Matusov

168

12 May 2022

Recent Advances in End-to-End Automatic Speech Recognition

APSIPA Transactions on Signal and Information Processing (TASIP), 2021

Jinyu Li

VLM

570

448

02 Nov 2021

Automatic Learning of Subword Dependent Model Scales

18 Oct 2021

Differentiable Allophone Graphs for Language-Universal Speech Recognition

Interspeech (Interspeech), 2021

275

24 Jul 2021

Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept

Interspeech (Interspeech), 2021

259

13 Apr 2021

Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures

Automatic Speech Recognition & Understanding (ASRU), 2021

245

12 Apr 2021

Early Stage LM Integration Using Local and Global Log-Linear Combination

Wilfried Michel

Ralf Schluter

Hermann Ney

231

20 May 2020

The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

218

02 Apr 2020

Attention based on-device streaming speech recognition with large speech corpus

Automatic Speech Recognition & Understanding (ASRU), 2019

Kwangyoun Kim

...

216

02 Jan 2020

Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models

Automatic Speech Recognition & Understanding (ASRU), 2019

Kwangyoun Kim

149

28 Dec 2019

Power-law nonlinearity with maximally uniform distribution criterion for improved neural network training in automatic speech recognition

Automatic Speech Recognition & Understanding (ASRU), 2019

Chanwoo Kim

Mehul Kumar

Kwangyoun Kim

Dhananjaya N. Gowda

188

22 Dec 2019

End-to-end training of a large vocabulary end-to-end speech recognition system

Automatic Speech Recognition & Understanding (ASRU), 2019

Kwangyoun Kim

...

202

22 Dec 2019

Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019

306

19 Dec 2019

On Using SpecAugment for End-to-End Speech Translation

International Workshop on Spoken Language Translation (IWSLT), 2019

244

20 Nov 2019

uniblock: Scoring and Filtering Corpus with Unicode Block Information

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

Yingbo Gao

Weiyue Wang

Hermann Ney

193

26 Aug 2019

Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech

Interspeech (Interspeech), 2019

474

09 May 2019

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation

Interspeech (Interspeech), 2019

537

240

08 May 2019

RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition

Albert Zeyer

Tamer Alkhouli

Hermann Ney

365

14 May 2018

Improved training of end-to-end attention models for speech recognition

253

280

08 May 2018

A comprehensive study of batch construction strategies for recurrent neural networks in MXNet

P. Doetsch

Pavel Golik

Hermann Ney

162

05 May 2017

Learning to detect and localize many objects from few examples

Bastien Moysset

Christopher Kermorvant

Christian Wolf

ObjD

174

17 Nov 2016

A Comprehensive Study of Deep Bidirectional LSTM RNNs for Acoustic Modeling in Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016

177

174

22 Jun 2016