2111.01690
Cited By
Recent Advances in End-to-End Automatic Speech Recognition
2 November 2021
Jinyu Li
VLM
Papers citing "Recent Advances in End-to-End Automatic Speech Recognition"
50 / 55 papers shown
Transfer Learning-Based Deep Residual Learning for Speech Recognition in Clean and Noisy Environments
Noussaiba Djeffal
Djamel Addou
Hamza Kheddar
Sid Ahmed Selouani
28
1
0
02 May 2025
Towards Hardware Supported Domain Generalization in DNN-Based Edge Computing Devices for Health Monitoring
Johnson Loh
Lyubov Dudchenko
Justus Viga
T. Gemmeke
57
0
0
12 Mar 2025
Retrieval-Augmented Speech Recognition Approach for Domain Challenges
Peng Shen
Xugang Lu
Hisashi Kawai
RALM
60
0
0
24 Feb 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
31
0
0
06 Feb 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Erik Cambria
LM&MA
AILaw
93
153
0
28 Jan 2025
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
Kai-Tuo Xu
Feng-Long Xie
Xu Tang
Yao Hu
69
4
0
24 Jan 2025
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding
J. Hu
Zuchao Li
Mengjia Shen
Haojun Ai
Sheng Li
Jun Zhang
31
0
0
20 Jan 2025
Uncovering the Visual Contribution in Audio-Visual Speech Recognition
Zhaofeng Lin
Naomi Harte
78
1
0
20 Jan 2025
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
Hao Shi
Yuan Gao
Zhaoheng Ni
Tatsuya Kawahara
30
1
0
01 Sep 2024
Advancing Multi-talker ASR Performance with Large Language Models
Mohan Shi
Zengrui Jin
Yaoxun Xu
Yong Xu
Shi-Xiong Zhang
Kun Wei
Yiwen Shao
Chunlei Zhang
Dong Yu
29
0
0
30 Aug 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
36
2
0
09 Jun 2024
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR
Zheshu Song
Jianheng Zhuo
Yifan Yang
Ziyang Ma
Shixiong Zhang
Xie Chen
29
9
0
07 Jun 2024
ecVoice: Audio Text Extraction and Optimization of Video Based on Idioms Similarity Replacement
Jinwei Lin
39
0
0
20 May 2024
Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition
Peng Shen
Xugang Lu
Hisashi Kawai
25
1
0
18 Dec 2023
Massive End-to-end Models for Short Search Queries
Weiran Wang
Rohit Prabhavalkar
Dongseong Hwang
Qiujia Li
K. Sim
...
Zhong Meng
CJ Zheng
Yanzhang He
Tara N. Sainath
P. M. Mengibar
22
2
0
22 Sep 2023
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling
Zheng Nan
T. Dang
V. Sethu
Beena Ahmed
BDL
13
2
0
21 Sep 2023
Semi-Autoregressive Streaming ASR With Label Context
Siddhant Arora
G. Saon
Shinji Watanabe
Brian Kingsbury
AI4TS
23
5
0
19 Sep 2023
Bornil: An open-source sign language data crowdsourcing platform for AI enabled dialect-agnostic communication
Shahriar Elahi Dhruvo
Mohammad Akhlaqur Rahman
M. Mandal
Md. Istiak Hossain Shihab
A. A. N. Ansary
...
Sejuti Rahman
Sayma Sultana Chowdhury
Sabbir Ahmed Chowdhury
Farig Sadeque
Asif Sushmit
18
1
0
29 Aug 2023
Improving Continuous Sign Language Recognition with Cross-Lingual Signs
Fangyun Wei
Yutong Chen
SLR
20
28
0
21 Aug 2023
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction
Jinchuan Tian
Jianwei Yu
Hangting Chen
Brian Yan
Chao Weng
Dong Yu
Shinji Watanabe
28
1
0
19 Aug 2023
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition
Hanjing Zhu
Dongji Gao
Gaofeng Cheng
Daniel Povey
Pengyuan Zhang
Yonghong Yan
NoLa
25
4
0
12 Aug 2023
Timestamped Embedding-Matching Acoustic-to-Word CTC ASR
Woojay Jeon
19
0
0
20 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
26
9
0
18 Jun 2023
Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer
Lu Huang
B. Li
Jun Zhang
Lu Lu
Zejun Ma
21
2
0
07 Jun 2023
Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator
Lingwei Meng
Jiawen Kang
Mingyu Cui
Haibin Wu
Xixin Wu
Helen M. Meng
31
10
0
25 May 2023
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
35
5
0
19 May 2023
Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
Eric Sun
Jinyu Li
Yuxuan Hu
Yilun Zhu
Long Zhou
...
Peidong Wang
Linquan Liu
Shujie Liu
Ed Lin
Yifan Gong
26
6
0
01 Mar 2023
Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator
Vladimir Bataev
Roman Korostik
Evgeny Shabalin
Vitaly Lavrukhin
Boris Ginsburg
VLM
23
14
0
27 Feb 2023
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Jiajun Deng
Xurong Xie
Tianzi Wang
Mingyu Cui
Boyang Xue
Zengrui Jin
Guinan Li
Shujie Hu
Xunying Liu
26
5
0
15 Feb 2023
A Weakly-Supervised Streaming Multilingual Speech Model with Truly Zero-Shot Capability
Jian Xue
Peidong Wang
Jinyu Li
Eric Sun
19
10
0
04 Nov 2022
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
P. Swietojanski
Stefan Braun
Dogan Can
Thiago Fraga da Silva
Arnab Ghoshal
...
Henry Mason
Erik McDermott
Honza Silovsky
R. Travadi
Xiaodan Zhuang
20
13
0
02 Nov 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
19
2
0
31 Oct 2022
Can Visual Context Improve Automatic Speech Recognition for an Embodied Agent?
Pradip Pramanick
Chayan Sarkar
11
7
0
21 Oct 2022
Towards Personalization of CTC Speech Recognition Models with Contextual Adapters and Adaptive Boosting
Saket Dingliwal
Monica Sunkara
S. Bodapati
S. Ronanki
Jeffrey J. Farris
Katrin Kirchhoff
25
0
0
18 Oct 2022
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
13
16
0
12 Sep 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
11
3
0
29 Jul 2022
Improving Deliberation by Text-Only and Semi-Supervised Training
Ke Hu
Tara N. Sainath
Yanzhang He
Rohit Prabhavalkar
Trevor Strohman
S. Mavandadi
Weiran Wang
19
12
0
29 Jun 2022
Mask scalar prediction for improving robust automatic speech recognition
A. Narayanan
James Walker
S. Panchapagesan
N. Howard
Yuma Koizumi
11
4
0
26 Apr 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
19
26
0
11 Apr 2022
Transducer-based language embedding for spoken language identification
Peng Shen
Xugang Lu
Hisashi Kawai
42
6
0
08 Apr 2022
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data
Junyi Ao
Zi-Hua Zhang
Long Zhou
Shujie Liu
Haizhou Li
Tom Ko
Lirong Dai
Jinyu Li
Yao Qian
Furu Wei
SSL
11
19
0
31 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
13
93
0
29 Mar 2022
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge
Fan Yu
Shiliang Zhang
Pengcheng Guo
Yihui Fu
Zhihao Du
...
Kong Aik Lee
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
13
28
0
08 Feb 2022
Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition
Rong Gong
Carl Quillen
D. Sharma
Andrew Goderre
José Laínez
Ljubomir Milanović
24
13
0
10 Sep 2021
A Configurable Multilingual Model is All You Need to Recognize All Languages
Long Zhou
Jinyu Li
Eric Sun
Shujie Liu
92
40
0
13 Jul 2021
Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition
Yosuke Kashiwagi
E. Tsunoo
Shinji Watanabe
AI4TS
19
7
0
18 Feb 2021
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
79
224
0
12 Feb 2021
Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
Naoyuki Kanda
Yashesh Gaur
S. Parthasarathy
Eric Sun
Liang Lu
Xie Chen
Jinyu Li
Y. Gong
AuLLM
23
52
0
02 Feb 2021
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
41
35
0
12 Dec 2020
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data
Thibault Doutre
Wei Han
Min Ma
Zhiyun Lu
Chung-Cheng Chiu
Ruoming Pang
A. Narayanan
Ananya Misra
Yu Zhang
Liangliang Cao
52
22
0
22 Oct 2020