1709.05522
Cited By
AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline
16 September 2017
Hui Bu
Jiayu Du
Xingyu Na
Bengu Wu
Hao Zheng
CVBM
Papers citing
"AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline"
50 / 451 papers shown
SSCFormer: Push the Limit of Chunk-wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution
IEEE Signal Processing Letters (SPL), 2022
Fangyuan Wang
Bo Xu
Bo Xu
326
0
0
21 Nov 2022
Improving Noisy Student Training on Non-target Domain Data for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yu Chen
Wen Ding
Junjie Lai
267
11
0
09 Nov 2022
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Ao Zhang
F. Yu
Kaixun Huang
Linfu Xie
Longbiao Wang
Eng Siong Chng
Hui Bu
Binbin Zhang
Wei Chen
Xin Xu
200
5
0
03 Nov 2022
Towards Zero-Shot Code-Switched Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Brian Yan
Sanjeev Khudanpur
Ondřej Klejch
Preethi Jyothi
Shinji Watanabe
223
24
0
02 Nov 2022
Monolingual Recognizers Fusion for Code-switching Speech Recognition
Tongtong Song
Qiang Xu
Haoyu Lu
Longbiao Wang
Hao Shi
Yuqin Lin
Yanbing Yang
Jianwu Dang
149
4
0
02 Nov 2022
BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
337
16
0
02 Nov 2022
TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xingcheng Song
Di Wu
Zhiyong Wu
Binbin Zhang
Yuekai Zhang
Zhendong Peng
Wenpeng Li
Fuping Pan
Changbao Zhu
235
12
0
01 Nov 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
133
2
0
31 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
254
31
0
29 Oct 2022
SAN: a robust end-to-end ASR model architecture
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zeping Min
Qian Ge
Guanhua Huang
123
2
0
27 Oct 2022
V-Cloak: Intelligibility-, Naturalness- & Timbre-Preserving Real-Time Voice Anonymization
Jiangyi Deng
Fei Teng
Yanjiao Chen
Xiaofu Chen
Zhaohui Wang
Wenyuan Xu
200
33
0
27 Oct 2022
Pronunciation Generation for Foreign Language Words in Intra-Sentential Code-Switching Speech Recognition
Wei Wang
Chao Zhang
Xiao-pei Wu
79
0
0
26 Oct 2022
10 hours data is all you need
Zeping Min
Qian Ge
Zhong Li
174
3
0
24 Oct 2022
Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement
Spoken Language Technology Workshop (SLT), 2022
Shubo Lv
Yihui Fu
Yukai Jv
Linfu Xie
Weixin Zhu
Wei Rao
Yannan Wang
152
11
0
17 Oct 2022
Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding
Ruchao Fan
Guoli Ye
Yashesh Gaur
Jinyu Li
183
4
0
16 Oct 2022
LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Yan Jia
Mihee Hong
Jingyu Hou
Kailong Ren
Sifan Ma
Jin Wang
Fangzhen Peng
Yinglin Ji
Lin Yang
Junjie Wang
176
1
0
14 Oct 2022
An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Chao-Han Huck Yang
Jun Qi
Sabato Marco Siniscalchi
Chin-Hui Lee
180
4
0
12 Oct 2022
A context-aware knowledge transferring strategy for CTC-based ASR
Spoken Language Technology Workshop (SLT), 2022
Keda Lu
Kuan-Yu Chen
166
24
0
12 Oct 2022
PSVRF: Learning to restore Pitch-Shifted Voice without reference
Yangfu Li
Xiaodan Lin
Jiaxin Yang
129
0
0
06 Oct 2022
Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition
International Conference on Data Science and Advanced Analytics (DSAA), 2022
Chendong Zhao
Jianzong Wang
Wentao Wei
Xiaoyang Qu
Haoqian Wang
Jing Xiao
170
2
0
30 Sep 2022
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition
Interspeech (Interspeech), 2022
Ye Bai
Jie Li
W. Han
Hao Ni
Kaituo Xu
Zhuo Zhang
Cheng Yi
Xiaorui Wang
MoE
140
3
0
17 Sep 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Spoken Language Technology Workshop (SLT), 2022
Peng Shen
Xugang Lu
Hisashi Kawai
106
2
0
29 Jul 2022
Improving Mandarin Speech Recognition with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
230
12
0
24 Jul 2022
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition
Xun Gong
Zhikai Zhou
Y. Qian
225
6
0
15 Jul 2022
Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion
Jianchun Ma
Zhedong Zheng
Hao Fei
Feng Zheng
Tat-Seng Chua
Yi Yang
GAN
161
0
0
13 Jul 2022
CFAD: A Chinese Dataset for Fake Audio Detection
Speech Communication (Speech Commun.), 2022
Haoxin Ma
Jiangyan Yi
Chenglong Wang
Xin Yan
Jianhua Tao
Tao Wang
Shiming Wang
Ruibo Fu
174
51
0
12 Jul 2022
The HCCL System for the NIST SRE21
Interspeech (Interspeech), 2022
Zhuo Li
Runqiu Xiao
Hangting Chen
Zhenduo Zhao
Zi-qiang Zhang
Wenchao Wang
111
0
0
11 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
International Conference on Machine Learning (ICML), 2022
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
265
191
0
06 Jul 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Interspeech (Interspeech), 2022
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
Jianwu Dang
194
35
0
29 Jun 2022
TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline
Interspeech (Interspeech), 2022
Chengfei Li
Shuhao Deng
Yaoping Wang
Guangjing Wang
Y. Gong
Changbin Chen
Jinfeng Bai
142
24
0
27 Jun 2022
SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning
Interspeech (Interspeech), 2022
Zuheng Kang
Junqing Peng
Jianzong Wang
Jing Xiao
155
7
0
27 Jun 2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition
Interspeech (Interspeech), 2022
Zhifu Gao
Shiliang Zhang
Ian Mcloughlin
Zhijie Yan
235
179
0
16 Jun 2022
Residual Language Model for End-to-end Speech Recognition
Interspeech (Interspeech), 2022
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
144
11
0
15 Jun 2022
Improving CTC-based ASR Models with Gated Interlayer Collaboration
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yuting Yang
Yuke Li
Binbin Du
318
15
0
25 May 2022
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Yuting Yang
Binbin Du
Yuke Li
329
2
0
24 May 2022
PERT: A New Solution to Pinyin to Character Conversion Task
Jinghui Xiao
Qun Liu
Xin Jiang
Yuanfeng Xiong
Haiteng Wu
Zhe Zhang
98
2
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
655
442
0
21 May 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
135
31
0
20 May 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Interspeech (Interspeech), 2022
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
232
19
0
18 May 2022
One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code
Yong Dai
Duyu Tang
Liangxin Liu
Minghuan Tan
Cong Zhou
Jingquan Wang
Zhangyin Feng
Fan Zhang
Xueyu Hu
Shuming Shi
VLM
MoE
160
32
0
12 May 2022
Heterogeneous Separation Consistency Training for Adaptation of Unsupervised Speech Separation
EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process.), 2022
Jiangyu Han
Yanhua Long
133
7
0
23 Apr 2022
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction
Interspeech (Interspeech), 2022
Zifeng Zhao
Rongzhi Gu
Dongchao Yang
Jinchuan Tian
Yuexian Zou
132
2
0
15 Apr 2022
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qianying Liu
Zhuo Gong
Zhengdong Yang
Yuhang Yang
Sheng Li
...
Nobuaki Minematsu
Hao-Ming Huang
Fei Cheng
Chenhui Chu
Sadao Kurohashi
173
10
0
08 Apr 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion
Weida Liang
Lantian Li
Wenqiang Du
Dong Wang
280
0
0
08 Apr 2022
Alternate Intermediate Conditioning with Syllable-level and Character-level Targets for Japanese ASR
Spoken Language Technology Workshop (SLT), 2022
Yusuke Fujita
Tatsuya Komatsu
Yusuke Kida
210
3
0
01 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Interspeech (Interspeech), 2022
Jaesong Lee
Lukas Lee
Shinji Watanabe
285
8
0
31 Mar 2022
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational (RAMC) Speech Dataset
Interspeech (Interspeech), 2022
Zehui Yang
Yifan Chen
Lei Luo
Runyan Yang
Lingxuan Ye
...
Yaohui Jin
Qingqing Zhang
Pengyuan Zhang
Lei Xie
Yonghong Yan
146
71
0
31 Mar 2022
An Empirical Study of Language Model Integration for Transducer based Speech Recognition
Interspeech (Interspeech), 2022
Huahuan Zheng
Keyu An
Zhijian Ou
Chen Huang
Ke Ding
Guanglu Wan
193
5
0
31 Mar 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Interspeech (Interspeech), 2022
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
159
21
0
31 Mar 2022
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Keyu An
Ji Xiao
Zhijian Ou
107
3
0
31 Mar 2022