arXiv:1709.05522
AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline
16 September 2017
Hui Bu
Jiayu Du
Xingyu Na
Bengu Wu
Hao Zheng
CVBM
Papers citing
"AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline"
50 / 451 papers shown
TST: Time-Sparse Transducer for Automatic Speech Recognition
CAAI International Conference on Artificial Intelligence (ICCAI), 2023
Xiaohui Zhang
Mangui Liang
Zhengkun Tian
Jiangyan Yi
Jianhua Tao
113
0
0
17 Jul 2023
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
International Conference on Neural Information Processing (ICONIP), 2023
Zeping Min
Jinbo Wang
AuLLM
197
19
0
13 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Interspeech (Interspeech), 2023
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
234
12
0
12 Jul 2023
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
Interspeech (Interspeech), 2023
Wenxuan Wang
Guodong Ma
Yuke Li
Binbin Du
MoE
263
41
0
12 Jul 2023
Enrollment-stage Backdoor Attacks on Speaker Recognition Systems via Adversarial Ultrasound
IEEE Internet of Things Journal (IEEE IoT J.), 2023
Xinfeng Li
Junning Ze
Chen Yan
Yushi Cheng
Xiaoyu Ji
Wei Dong
AAML
196
14
0
28 Jun 2023
A Survey on Multimodal Large Language Models
National Science Review (NSR), 2023
Xinglong Mao
Chaoyou Fu
Zhengye Zhang
Ke Li
Xing Sun
Tong Xu
Enhong Chen
MLLM
LRM
458
995
0
23 Jun 2023
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Interspeech (Interspeech), 2023
Xuefei Wang
Yanhua Long
Yijie Li
Haoran Wei
189
5
0
20 Jun 2023
Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure
Weidong Ji
Shijie Zan
Guohui Zhou
Xu Wang
SyDa
186
1
0
14 Jun 2023
MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information
Interspeech (Interspeech), 2023
Jianrong Wang
Yuchen Huo
Li Liu
Tianyi Xu
Qi Li
Sen Li
155
3
0
04 Jun 2023
Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling
Interspeech (Interspeech), 2023
Ramon Sanabria
Ondřej Klejch
Hao Tang
Sharon Goldwater
138
4
0
03 Jun 2023
Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning
Interspeech (Interspeech), 2023
Yuting Yang
Yuke Li
Binbin Du
AI4TS
162
1
0
01 Jun 2023
Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech
Interspeech (Interspeech), 2023
Shashi Kant Gupta
Sushant Hiray
Prashant Kukde
190
5
0
01 Jun 2023
VILAS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ziyi Ni
Minglun Han
Feilong Chen
Linghui Meng
Jing Shi
Shuang Xu
Bo Xu
186
3
0
31 May 2023
Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning
Shuyue Stella Li
Cihan Xiao
Tianjian Li
Bismarck Odoom
124
4
0
31 May 2023
Perception and Semantic Aware Regularization for Sequential Confidence Calibration
Computer Vision and Pattern Recognition (CVPR), 2023
Zhenghua Peng
Yuanmao Luo
Tianshui Chen
Keke Xu
Shuangping Huang
AI4TS
289
4
0
31 May 2023
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification
Interspeech (Interspeech), 2023
Qing Wang
Jixun Yao
Ziqian Wang
Pengcheng Guo
Linfu Xie
AAML
149
3
0
30 May 2023
Investigating model performance in language identification: beyond simple error statistics
Interspeech (Interspeech), 2023
S. Styles
Victoria Y. H. Chua
Fei Ting Woon
Hexin Liu
Leibny Paola García Perera
Sanjeev Khudanpur
Andy W. H. Khong
Justin Dauwels
127
4
0
30 May 2023
Speaker anonymization using orthogonal Householder neural network
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
BDL
132
37
0
30 May 2023
Speech and Noise Dual-Stream Spectrogram Refine Network with Speech Distortion Loss for Robust Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Haoyu Lu
Nan Li
Tongtong Song
Longbiao Wang
Jianwu Dang
Xiaobao Wang
Shiliang Zhang
NoLa
179
5
0
29 May 2023
Bridging the Granularity Gap for Acoustic Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Chen Xu
Yuhao Zhang
Chengbo Jiao
Xiaoqian Liu
Chi Hu
Xin Zeng
Tong Xiao
Anxiang Ma
Huizhen Wang
JingBo Zhu
247
6
0
27 May 2023
DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Matías P. Pizarro
D. Kolossa
Asja Fischer
AAML
498
2
0
26 May 2023
InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition
Interspeech (Interspeech), 2023
Zhibing Lai
Tianren Zhang
Qi Liu
Xinyuan Qian
Li-Fang Wei
Songlu Chen
Feng Chen
Xu-Cheng Yin
124
5
0
24 May 2023
Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding
Interspeech (Interspeech), 2023
Tianren Zhang
Haibo Qin
Zhibing Lai
Songlu Chen
Qi Liu
Feng Chen
Xinyuan Qian
Xu-Cheng Yin
119
1
0
23 May 2023
ADD 2023: the Second Audio Deepfake Detection Challenge
Jiangyan Yi
Jianhua Tao
Ruibo Fu
Xinrui Yan
Chenglong Wang
...
Zhengqi Wen
Shan Liang
Zheng Lian
Shuai Nie
Haizhou Li
280
147
0
23 May 2023
CopyNE: Better Contextual ASR by Copying Named Entities
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shilin Zhou
Zhenghua Li
Yu Hong
Hao Fei
Zhefeng Wang
Baoxing Huai
275
13
0
22 May 2023
GNCformer Enhanced Self-attention for Automatic Speech Recognition
Junlong Li
Z. Duan
S. Li
X. Yu
G. Yang
141
1
0
22 May 2023
Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition
Interspeech (Interspeech), 2023
Hong Liu
Z. Lv
Zhijian Ou
Wenbo Zhao
Qing Xiao
210
1
0
22 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Interspeech (Interspeech), 2023
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
236
22
0
18 May 2023
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Interspeech (Interspeech), 2023
Zhifu Gao
Zerui Li
Jiaming Wang
Haoneng Luo
Xian Shi
...
Yabin Li
Lingyun Zuo
Zhihao Du
Zhangyu Xiao
Shiliang Zhang
246
110
0
18 May 2023
A Lexical-aware Non-autoregressive Transformer-based ASR Model
Interspeech (Interspeech), 2023
Chong Lin
Kuan-Yu Chen
AI4TS
122
3
0
18 May 2023
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
Interspeech (Interspeech), 2023
Xingcheng Song
Di Wu
Binbin Zhang
Zhendong Peng
Bo Dang
Fuping Pan
Zhiyong Wu
172
20
0
18 May 2023
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Feilong Chen
Minglun Han
Haozhi Zhao
Qingyang Zhang
Jing Shi
Shuang Xu
Bo Xu
MLLM
336
150
0
07 May 2023
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition
Interspeech (Interspeech), 2022
Mohan Li
R. Doddipatla
Catalin Zorila
228
0
0
24 Apr 2023
A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Ruchao Fan
Wei Chu
Peng Chang
Abeer Alwan
173
18
0
15 Apr 2023
Sim-T: Simplify the Transformer Network by Multiplexing Technique for Speech Recognition
Guangyong Wei
Zhikui Duan
Shiren Li
Guangguang Yang
Xinmei Yu
Junhua Li
198
5
0
11 Apr 2023
TransAudio: Towards the Transferable Adversarial Audio Attack via Learning Contextualized Perturbations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Qin Gege
YueFeng Chen
Xiaofeng Mao
Yao Zhu
Binyuan Hui
Xiaodan Li
Rong Zhang
Hui Xue
AAML
222
9
0
28 Mar 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
210
0
0
23 Mar 2023
Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition
International Symposium on Neural Networks (ISNN), 2023
Haoyu Tang
Zhaoyi Liu
Chang Zeng
Xinfeng Li
227
1
0
23 Mar 2023
Exploring Representation Learning for Small-Footprint Keyword Spotting
Interspeech (Interspeech), 2022
Fan Cui
Liyong Guo
Quandong Wang
Peng Gao
Yujun Wang
SSL
161
4
0
20 Mar 2023
Context-Aware Selective Label Smoothing for Calibrating Sequence Recognition Model
ACM Multimedia (MM), 2021
Shuangping Huang
Y. Luo
Zhenzhou Zhuang
Jin-Gang Yu
Mengchao He
Yongpan Wang
200
10
0
13 Mar 2023
The System Description of dun_oscar team for The ICPR MSR Challenge
Binbin Du
Rui Deng
Yingxin Zhang
136
0
0
13 Mar 2023
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Interspeech (Interspeech), 2023
Minglun Han
Feilong Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
216
17
0
30 Jan 2023
Acoustic correlates of the syllabic rhythm of speech: Modulation spectrum or local features of the temporal envelope
Neuroscience and Biobehavioral Reviews (NBR), 2022
Yuran Zhang
Jiajie Zou
Nai Ding
76
9
0
14 Jan 2023
Learning to Detect Noisy Labels Using Model-Based Features
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zhihao Wang
Zongyu Lin
Peiqi Liu
Guidong Zheng
Jun-Hao Wen
Xianxin Chen
Yujun Chen
Zhilin Yang
NoLa
238
6
0
28 Dec 2022
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
Jinze Bai
Rui Men
Han Yang
Xuancheng Ren
Kai Dang
...
Wenhang Ge
Jianxin Ma
Junyang Lin
Jingren Zhou
Chang Zhou
144
19
0
08 Dec 2022
SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2022
Yichong Leng
Xu Tan
Wenjie Liu
Kaitao Song
Rui Wang
Xiang-Yang Li
Tao Qin
Ed Lin
Tie-Yan Liu
243
19
0
02 Dec 2022
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition
Interspeech (Interspeech), 2022
Xiaohuan Zhou
Jiaming Wang
Zeyu Cui
Shiliang Zhang
Zhijie Yan
Jingren Zhou
Chang Zhou
233
13
0
29 Nov 2022
Model Extraction Attack against Self-supervised Speech Models
Tsung-Yuan Hsu
Chen-An Li
Tung-Yu Wu
Hung-yi Lee
189
1
0
29 Nov 2022
A new Speech Feature Fusion method with cross gate parallel CNN for Speaker Recognition
Jiacheng Zhang
Wenyi Yan
Ye Zhang
74
3
0
24 Nov 2022
Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kai Shen
Yichong Leng
Xuejiao Tan
Si-Qi Tang
Yuan Zhang
Wenjie Liu
Ed Lin
132
16
0
23 Nov 2022