AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

16 September 2017
Hui Bu
Jiayu Du
Xingyu Na
Bengu Wu
Hao Zheng
    CVBM

Papers citing "AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline"

50 / 451 papers shown
TST: Time-Sparse Transducer for Automatic Speech Recognition
CAAI International Conference on Artificial Intelligence (ICCAI), 2023
Xiaohui Zhang
Mangui Liang
Zhengkun Tian
Jiangyan Yi
Jianhua Tao
113
0
0
17 Jul 2023
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
International Conference on Neural Information Processing (ICONIP), 2023
Zeping Min
Jinbo Wang
AuLLM
197
19
0
13 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Interspeech (Interspeech), 2023
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
234
12
0
12 Jul 2023
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
Interspeech (Interspeech), 2023
Wenxuan Wang
Guodong Ma
Yuke Li
Binbin Du
MoE
263
41
0
12 Jul 2023
Enrollment-stage Backdoor Attacks on Speaker Recognition Systems via Adversarial Ultrasound
IEEE Internet of Things Journal (IEEE IoT J.), 2023
Xinfeng Li
Junning Ze
Chen Yan
Yushi Cheng
Xiaoyu Ji
Wei Dong
AAML
196
14
0
28 Jun 2023
A Survey on Multimodal Large Language Models
National Science Review (NSR), 2023
Xinglong Mao
Chaoyou Fu
Zhengye Zhang
Ke Li
Xing Sun
Tong Xu
Enhong Chen
MLLM
LRM
458
995
0
23 Jun 2023
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Interspeech (Interspeech), 2023
Xuefei Wang
Yanhua Long
Yijie Li
Haoran Wei
189
5
0
20 Jun 2023
Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure
Weidong Ji
Shijie Zan
Guohui Zhou
Xu Wang
SyDa
186
1
0
14 Jun 2023
MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information
Interspeech (Interspeech), 2023
Jianrong Wang
Yuchen Huo
Li Liu
Tianyi Xu
Qi Li
Sen Li
155
3
0
04 Jun 2023
Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling
Interspeech (Interspeech), 2023
Ramon Sanabria
Ondřej Klejch
Hao Tang
Sharon Goldwater
138
4
0
03 Jun 2023
Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning
Interspeech (Interspeech), 2023
Yuting Yang
Yuke Li
Binbin Du
AI4TS
162
1
0
01 Jun 2023
Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech
Interspeech (Interspeech), 2023
Shashi Kant Gupta
Sushant Hiray
Prashant Kukde
190
5
0
01 Jun 2023
VILAS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ziyi Ni
Minglun Han
Feilong Chen
Linghui Meng
Jing Shi
Shuang Xu
Bo Xu
186
3
0
31 May 2023
Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning
Shuyue Stella Li
Cihan Xiao
Tianjian Li
Bismarck Odoom
124
4
0
31 May 2023
Perception and Semantic Aware Regularization for Sequential Confidence Calibration
Computer Vision and Pattern Recognition (CVPR), 2023
Zhenghua Peng
Yuanmao Luo
Tianshui Chen
Keke Xu
Shuangping Huang
AI4TS
289
4
0
31 May 2023
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification
Interspeech (Interspeech), 2023
Qing Wang
Jixun Yao
Ziqian Wang
Pengcheng Guo
Linfu Xie
AAML
149
3
0
30 May 2023
Investigating model performance in language identification: beyond simple error statistics
Interspeech (Interspeech), 2023
S. Styles
Victoria Y. H. Chua
Fei Ting Woon
Hexin Liu
Leibny Paola García Perera
Sanjeev Khudanpur
Andy W. H. Khong
Justin Dauwels
127
4
0
30 May 2023
Speaker anonymization using orthogonal Householder neural network
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
BDL
132
37
0
30 May 2023
speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Haoyu Lu
Nan Li
Tongtong Song
Longbiao Wang
Jianwu Dang
Xiaobao Wang
Shiliang Zhang
NoLa
179
5
0
29 May 2023
Bridging the Granularity Gap for Acoustic Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Chen Xu
Yuhao Zhang
Chengbo Jiao
Xiaoqian Liu
Chi Hu
Xin Zeng
Tong Xiao
Anxiang Ma
Huizhen Wang
JingBo Zhu
247
6
0
27 May 2023
DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Matías P. Pizarro
D. Kolossa
Asja Fischer
AAML
498
2
0
26 May 2023
InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition
Interspeech (Interspeech), 2023
Zhibing Lai
Tianren Zhang
Qi Liu
Xinyuan Qian
Li-Fang Wei
Songlu Chen
Feng Chen
Xu-Cheng Yin
124
5
0
24 May 2023
Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding
Interspeech (Interspeech), 2023
Tianren Zhang
Haibo Qin
Zhibing Lai
Songlu Chen
Qi Liu
Feng Chen
Xinyuan Qian
Xu-Cheng Yin
119
1
0
23 May 2023
ADD 2023: the Second Audio Deepfake Detection Challenge
Jiangyan Yi
Jianhua Tao
Ruibo Fu
Xinrui Yan
Chenglong Wang
...
Zhengqi Wen
Shan Liang
Zheng Lian
Shuai Nie
Haizhou Li
280
147
0
23 May 2023
CopyNE: Better Contextual ASR by Copying Named Entities
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shilin Zhou
Zhenghua Li
Yu Hong
Hao Fei
Zhefeng Wang
Baoxing Huai
275
13
0
22 May 2023
GNCformer Enhanced Self-attention for Automatic Speech Recognition
Junlong Li
Z. Duan
S. Li
X. Yu
G. Yang
141
1
0
22 May 2023
Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition
Interspeech (Interspeech), 2023
Hong Liu
Z. Lv
Zhijian Ou
Wenbo Zhao
Qing Xiao
210
1
0
22 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Interspeech (Interspeech), 2023
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
236
22
0
18 May 2023
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Interspeech (Interspeech), 2023
Zhifu Gao
Zerui Li
Jiaming Wang
Haoneng Luo
Xian Shi
...
Yabin Li
Lingyun Zuo
Zhihao Du
Zhangyu Xiao
Shiliang Zhang
246
110
0
18 May 2023
A Lexical-aware Non-autoregressive Transformer-based ASR Model
Interspeech (Interspeech), 2023
Chong Lin
Kuan-Yu Chen
AI4TS
122
3
0
18 May 2023
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
Interspeech (Interspeech), 2023
Xingcheng Song
Di Wu
Binbin Zhang
Zhendong Peng
Bo Dang
Fuping Pan
Zhiyong Wu
172
20
0
18 May 2023
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Feilong Chen
Minglun Han
Haozhi Zhao
Qingyang Zhang
Jing Shi
Shuang Xu
Bo Xu
MLLM
336
150
0
07 May 2023
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition
Interspeech (Interspeech), 2022
Mohan Li
R. Doddipatla
Catalin Zorila
228
0
0
24 Apr 2023
A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Ruchao Fan
Wei Chu
Peng Chang
Abeer Alwan
173
18
0
15 Apr 2023
Sim-T: Simplify the Transformer Network by Multiplexing Technique for Speech Recognition
Guangyong Wei
Zhikui Duan
Shiren Li
Guangguang Yang
Xinmei Yu
Junhua Li
198
5
0
11 Apr 2023
TransAudio: Towards the Transferable Adversarial Audio Attack via Learning Contextualized Perturbations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Qin Gege
YueFeng Chen
Xiaofeng Mao
Yao Zhu
Binyuan Hui
Xiaodan Li
Rong Zhang
Hui Xue
AAML
222
9
0
28 Mar 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
210
0
0
23 Mar 2023
Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition
International Symposium on Neural Networks (ISNN), 2023
Haoyu Tang
Zhaoyi Liu
Chang Zeng
Xinfeng Li
227
1
0
23 Mar 2023
Exploring Representation Learning for Small-Footprint Keyword Spotting
Interspeech (Interspeech), 2022
Fan Cui
Liyong Guo
Quandong Wang
Peng Gao
Yujun Wang
SSL
161
4
0
20 Mar 2023
Context-Aware Selective Label Smoothing for Calibrating Sequence Recognition Model
ACM Multimedia (MM), 2021
Shuangping Huang
Y. Luo
Zhenzhou Zhuang
Jin-Gang Yu
Mengchao He
Yongpan Wang
200
10
0
13 Mar 2023
The System Description of dun_oscar team for The ICPR MSR Challenge
Binbin Du
Rui Deng
Yingxin Zhang
136
0
0
13 Mar 2023
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Interspeech (Interspeech), 2023
Minglun Han
Feilong Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
216
17
0
30 Jan 2023
Acoustic correlates of the syllabic rhythm of speech: Modulation spectrum or local features of the temporal envelope
Neuroscience and Biobehavioral Reviews (NBR), 2022
Yuran Zhang
Jiajie Zou
Nai Ding
76
9
0
14 Jan 2023
Learning to Detect Noisy Labels Using Model-Based Features
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zhihao Wang
Zongyu Lin
Peiqi Liu
Guidong Zheng
Jun-Hao Wen
Xianxin Chen
Yujun Chen
Zhilin Yang
NoLa
238
6
0
28 Dec 2022
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
Jinze Bai
Rui Men
Han Yang
Xuancheng Ren
Kai Dang
...
Wenhang Ge
Jianxin Ma
Junyang Lin
Jingren Zhou
Chang Zhou
144
19
0
08 Dec 2022
SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2022
Yichong Leng
Xu Tan
Wenjie Liu
Kaitao Song
Rui Wang
Xiang-Yang Li
Tao Qin
Ed Lin
Tie-Yan Liu
243
19
0
02 Dec 2022
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition
Interspeech (Interspeech), 2022
Xiaohuan Zhou
Jiaming Wang
Zeyu Cui
Shiliang Zhang
Zhijie Yan
Jingren Zhou
Chang Zhou
233
13
0
29 Nov 2022
Model Extraction Attack against Self-supervised Speech Models
Tsung-Yuan Hsu
Chen-An Li
Tung-Yu Wu
Hung-yi Lee
189
1
0
29 Nov 2022
A new Speech Feature Fusion method with cross gate parallel CNN for Speaker Recognition
Jiacheng Zhang
Wenyi Yan
Ye Zhang
74
3
0
24 Nov 2022
Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kai Shen
Yichong Leng
Xuejiao Tan
Si-Qi Tang
Yuan Zhang
Wenjie Liu
Ed Lin
132
16
0
23 Nov 2022