1709.05522
AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline
16 September 2017
Hui Bu
Jiayu Du
Xingyu Na
Bengu Wu
Hao Zheng
CVBM
Papers citing
"AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline"
50 / 451 papers shown
Large Language Model Should Understand Pinyin for Chinese ASR Error Correction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Yuang Li
Xiaosong Qiao
Xiaofeng Zhao
Huan Zhao
Wei Tang
Min Zhang
Hao Yang
168
10
0
20 Sep 2024
A quest through interconnected datasets: lessons from highly-cited ICASSP papers
International Conference on Content-Based Multimedia Indexing (CBMI), 2024
Cynthia C. S. Liem
Doğa Taşcılar
Andrew M. Demetriou
191
0
0
19 Sep 2024
NDVQ: Robust Neural Audio Codec with Normal Distribution-Based Vector Quantization
Spoken Language Technology Workshop (SLT), 2024
Zhikang Niu
Sanyuan Chen
Long Zhou
Ziyang Ma
Xie Chen
Shujie Liu
119
4
0
19 Sep 2024
Optimizing Dysarthria Wake-Up Word Spotting: An End-to-End Approach for SLT 2024 LRDWWS Challenge
Spoken Language Technology Workshop (SLT), 2024
Shuiyun Liu
Yuxiang Kong
Pengcheng Guo
Weiji Zhuang
Peng Gao
Yujun Wang
Lei Xie
295
1
0
16 Sep 2024
ASR Error Correction using Large Language Models
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
KELM
300
21
0
14 Sep 2024
Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xiaoyu Liu
Xu Li
Joan Serrà
Santiago Pascual
253
5
0
14 Sep 2024
DualSep: A Light-weight dual-encoder convolutional recurrent network for real-time in-car speech separation
Spoken Language Technology Workshop (SLT), 2024
Ziqian Wang
Jiayao Sun
Zihan Zhang
Xingchen Li
Jie Liu
Lei Xie
238
3
0
13 Sep 2024
LA-RAG: Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation
Shaojun Li
Hengchao Shang
Daimeng Wei
Jiaxin Guo
Zongyao Li
Xianghui He
Min Zhang
Hao Yang
277
6
0
13 Sep 2024
An Effective Context-Balanced Adaptation Approach for Long-Tailed Speech Recognition
Spoken Language Technology Workshop (SLT), 2024
Yi-Cheng Wang
Li-Ting Pai
Bi-Cheng Yan
Hsin-Wei Wang
Chi-Han Lin
Berlin Chen
178
2
0
10 Sep 2024
VoiceWukong: Benchmarking Deepfake Voice Detection
Ziwei Yan
Yanjie Zhao
Haoyu Wang
331
4
0
10 Sep 2024
Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
Spoken Language Technology Workshop (SLT), 2024
Hongfei Xue
Rong Gong
Mingchen Shao
Xin Xu
Lezhi Wang
...
Yong Qin
Jun Du
Ming Li
Binbin Zhang
Bin Jia
182
5
0
09 Sep 2024
Lightweight Transducer Based on Frame-Level Criterion
Interspeech (Interspeech), 2024
Genshun Wan
Mengzhi Wang
Tingzhi Mao
Hang Chen
Z. Ye
233
1
0
05 Sep 2024
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Interspeech (Interspeech), 2024
Zengrui Jin
Yifan Yang
Mohan Shi
Wei Kang
Xiaoyu Yang
...
Lingwei Meng
Long Lin
Yong Xu
Shi-Xiong Zhang
Daniel Povey
192
6
0
01 Sep 2024
Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper
Interspeech (Interspeech), 2024
Tianyi Xu
Kaixun Huang
Pengcheng Guo
Can Ma
Longtao Huang
Hui Xue
Lei Xie
CLL
150
11
0
20 Aug 2024
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition
Interspeech (Interspeech), 2024
Yangze Li
Xiong Wang
Songjun Cao
Yike Zhang
Long Ma
Lei Xie
AuLLM
218
8
0
18 Aug 2024
ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild
Jiangyan Yi
Chu Yuan Zhang
Jianhua Tao
Chenglong Wang
Xinrui Yan
Yong Ren
Hao Gu
Junzuo Zhou
266
13
0
09 Aug 2024
Survey: Transformer-based Models in Data Modality Conversion
Elyas Rashno
Amir Eskandari
Aman Anand
F. Zulkernine
MedIm
225
6
0
08 Aug 2024
MulliVC: Multi-lingual Voice Conversion With Cycle Consistency
Jiawei Huang
Chen Zhang
Yi Ren
Ziyue Jiang
Zhenhui Ye
Jinglin Liu
Jinzheng He
Xiang Yin
Zhou Zhao
114
4
0
08 Aug 2024
HydraFormer: One Encoder For All Subsampling Rates
IEEE International Conference on Multimedia and Expo (ICME), 2024
Yaoxun Xu
Xingchen Song
Zhiyong Wu
Di Wu
Zhendong Peng
Binbin Zhang
241
1
0
08 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
341
23
0
21 Jul 2024
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR
Wenbo Zhao
Ziwei Li
Chuan Yu
Zhijian Ou
AI4TS
254
3
0
14 Jul 2024
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Ye Bai
Jingping Chen
Jitong Chen
Wei Chen
Zhuo Chen
...
Wanyi Zhang
Yang Zhang
Yawei Zhang
Yijie Zheng
Ming Zou
AuLLM
358
68
0
05 Jul 2024
Romanization Encoding For Multilingual ASR
Wen Ding
Fei Jia
Hainan Xu
Yu Xi
Junjie Lai
Boris Ginsburg
209
1
0
05 Jul 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Keyu An
Qian Chen
Chong Deng
Zhihao Du
Changfeng Gao
...
Bin Zhang
Qinglin Zhang
Shiliang Zhang
Nan Zhao
Siqi Zheng
AuLLM
407
109
0
04 Jul 2024
Multi-Convformer: Extending Conformer with Multiple Convolution Kernels
Darshan Prabhu
Yifan Peng
Preethi Jyothi
Shinji Watanabe
242
4
0
04 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
326
44
0
30 Jun 2024
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition
Yuchun Shu
Bo Hu
Yifeng He
Hao Shi
Longbiao Wang
Jianwu Dang
227
4
0
29 Jun 2024
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
Peikun Chen
Sining Sun
Changhao Shan
Qing Yang
Lei Xie
283
6
0
27 Jun 2024
MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research
Song Li
Yongbin You
Xuezhi Wang
Zhengkun Tian
Ke Ding
Guanglu Wan
196
11
0
26 Jun 2024
Exploring the Capability of Mamba in Speech Applications
Koichi Miyazaki
Yoshiki Masuyama
Masato Murata
Mamba
300
29
0
24 Jun 2024
Revisiting Interpolation Augmentation for Speech-to-Text Generation
Chen Xu
Jie Wang
Xiaoqian Liu
Qianqian Dong
Chunliang Zhang
Tong Xiao
Jingbo Zhu
Dapeng Man
Wu Yang
184
1
0
22 Jun 2024
Transferable speech-to-text large language model alignment module
Interspeech (Interspeech), 2024
Boyong Wu
Chao Yan
Haoran Pu
141
0
0
19 Jun 2024
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
Yifan Yang
Zheshu Song
Jianheng Zhuo
Mingyu Cui
Jinpeng Li
...
Shuai Fan
Kai Yu
Wei Zhang
Guoguo Chen
Xie Chen
509
32
0
17 Jun 2024
Robust Channel Learning for Large-Scale Radio Speaker Verification
Wenhao Yang
Jianguo Wei
Wenhuan Lu
Lei Li
Xugang Lu
208
3
0
16 Jun 2024
An efficient text augmentation approach for contextualized Mandarin speech recognition
Interspeech (Interspeech), 2024
Naijun Zheng
Xucheng Wan
Kai Liu
Ziqing Du
Zhou Huan
179
2
0
14 Jun 2024
Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design
Interspeech (Interspeech), 2024
Ming Gao
Hang Chen
Jun Du
Xin Xu
Hongxiao Guo
Hui Bu
Jianxing Yang
Ming Li
Chin-Hui Lee
257
5
0
14 Jun 2024
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models
Jinchuan Tian
Yifan Peng
William Chen
Kwanghee Choi
Karen Livescu
Shinji Watanabe
192
12
0
13 Jun 2024
ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis
Dehua Tao
Daxin Tan
Y. Yeung
Xiao Chen
Tan Lee
201
6
0
13 Jun 2024
PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models
Runyan Yang
Huibao Yang
Xiqing Zhang
Tiantian Ye
Ying Liu
Yingying Gao
Shilei Zhang
Chao Deng
Junlan Feng
231
3
0
12 Jun 2024
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection
Rong Gong
Hongfei Xue
Lezhi Wang
Xin Xu
Qisheng Li
...
Yong Qin
Binbin Zhang
Jun Du
Jia Bin
Ming Li
217
14
0
11 Jun 2024
mHuBERT-147: A Compact Multilingual HuBERT Model
Marcely Zanon Boito
Vivek Iyer
Nikolaos Lagos
Laurent Besacier
Ioan Calapodescu
VLM
443
57
0
10 Jun 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
320
5
0
09 Jun 2024
Pitch-Aware RNN-T for Mandarin Chinese Mispronunciation Detection and Diagnosis
Interspeech (Interspeech), 2024
Xintong Wang
Mingqian Shi
Ye Wang
140
0
0
07 Jun 2024
MaskSR: Masked Language Model for Full-band Speech Restoration
Xu Li
Qirui Wang
Xiaoyu Liu
234
29
0
04 Jun 2024
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2024
Xuelong Geng
Tianyi Xu
Kun Wei
Bingshen Mu
Hongfei Xue
...
Pengcheng Guo
Yuhang Dai
Longhao Li
Mingchen Shao
Lei Xie
316
20
0
03 May 2024
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization
Jianzong Wang
Ziqi Liang
Xulong Zhang
Ning Cheng
Jing Xiao
177
1
0
30 Apr 2024
Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation
Ye Bai
Chenxing Li
Hao Li
Yuanyuan Zhao
Xiaorui Wang
200
2
0
17 Apr 2024
DANCER: Entity Description Augmented Named Entity Corrector for Automatic Speech Recognition
Yi-Cheng Wang
Hsin-Wei Wang
Bi-Cheng Yan
Chi-Han Lin
Berlin Chen
203
2
0
26 Mar 2024
Encoding of lexical tone in self-supervised models of spoken language
Gaofei Shen
Michaela Watkins
Afra Alishahi
Arianna Bisazza
Grzegorz Chrupala
291
15
0
25 Mar 2024
Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition
IEEE International Conference on Multimedia and Expo (ICME), 2024
Wenjing Zhu
Sining Sun
Changhao Shan
Peng Fan
Qing Yang
231
3
0
13 Mar 2024