1804.00015
Cited By
ESPnet: End-to-End Speech Processing Toolkit
30 March 2018
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
Y. Unno
Nelson Yalta
Jahn Heymann
Matthew Wiesner
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
Papers citing
"ESPnet: End-to-End Speech Processing Toolkit"
50 / 258 papers shown
Title
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation
Md. Akmal Haidar
Chao Xing
Mehdi Rezagholizadeh
21
7
0
17 Mar 2021
AISPEECH-SJTU accent identification system for the Accented English Speech Recognition Challenge
Houjun Huang
Xu Xiang
Yexin Yang
Rao Ma
Y. Qian
11
25
0
19 Feb 2021
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering
Abhilasha Ravichander
Siddharth Dalmia
Maria Ryskina
Florian Metze
Eduard H. Hovy
A. Black
ELM
21
32
0
16 Feb 2021
Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition
Aswin Shanmugam Subramanian
Chao Weng
Shinji Watanabe
Meng Yu
Dong Yu
18
78
0
16 Feb 2021
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data
Chengyi Wang
Yu-Huan Wu
Yao Qian
K. Kumatani
Shujie Liu
Furu Wei
Michael Zeng
Xuedong Huang
OT
SSL
32
112
0
19 Jan 2021
TaL: a synchronised multi-speaker corpus of ultrasound tongue imaging, audio, and lip videos
M. Ribeiro
Jennifer Sanger
Jingxuan Zhang
Aciel Eshky
A. Wrench
Korin Richmond
Steve Renals
LM&MA
11
33
0
19 Nov 2020
The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines
Fan Yu
Zhuoyuan Yao
Xiong Wang
Keyu An
Lei Xie
Zhijian Ou
Bo Liu
Xiulin Li
Guanqiong Miao
15
20
0
13 Nov 2020
Surrogate Source Model Learning for Determined Source Separation
Robin Scheibler
M. Togami
20
22
0
11 Nov 2020
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
19
7
0
11 Nov 2020
Data Augmentation For Children's Speech Recognition -- The "Ethiopian" System For The SLT 2021 Children Speech Recognition Challenge
Guoguo Chen
Xingyu Na
Yongqing Wang
Zhiyong Yan
Junbo Zhang
Sifan Ma
Yujun Wang
27
19
0
09 Nov 2020
On the Usefulness of Self-Attention for Automatic Speech Recognition with Transformers
Shucong Zhang
Erfan Loweimi
P. Bell
Steve Renals
14
36
0
08 Nov 2020
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement
Daxin Tan
Tan Lee
19
21
0
08 Nov 2020
Recent Developments on ESPnet Toolkit Boosted by Conformer
Pengcheng Guo
Florian Boyer
Xuankai Chang
Tomoki Hayashi
Yosuke Higuchi
...
Jing Shi
Shinji Watanabe
Kun Wei
Wangyou Zhang
Yuekai Zhang
34
262
0
26 Oct 2020
Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment
Ethan A. Chi
Julian Salazar
Katrin Kirchhoff
AI4TS
17
51
0
24 Oct 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
T. Toda
BDL
39
37
0
23 Oct 2020
Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions
Ludwig Kurzinger
Nicolas Lindae
Palle Klewitz
Gerhard Rigoll
19
5
0
15 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Junwen Bai
Weiran Wang
Yingbo Zhou
Caiming Xiong
SSL
AI4TS
27
12
0
07 Oct 2020
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation
Zhong-Qiu Wang
Peidong Wang
DeLiang Wang
17
88
0
04 Oct 2020
A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline
Yerbolat Khassanov
Saida Mussakhojayeva
A. Mirzakhmetov
A. Adiyev
Mukhamet Nurpeiissov
H. A. Varol
6
30
0
22 Sep 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
27
316
0
09 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
T. Toda
19
38
0
07 Aug 2020
Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model
Qi Liu
Zhehuai Chen
Hao Li
Mingkun Huang
Yizhou Lu
Kai Yu
16
6
0
31 Jul 2020
Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation
Xiaoyuan Yi
Hyeonseung Lee
Wenhao Li
Hyung Yong Kim
Nam Soo Kim
9
22
0
25 Jul 2020
Speaker-Conditional Chain Model for Speech Separation and Extraction
Jing Shi
Jiaming Xu
Yusuke Fujita
Shinji Watanabe
Bo Xu
BDL
41
20
0
25 Jun 2020
Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset
A. Andrusenko
A. Laptev
Ivan Medennikov
VLM
16
12
0
15 Jun 2020
"Notic My Speech" -- Blending Speech Patterns With Multimedia
Dhruva Sahrawat
Yaman Kumar Singla
Shashwat Aggarwal
Yifang Yin
R. Shah
Roger Zimmermann
25
3
0
12 Jun 2020
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict
Yosuke Higuchi
Shinji Watanabe
Nanxin Chen
Tetsuji Ogawa
Tetsunori Kobayashi
17
137
0
18 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation
A. Laptev
Roman Korostik
A. Svischev
A. Andrusenko
Ivan Medennikov
S. Rybin
16
61
0
14 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
19
32
0
12 May 2020
Improving noise robust automatic speech recognition with single-channel time-domain enhancement network
K. Kinoshita
Tsubasa Ochiai
Marc Delcroix
Tomohiro Nakatani
21
97
0
09 Mar 2020
CGCNN: Complex Gabor Convolutional Neural Network on raw speech
Paul-Gauthier Noé
Titouan Parcollet
Mohamed Morchid
14
29
0
11 Feb 2020
End-to-End Multi-speaker Speech Recognition with Transformer
Xuankai Chang
Wangyou Zhang
Y. Qian
Jonathan Le Roux
Shinji Watanabe
ViT
22
103
0
10 Feb 2020
Accelerating RNN Transducer Inference via One-Step Constrained Beam Search
Juntae Kim
Yoonhan Lee
15
21
0
10 Feb 2020
End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection
Takenori Yoshimura
Tomoki Hayashi
K. Takeda
Shinji Watanabe
21
49
0
03 Feb 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
29
81
0
02 Jan 2020
End-to-end training of time domain audio separation and recognition
Thilo von Neumann
K. Kinoshita
Lukas Drude
Christoph Boeddeker
Marc Delcroix
Tomohiro Nakatani
Reinhold Haeb-Umbach
19
34
0
18 Dec 2019
Multimodal Machine Translation through Visuals and Speech
U. Sulubacak
Ozan Caglayan
Stig-Arne Gronroos
Aku Rouhe
Desmond Elliott
Lucia Specia
Jörg Tiedemann
46
72
0
28 Nov 2019
Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition
Jibin Wu
Emre Yilmaz
Malu Zhang
Haizhou Li
Kay Chen Tan
25
104
0
19 Nov 2019
Meta Learning for End-to-End Low-Resource Speech Recognition
Jui-Yang Hsu
Yuan-Jui Chen
Hung-yi Lee
22
103
0
26 Oct 2019
Towards Online End-to-end Transformer Automatic Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
22
32
0
25 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
T. Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
23
201
0
24 Oct 2019
Discriminative Neural Clustering for Speaker Diarisation
Qiujia Li
Florian Kreyssig
Chao Zhang
P. Woodland
11
44
0
22 Oct 2019
Transformer ASR with Contextual Block Processing
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
51
64
0
16 Oct 2019
Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help?
Chitralekha Gupta
Emre Yilmaz
Haizhou Li
19
14
0
23 Sep 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
H. Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
23
716
0
13 Sep 2019
IMS-Speech: A Speech to Text Tool
Pavel Denisov
Ngoc Thang Vu
8
11
0
13 Aug 2019
CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition
Linhao Dong
Bo Xu
24
125
0
27 May 2019
Acoustic-to-Word Models with Conversational Context Information
Suyoun Kim
Florian Metze
14
7
0
21 May 2019
Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text
M. Baskar
Shinji Watanabe
Ramón Fernández Astudillo
Takaaki Hori
L. Burget
J. Černocký
20
42
0
30 Apr 2019
Pretraining by Backtranslation for End-to-end ASR in Low-Resource Settings
Matthew Wiesner
Adithya Renduchintala
Shinji Watanabe
Shuoyang Ding
Najim Dehak
Sanjeev Khudanpur
13
32
0
10 Dec 2018