ResearchTrend.AI

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

18 April 2019
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
    VLM

Papers citing "SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition"

50 / 757 papers shown
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech
Katrin Tomanek
Vicky Zayats
Dirk Padfield
K. Vaillancourt
Fadi Biadsy
59
57
0
14 Sep 2021
Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages
A. C. S.
Prathosh A P
A. G. Ramakrishnan
46
12
0
12 Sep 2021
Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition
Rong Gong
Carl Quillen
D. Sharma
Andrew Goderre
José Laínez
Ljubomir Milanović
39
13
0
10 Sep 2021
Speechformer: Reducing Information Loss in Direct Speech Translation
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
70
23
0
09 Sep 2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring
Hirofumi Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
63
11
0
09 Sep 2021
Beijing ZKJ-NPU Speaker Verification System for VoxCeleb Speaker Recognition Challenge 2021
Li Lyna Zhang
Huan Zhao
Qinling Meng
Yanli Chen
Min Liu
Lei Xie
32
10
0
08 Sep 2021
Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech
Injy Hamed
Pavel Denisov
C. Li
Mohamed S. Elmahdy
Slim Abdennadher
Ngoc Thang Vu
38
35
0
29 Aug 2021
Injecting Text in Self-Supervised Speech Pretraining
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Gary Wang
Pedro J. Moreno
SSL
25
36
0
27 Aug 2021
4-bit Quantization of LSTM-based Speech Recognition Models
A. Fasoli
Chia-Yu Chen
Mauricio Serrano
Xiao Sun
Naigang Wang
...
Xiaodong Cui
Brian Kingsbury
Wei Zhang
Zoltán Tüske
K. Gopalakrishnan
MQ
26
21
0
27 Aug 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers
Xiaodong Cui
Brian Kingsbury
G. Saon
David Haws
Zoltán Tüske
25
5
0
24 Aug 2021
Automatic Speech Recognition And Limited Vocabulary: A Survey
J. L. E. K. Fendji
D. Tala
B. Yenke
M. Atemkeng
28
3
0
23 Aug 2021
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
Andrew Koh
Fuzhao Xue
Chng Eng Siong
22
20
0
10 Aug 2021
The HW-TSC's Offline Speech Translation Systems for IWSLT 2021 Evaluation
Minghan Wang
Yuxia Wang
Chang Su
Jiaxin Guo
Yingtao Zhang
...
Shimin Tao
Xingshan Zeng
Liangyou Li
Hao Yang
Ying Qin
22
6
0
09 Aug 2021
SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features
Gwantae Kim
D. Han
Hanseok Ko
50
42
0
06 Aug 2021
Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation
Sarala Padi
S. O. Sadjadi
Tianyi Zhou
Ram D. Sriram
26
34
0
05 Aug 2021
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English
Saida Mussakhojayeva
Yerbolat Khassanov
H. A. Varol
22
17
0
03 Aug 2021
End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection
Hemlata Tak
Jee-weon Jung
J. Patino
Madhu R. Kamble
Massimiliano Todisco
Nicholas W. D. Evans
43
162
0
27 Jul 2021
OLR 2021 Challenge: Datasets, Rules and Baselines
Binling Wang
Wen-Bo Hu
Jing Li
Yiming Zhi
Zheng Li
Q. Hong
Lin Li
Dong Wang
Liming Song
Cheng Yang
34
18
0
23 Jul 2021
Audio Captioning Transformer
Xinhao Mei
Xubo Liu
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
ViT
39
77
0
21 Jul 2021
Simultaneous Speech Translation for Live Subtitling: from Delay to Display
Alina Karakanta
Sara Papi
Matteo Negri
Marco Turchi
28
10
0
19 Jul 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
31
68
0
19 Jul 2021
Between Flexibility and Consistency: Joint Generation of Captions and Subtitles
Alina Karakanta
Marco Gaido
Matteo Negri
Marco Turchi
30
9
0
13 Jul 2021
Conformer-based End-to-end Speech Recognition With Rotary Position Embedding
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
29
9
0
13 Jul 2021
Direct speech-to-speech translation with discrete units
Ann Lee
Peng-Jen Chen
Changhan Wang
Jiatao Gu
Sravya Popuri
...
Yossi Adi
Qing He
Yun Tang
J. Pino
Wei-Ning Hsu
41
181
0
12 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models
Xiaohui Zhang
Vimal Manohar
David C. Zhang
Frank Zhang
Yangyang Shi
Nayan Singhal
Julian Chan
Fuchun Peng
Yatharth Saraf
M. Seltzer
34
14
0
09 Jul 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Laohu Wang
Canan Huang
Tong Xiao
Jingbo Zhu
34
5
0
06 Jul 2021
Oriental Language Recognition (OLR) 2020: Summary and Analysis
Jing Li
Binling Wang
Yiming Zhi
Zheng Li
Lin Li
Q. Hong
Dong Wang
27
10
0
05 Jul 2021
A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification
Hao Yen
Chao-Han Huck Yang
Hu Hu
Sabato Marco Siniscalchi
Qing Wang
...
Yuanjun Zhao
Yuzhong Wu
Yannan Wang
Jun Du
Chin-Hui Lee
19
16
0
03 Jul 2021
Supervised Contrastive Learning for Accented Speech Recognition
Tao Han
Hantao Huang
Ziang Yang
Wei Han
49
15
0
02 Jul 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System
Hirofumi Inaguma
Shun Kiyono
Nelson Enrique Yalta Soplin
Pengcheng Guo
Jun Suzuki
Kevin Duh
Shinji Watanabe
3DV
40
2
0
01 Jul 2021
The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at IWSLT 2021
Dan Liu
Mengge Du
Xiaoxi Li
Yuchen Hu
Lirong Dai
32
20
0
01 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
48
544
0
30 Jun 2021
Robust and Interpretable Temporal Convolution Network for Event Detection in Lung Sound Recordings
Tharindu Fernando
Sridha Sridharan
Simon Denman
H. Ghaemmaghami
Clinton Fookes
38
27
0
30 Jun 2021
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection
Thi Ngoc Tho Nguyen
Karn N. Watcharasupat
Ngoc Khanh Nguyen
Douglas L. Jones
W. Gan
27
16
0
29 Jun 2021
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri
Heinrich Jiang
Yi Tay
Donald Metzler
SSL
31
164
0
29 Jun 2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
44
6
0
23 Jun 2021
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning
Anurag Kumar
Yun Wang
V. Ithapu
Christian Fuegen
24
3
0
21 Jun 2021
Towards sound based testing of COVID-19 -- Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge
N. Sharma
Ananya Muguli
Prashant Krishnan
Rohit Kumar
Srikanth Raj Chetupalli
Sriram Ganapathy
35
13
0
21 Jun 2021
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection
Kazuki Shimada
Naoya Takahashi
Yuichiro Koyama
Shusuke Takahashi
E. Tsunoo
Masafumi Takahashi
Yuki Mitsufuji
30
23
0
21 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
35
9
0
17 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
23
88
0
17 Jun 2021
Collaborative Training of Acoustic Encoders for Speech Recognition
Varun K. Nagaraja
Yangyang Shi
Ganesh Venkatesh
Ozlem Kalinli
M. Seltzer
Vikas Chandra
48
11
0
16 Jun 2021
VidHarm: A Clip Based Dataset for Harmful Content Detection
Johan Edstedt
Amanda Berg
Michael Felsberg
Johan Karlsson
Francisca Benavente
Anette Novak
G. Pihlgren
28
2
0
15 Jun 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition
A. Fazel
Wei Yang
Yulan Liu
Roberto Barra-Chicote
Yi Meng
Roland Maas
J. Droppo
SyDa
21
48
0
14 Jun 2021
End-to-end Neural Diarization: From Transformer to Conformer
Yi Y. Liu
Eunjung Han
Chul Lee
A. Stolcke
22
40
0
14 Jun 2021
RealTranS: End-to-End Simultaneous Speech Translation with Convolutional Weighted-Shrinking Transformer
Xingshan Zeng
Liangyou Li
Qun Liu
25
45
0
09 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
26
753
0
08 Jun 2021
Broadcasted Residual Learning for Efficient Keyword Spotting
Byeonggeun Kim
Simyung Chang
Jinkyu Lee
Dooyong Sung
31
122
0
08 Jun 2021
EventDrop: data augmentation for event-based learning
Fuqiang Gu
Weicong Sng
Xuke Hu
Fei Yu
24
37
0
07 Jun 2021
Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios
E. Tsunoo
Kentarou Shibata
Chaitanya Narisetty
Yosuke Kashiwagi
Shinji Watanabe
27
12
0
07 Jun 2021