End-to-End Speech Recognition From the Raw Waveform
19 June 2018
Neil Zeghidour
Nicolas Usunier
Gabriel Synnaeve
R. Collobert
Emmanuel Dupoux

Papers citing "End-to-End Speech Recognition From the Raw Waveform"

34 / 34 papers shown
RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification
Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2024
June-Woo Kim
Miika Toikkanen
Sangmin Bae
Minseok Kim
Ho-Young Jung
268
18
0
05 May 2024
Learning neural audio features without supervision
Interspeech, 2022
Sarthak Yadav
Neil Zeghidour
SSL
150
5
0
29 Mar 2022
Shennong: a Python toolbox for audio speech features extraction
Mathieu Bernard
Maxime Poli
Julien Karadayi
Emmanuel Dupoux
216
9
0
10 Dec 2021
Deep Spoken Keyword Spotting: An Overview
IEEE Access, 2021
Iván López-Espejo
Zheng-Hua Tan
John H. L. Hansen
Jesper Jensen
250
140
0
20 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
509
443
0
02 Nov 2021
Beyond $L_p$ clipping: Equalization-based Psychoacoustic Attacks against ASRs
H. Abdullah
Muhammad Sajidur Rahman
Christian Peeters
Cassidy Gibson
Washington Garcia
Vincent Bindschaedler
T. Shrimpton
Patrick Traynor
AAML
125
13
0
25 Oct 2021
Learning Sparse Analytic Filters for Piano Transcription
Frank Cwitkowitz
M. Heydari
Z. Duan
346
2
0
23 Aug 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Interspeech, 2021
Max W. Y. Lam
Jun Wang
Chao Weng
Jane Polak Scowcroft
Dong Yu
162
7
0
08 Jun 2021
Interpreting intermediate convolutional layers of generative CNNs trained on waveforms
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Gašper Beguš
Alan Zhou
362
8
0
19 Apr 2021
End-to-end Audio-visual Speech Recognition with Conformers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Pingchuan Ma
Stavros Petridis
Maja Pantic
358
294
0
12 Feb 2021
LEAF: A Learnable Frontend for Audio Classification
International Conference on Learning Representations (ICLR), 2021
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM
AAML
280
175
0
21 Jan 2021
Speech Command Recognition in Computationally Constrained Environments with a Quadratic Self-organized Operational Layer
IEEE International Joint Conference on Neural Network (IJCNN), 2020
M. Soltanian
Junaid Malik
Jenni Raitoharju
Alexandros Iosifidis
S. Kiranyaz
Denmark
342
11
0
23 Nov 2020
Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions
Ludwig Kurzinger
Nicolas Lindae
Palle Klewitz
Gerhard Rigoll
325
5
0
15 Oct 2020
End-to-End Bengali Speech Recognition
S. Mandal
Sarthak Yadav
A. Rai
117
12
0
21 Sep 2020
Exploring Filterbank Learning for Keyword Spotting
European Signal Processing Conference (EUSIPCO), 2020
Iván López-Espejo
Zheng-Hua Tan
Jesper Jensen
210
14
0
30 May 2020
PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR
Yiwen Shao
Yiming Wang
Daniel Povey
Sanjeev Khudanpur
AI4TS
218
39
0
20 May 2020
CGCNN: Complex Gabor Convolutional Neural Network on raw speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Paul-Gauthier Noé
Titouan Parcollet
Mohamed Morchid
166
33
0
11 Feb 2020
Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2019
Jingdong Li
Hui Zhang
Xueliang Zhang
Changliang Li
175
9
0
02 Feb 2020
Machine learning for music genre: multifaceted review and experimentation with audioset
Journal of Intelligence and Information Systems (JIIS), 2019
Jaime Ramírez
M. Flores
VLM
163
59
0
28 Nov 2019
Small-Footprint Keyword Spotting on Raw Audio Data with Sinc-Convolutions
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Simon Mittermaier
Ludwig Kurzinger
Bernd Waschneck
Gerhard Rigoll
322
65
0
05 Nov 2019
Universal Adversarial Audio Perturbations
Sajjad Abdoli
L. G. Hafemann
Jérôme Rony
Ismail Ben Ayed
P. Cardinal
Alessandro Lameiras Koerich
AAML
476
59
0
08 Aug 2019
Learning Waveform-Based Acoustic Models using Deep Variational Convolutional Neural Networks
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019
Dino Oglic
Zoran Cvetkovic
Peter Sollich
BDL
318
8
0
23 Jun 2019
End-to-End ASR for Code-switched Hindi-English Speech
B. M. L. Srivastava
Basil Abraham
Sunayana Sitaram
Rupeshkumar Mehta
Preethi Jyothi
131
4
0
22 Jun 2019
Multi-Stream End-to-End Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019
Ruizhi Li
Xiaofei Wang
Sri Harish Reddy Mallidi
Shinji Watanabe
Takaaki Hori
H. Hermansky
248
25
0
17 Jun 2019
End-to-End Environmental Sound Classification using a 1D Convolutional Neural Network
Sajjad Abdoli
P. Cardinal
Alessandro Lameiras Koerich
221
307
0
18 Apr 2019
RawNet: Fast End-to-End Neural Vocoder
Yunchao He
Yujun Wang
205
2
0
10 Apr 2019
Frequency Domain Multi-channel Acoustic Modeling for Distant Speech Recognition
Minhua Wu
K. Kumatani
Shiva Sundaram
N. Strom
Björn Hoffmeister
226
40
0
13 Mar 2019
Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition
Julian Salazar
Katrin Kirchhoff
Zhiheng Huang
AI4TS
294
124
0
22 Jan 2019
Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees
J. Chorowski
A. Lancucki
Bartosz Kostka
Michal Zapotoczny
216
5
0
14 Jan 2019
Exploring spectro-temporal features in end-to-end convolutional neural networks
Sean Robertson
Gerald Penn
Yingxue Wang
191
4
0
01 Jan 2019
Fully Convolutional Speech Recognition
Neil Zeghidour
Qiantong Xu
Vitaliy Liptchinsky
Nicolas Usunier
Gabriel Synnaeve
R. Collobert
296
97
0
17 Dec 2018
Learning to detect dysarthria from raw speech
Juliette Millet
Neil Zeghidour
280
48
0
27 Nov 2018
Multi-encoder multi-resolution framework for end-to-end speech recognition
Ruizhi Li
Xiaofei Wang
Sri Harish Reddy Mallidi
Takaaki Hori
Shinji Watanabe
H. Hermansky
164
13
0
12 Nov 2018
Single-Microphone Speech Enhancement and Separation Using Deep Learning
Morten Kolbaek
247
7
0
31 Aug 2018