Title
Joint Speech Recognition and Audio Captioning IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Chaitanya Narisetty E. Tsunoo Xuankai Chang Yosuke Kashiwagi Michael Hentschel Shinji Watanabe 106 10 0 03 Feb 2022
Imperceptible and Multi-channel Backdoor Attack against Deep Neural Networks Mingfu Xue S. Ni Ying-Chang Wu Yushu Zhang Jian Wang Weiqiang Liu AAML 180 18 0 31 Jan 2022
The Norwegian Parliamentary Speech Corpus International Conference on Language Resources and Evaluation (LREC), 2022 Per Erik Solberg Pablo Ortiz 98 15 0 26 Jan 2022
Internal Language Model Estimation Through Explicit Context Vector Learning for Attention-based Encoder-decoder ASR Interspeech (Interspeech), 2022 Yufei Liu Rao Ma Haihua Xu Yi He Zejun Ma Weibin Zhang 128 15 0 26 Jan 2022
Improved Mispronunciation detection system using a hybrid CTC-ATT based approach for L2 English speakers Neha Baranwal Sharatkumar Chilaka 92 3 0 25 Jan 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Qiu-shi Zhu Jie Zhang Zi-qiang Zhang Ming Wu Xin Fang Lirong Dai 281 51 0 22 Jan 2022
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation European Conference on Computer Vision (ECCV), 2022 Xian Liu Yinghao Xu Qianyi Wu Hang Zhou Wayne Wu Bolei Zhou VGen DiffM 3DH 173 161 0 19 Jan 2022
Transferability in Deep Learning: A Survey Junguang Jiang Yang Shu Jianmin Wang Mingsheng Long OOD 189 128 0 15 Jan 2022
Robust Self-Supervised Audio-Visual Speech Recognition Interspeech (Interspeech), 2022 Bowen Shi Wei-Ning Hsu Abdel-rahman Mohamed 273 115 0 05 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward AI Open (AO), 2022 Ruben Cartuyvels Graham Spinks Marie-Francine Moens OCL 293 28 0 04 Jan 2022
DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering Shunyu Yao Ruizhe Zhong Manwen Liao Guangtao Zhai Xiaokang Yang CVBM 147 113 0 03 Jan 2022
Making AI 'Smart': Bridging AI and Cognitive Science Madhav Agarwal Siddhant Bansal 145 0 0 31 Dec 2021
Towards Relatable Explainable AI with the Perceptual Process International Conference on Human Factors in Computing Systems (CHI), 2021 Wencan Zhang Brian Y. Lim AAML XAI 248 70 0 28 Dec 2021
Multi-Dialect Arabic Speech Recognition IEEE International Joint Conference on Neural Network (IJCNN), 2020 Abbas Raza Ali 77 19 0 25 Dec 2021
Multi-Variant Consistency based Self-supervised Learning for Robust Automatic Speech Recognition Changfeng Gao Gaofeng Cheng Pengyuan Zhang 245 4 0 23 Dec 2021
A Comprehensive Analytical Survey on Unsupervised and Semi-Supervised Graph Representation Learning Methods Md. Khaledur Rahman A. Azad AI4TS 107 3 0 20 Dec 2021
Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing Joonhyung Park J. Yang Jinwoo Shin Sung Ju Hwang Eunho Yang 163 26 0 16 Dec 2021
On the Use of External Data for Spoken Named Entity Recognition Ankita Pasad Felix Wu Suwon Shon Karen Livescu Kyu Jeong Han 115 17 0 14 Dec 2021
Real-Time Neural Voice Camouflage Mia Chiquier Chengzhi Mao Carl Vondrick 162 8 0 14 Dec 2021
Perceptual Loss with Recognition Model for Single-Channel Enhancement and Robust ASR Peter William VanHarn Plantinga Deblin Bagchi Eric Fosler-Lussier 174 10 0 11 Dec 2021
Are E2E ASR models ready for an industrial usage? Valentin Vielzeuf G. Antipov 274 8 0 09 Dec 2021
FastSGD: A Fast Compressed SGD Framework for Distributed Machine Learning Keyu Yang Lu Chen Zhihao Zeng Yunjun Gao 150 9 0 08 Dec 2021
A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules Xinfeng Xie Prakash Prabhu Ulysse Beaugnon P. Phothilimthana Sudip Roy Azalia Mirhoseini E. Brevdo James Laudon Yanqi Zhou 118 6 0 07 Dec 2021
Training end-to-end speech-to-text models on mobile phones S. Zitha Raghavendra Rao Suresh Pooja S B. Rao T. V. Prabhakar 142 1 0 07 Dec 2021
On Large Batch Training and Sharp Minima: A Fokker-Planck Perspective Xiaowu Dai Yuhua Zhu 124 8 0 02 Dec 2021
Automated Speech Scoring System Under The Lens: Evaluating and interpreting the linguistic cues for language proficiency P. Bamdev Manraj Singh Grover Yaman Kumar Singla Payman Vafaee Mika Hama R. Shah 146 16 0 30 Nov 2021
Factorized Fourier Neural Operators International Conference on Learning Representations (ICLR), 2021 Alasdair Tran A. Mathews Lexing Xie Cheng Soon Ong AI4CE 367 219 0 27 Nov 2021
Romanian Speech Recognition Experiments from the ROBIN Project Andrei-Marius Avram Vasile Păiș Dan Tufiș 128 5 0 23 Nov 2021
Human-Machine Interaction Speech Corpus from the ROBIN project International Conference on Speech Technology and Human-Computer Dialogue (ICSTHD), 2021 V. Pais Radu Ion Andrei-Marius Avram Elena Irimia V. Mititelu Maria Mitrofan 120 7 0 22 Nov 2021
Denoised Internal Models: a Brain-Inspired Autoencoder against Adversarial Attacks Machine Intelligence Research (MIR), 2021 Kaiyuan Liu Xingyu Li Yu-Rui Lai Hong Xie Hang Su Jiacheng Wang Chunxu Guo J. Guan Yi Zhou AAML 234 4 0 21 Nov 2021
The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage Daniel Galvez G. Diamos Juan Ciro Juan Felipe Cerón Keith Achorn Anjali Gopi David Kanter Maximilian Lam Mark Mazumder Vijay Janapa Reddi 211 122 0 17 Nov 2021
A Survey on Adversarial Attacks for Malware Analysis IEEE Access (IEEE Access), 2021 Kshitiz Aryal Maanak Gupta Mahmoud Abdelsalam AAML 258 64 0 16 Nov 2021
Recurrent Neural Networks for Learning Long-term Temporal Dependencies with Reanalysis of Time Scale Representation International Conference on Big Knowledge (ICBK), 2021 Kentaro Ohno Atsutoshi Kumagai CLL AI4CE 91 10 0 05 Nov 2021
Towards Learning to Speak and Hear Through Multi-Agent Communication over a Continuous Acoustic Channel Kevin Eloff Okko Räsänen H. Engelbrecht Arnu Pretorius Herman Kamper 166 3 0 04 Nov 2021
Speech recognition for air traffic control via feature learning and end-to-end training Peng Fan Dongyue Guo Yi Lin Bo Yang Jianwei Zhang 133 8 0 04 Nov 2021
RT-RCG: Neural Network and Accelerator Search Towards Effective and Real-time ECG Reconstruction from Intracardiac Electrograms ACM Journal on Emerging Technologies in Computing Systems (JETC), 2021 Yongan Zhang Anton Banta Yonggan Fu M. John A. Post M. Razavi Joseph R. Cavallaro B. Aazhang Yingyan Lin 133 4 0 04 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition APSIPA Transactions on Signal and Information Processing (TASIP), 2021 Jinyu Li VLM 382 424 0 02 Nov 2021
EfficientWord-Net: An Open Source Hotword Detection Engine based on One-shot Learning Journal of Information & Knowledge Management (JIKM), 2021 R. Chidhambararajan Aman Rangaur S. C. Sethuraman 129 6 0 31 Oct 2021
Beyond $L_p$ clipping: Equalization-based Psychoacoustic Attacks against ASRs H. Abdullah Muhammad Sajidur Rahman Christian Peeters Cassidy Gibson Washington Garcia Vincent Bindschaedler T. Shrimpton Patrick Traynor AAML 85 12 0 25 Oct 2021
Asynchronous Decentralized Distributed Training of Acoustic Models IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021 Xiaodong Cui Wei Zhang Abdullah Kayi Mingrui Liu Ulrich Finkler Brian Kingsbury G. Saon David S. Kung 109 3 0 21 Oct 2021
Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning Conference on Machine Learning and Systems (MLSys), 2021 Ningning Xie Tamara Norman Dominik Grewe Dimitrios Vytiniotis 193 19 0 20 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis Max Morrison Rithesh Kumar Kundan Kumar Prem Seetharaman Aaron Courville Yoshua Bengio GAN 170 84 0 19 Oct 2021
Self-Supervised Representation Learning: Introduction, Advances and Challenges Linus Ericsson Henry Gouk Chen Change Loy Timothy M. Hospedales SSL OOD AI4TS 198 338 0 18 Oct 2021
Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor Anchit Gupta Faizan Farooq Khan Rudrabha Mukhopadhyay Vinay P. Namboodiri C. V. Jawahar CVBM 177 6 0 16 Oct 2021
Multistage linguistic conditioning of convolutional layers for speech emotion recognition Andreas Triantafyllopoulos U. Reichel Shuo Liu Simon Huber F. Eyben Björn W. Schuller 182 17 0 13 Oct 2021
K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables Interspeech (Interspeech), 2021 Jounghee Kim Pilsung Kang VLM 108 7 0 11 Oct 2021
Boosting Fast Adversarial Training with Learnable Adversarial Initialization IEEE Transactions on Image Processing (TIP), 2021 Yang Liu Yong Zhang Baoyuan Wu Jue Wang Xiaochun Cao AAML 275 65 0 11 Oct 2021
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition Interspeech (Interspeech), 2021 Li Fu Xiaoxiao Li Runyu Wang Lu Fan Zhengchen Zhang Meng Chen Youzheng Wu Xiaodong He SSL 148 3 0 08 Oct 2021
Explaining the Attention Mechanism of End-to-End Speech Recognition Using Decision Trees Yuanchao Wang Wenjing Du Chenghao Cai Yanyan Xu 145 1 0 08 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing Shuo Yang Le Hou Xiaodan Song Qiang Liu Denny Zhou 257 9 0 08 Oct 2021