Imitator: Personalized Speech-driven 3D Facial Animation IEEE International Conference on Computer Vision (ICCV), 2022 Balamurugan Thambiraja I. Habibie S. Aliakbarian Darren Cosker Christian Theobalt Justus Thies CVBM 248 89 0 30 Dec 2022
End-to-End Automatic Speech Recognition model for the Sudanese Dialect Ayman Mansour Wafaa F. Mukhtar 72 1 0 21 Dec 2022
KL Regularized Normalization Framework for Low Resource Tasks Neeraj Kumar Ankur Narang Brejesh Lall 139 1 0 21 Dec 2022
VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion Hanbo Cai Pengcheng Zhang Hai Dong Yan Xiao Shunhui Ji 141 7 0 20 Dec 2022
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness APSIPA Transactions on Signal and Information Processing (TASIP), 2022 Tiantian Feng Rajat Hebbar Nicholas Mehlman Xuan Shi Aditya Kommineni and Shrikanth Narayanan 240 37 0 18 Dec 2022
An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty Zhijie Wang Yuheng Huang Lei Ma Haruki Yokoyama Susumu Tokumoto Kazuki Munakata 210 6 0 13 Dec 2022
Estimator: An Effective and Scalable Framework for Transportation Mode Classification over Trajectories Danlei Hu Ziquan Fang Hanxi Fang Tianyi Li Chun-ru Shen Lu Chen Yunjun Gao 164 9 0 11 Dec 2022
Memories are One-to-Many Mapping Alleviators in Talking Face Generation IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022 Anni Tang Tianyu He Xuejiao Tan Jun Ling Liang Song CVBM 313 27 0 09 Dec 2022
Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators Abhishek Tyagi Yiming Gan Shaoshan Liu Bo Yu P. Whatmough Yuhao Zhu AAML 245 12 0 05 Dec 2022
PiPar: Pipeline Parallelism for Collaborative Machine Learning Zihan Zhang Philip Rodgers Peter Kilpatrick I. Spence Blesson Varghese FedML 267 6 0 01 Dec 2022
Evaluating and reducing the distance between synthetic and real speech distributions Interspeech (Interspeech), 2022 Christoph Minixhofer Ondřej Klejch P. Bell 217 9 0 29 Nov 2022
Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges K. T. Baghaei Amirreza Payandeh Pooya Fayyazsanavi Shahram Rahimi Zhiqian Chen Somayeh Bakhtiari Ramezani FaML AI4TS 215 10 0 27 Nov 2022
Dynamic Neural Portraits IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022 M. Doukas Stylianos Ploumpis Stefanos Zafeiriou 3DH 126 1 0 25 Nov 2022
HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks International Conference on Parallel Processing (ICPP), 2022 Zining Zhang Bingsheng He Zhenjie Zhang 114 6 0 21 Nov 2022
Phonemic Adversarial Attack against Audio Recognition in Real World Jinyang Guo Zhendong Chen Zixin Yin Qinghong Yang Xianglong Liu AAML 132 5 0 19 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up Luke Metz James Harrison C. Freeman Amil Merchant Lucas Beyer ... Naman Agrawal Ben Poole Igor Mordatch Adam Roberts Jascha Narain Sohl-Dickstein 290 75 0 17 Nov 2022
Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review Interacción (IN), 2022 Mikel K. Ngueajio Gloria J. Washington 199 42 0 17 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations Renée Lu M. Shahin Beena Ahmed 181 7 0 14 Nov 2022
FullPack: Full Vector Utilization for Sub-Byte Quantized Inference on General Purpose CPUs Hossein Katebi Navidreza Asadi M. Goudarzi MQ 146 1 0 13 Nov 2022
MSDT: Masked Language Model Scoring Defense in Text Domain International Conference on Universal Village (ICUV), 2022 Jaechul Roh Minhao Cheng Yajun Fang AAML 86 1 0 10 Nov 2022
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Zhengkun Tian Hongyu Xiang Min Li Fei Lin Ke Ding Guanglu Wan 116 7 0 07 Nov 2022
Data-free Defense of Black Box Models Against Adversarial Attacks Gaurav Kumar Nayak Inder Khatri Ruchit Rawal Anirban Chakraborty AAML 161 2 0 03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing Neural Information Processing Systems (NeurIPS), 2022 Yonggan Fu Yang Zhang Kaizhi Qian Zhifan Ye Zhongzhi Yu Cheng-I Jeff Lai Yingyan Lin 332 10 0 02 Nov 2022
Modular Hybrid Autoregressive Transducer Spoken Language Technology Workshop (SLT), 2022 Zhong Meng Tongzhou Chen Rohit Prabhavalkar Yu Zhang Gary Wang ... Bhuvana Ramabhadran Wenjie Huang Ehsan Variani Yinghui Huang Pedro J. Moreno 173 27 0 31 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Yosuke Higuchi Brian Yan Siddhant Arora Tetsuji Ogawa Tetsunori Kobayashi Shinji Watanabe 250 31 0 29 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition Sanchit Gandhi Patrick von Platen Alexander M. Rush 137 27 0 24 Oct 2022
10 hours data is all you need Zeping Min Qian Ge Zhong Li 165 3 0 24 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses Spoken Language Technology Workshop (SLT), 2022 C. Li Ngoc Thang Vu 129 3 0 20 Oct 2022
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores ACM Symposium on Cloud Computing (SoCC), 2022 Arsany Guirguis Diana Petrescu Florin Dinu D. Quoc Javier Picorel R. Guerraoui 218 0 0 16 Oct 2022
Deep learning model compression using network sensitivity and gradients M. Sakthi N. Yadla Raj Pawate 164 2 0 11 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Mayumi Ohta Julia Kreutzer Stefan Riezler 151 0 0 05 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition Spoken Language Technology Workshop (SLT), 2022 Kwangyoun Kim Felix Wu Yifan Peng Jing Pan Prashant Sridhar Kyu Jeong Han Shinji Watanabe 381 156 0 30 Sep 2022
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech Saeed Ghorbani Ylva Ferstl Daniel Holden N. Troje M. Carbonneau 276 117 0 15 Sep 2022
Deep Speech Synthesis from Articulatory Representations Interspeech (Interspeech), 2022 Peter Wu Shinji Watanabe Louis Goldstein A. Black Gopala K. Anumanchipalli 198 34 0 13 Sep 2022
Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement Computer Vision and Pattern Recognition (CVPR), 2022 S. Ravichandran Ondrej Texler Dimitar Dinev Hyun Jae Kang 137 4 0 03 Sep 2022
Universal Fourier Attack for Time Series IEEE Open Journal of Signal Processing (JOSP), 2022 Elizabeth Coda B. Clymer Chance N. DeSmet Y. Watkins Michael Girard 169 1 0 02 Sep 2022
RL-DistPrivacy: Privacy-Aware Distributed Deep Inference for low latency IoT systems IEEE Transactions on Network Science and Engineering (IEEE T-NSE), 2022 Emna Baccour A. Erbad Amr M. Mohamed Mounir Hamdi Mohsen Guizani 119 16 0 27 Aug 2022
Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022 Prasoon Sinha Akhil Guliani Rutwik Jain Brandon Tran Matthew D. Sinclair Shivaram Venkataraman 181 29 0 23 Aug 2022
How does the degree of novelty impacts semi-supervised representation learning for novel class retrieval? Q. Leroy Olivier Buisson Alexis Joly SSL 79 0 0 17 Aug 2022
Unifying Gradients to Improve Real-world Robustness for Deep Networks ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2022 Yingwen Wu Sizhe Chen Kun Fang Xiaolin Huang AAML 203 4 0 12 Aug 2022
Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training Symposium on Networked Systems Design and Implementation (NSDI), 2022 Jie You Jaehoon Chung Mosharaf Chowdhury 280 119 0 12 Aug 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition Spoken Language Technology Workshop (SLT), 2022 Peng Shen Xugang Lu Hisashi Kawai 105 2 0 29 Jul 2022
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis European Conference on Computer Vision (ECCV), 2022 Shuai Shen Wanhua Li Zhengbiao Zhu Yueqi Duan Jie Zhou Jiwen Lu CVBM 181 131 0 24 Jul 2022
Improving spatial cues for hearables using a parameterized binaural CDR estimator Reza Ghanavi C. Jin 60 1 0 17 Jul 2022
End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting Computer Speech and Language (CSL), 2022 Thierry Desot François Portet Michel Vacher 103 15 0 17 Jul 2022
pMCT: Patched Multi-Condition Training for Robust Speech Recognition Interspeech (Interspeech), 2022 Pablo Peso Parada A. Dobrowolska Karthikeyan P. Saravanan Mete Ozay 228 11 0 11 Jul 2022
Adversarial Ensemble Training by Jointly Learning Label Dependencies and Member Models International Conference on Intelligent Computing (ICIC), 2022 Lele Wang B. Liu UQCV 331 7 0 29 Jun 2022
The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition International Conference on Language Resources and Evaluation (LREC), 2022 Jonathan Mukiibi Andrew Katumba J. Nakatumba‐Nabende Ali Hussein Josh Meyer 213 7 0 20 Jun 2022
Residual Language Model for End-to-end Speech Recognition Interspeech (Interspeech), 2022 E. Tsunoo Yosuke Kashiwagi Chaitanya Narisetty Shinji Watanabe 131 11 0 15 Jun 2022
Local Identifiability of Deep ReLU Neural Networks: the Theory Neural Information Processing Systems (NeurIPS), 2022 Joachim Bona-Pellissier François Malgouyres François Bachoc FAtt 292 11 0 15 Jun 2022

Papers citing "Deep Speech: Scaling up end-to-end speech recognition"