| Title | Authors | Tags | Counts | Date |
|---|---|---|---|---|
| MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder | Khai Le-Duc, Phuc Phan, Tan-Hanh Pham, Bach Phan Tat, Minh-Huong Ngo, Chris Ngo, Thanh Nguyen-Tang, Truong Son-Hy | LM&MA | 43 0 0 | 21 Sep 2024 |
| VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain | Khai Le-Duc | LM&MA | 36 9 0 | 08 Apr 2024 |
| Real-Time Multimodal Cognitive Assistant for Emergency Medical Services | Keshara Weerasinghe, Saahith Janapati, Xueren Ge, Sion Kim, S. Iyer, John A. Stankovic, H. Alemzadeh | | 26 2 0 | 11 Mar 2024 |
| Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview | Heyang Liu, Yu Wang, Yanfeng Wang | | 38 0 0 | 01 Mar 2024 |
| Towards Conversational Diagnostic AI | Tao Tu, Anil Palepu, M. Schaekermann, Khaled Saab, Jan Freyberg, ..., Katherine Chou, Greg S. Corrado, Yossi Matias, Alan Karthikesalingam, Vivek Natarajan | AI4MH, LM&MA | 26 92 0 | 11 Jan 2024 |
| Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project | Khai Le-Duc | | 13 2 0 | 26 Sep 2023 |
| Using Text Injection to Improve Recognition of Personal Identifiers in Speech | Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran | | 27 3 0 | 14 Aug 2023 |
| Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech | Christoph Luscher, Mohammad Zeineldeen, Zijian Yang, Tina Raissi, Peter Vieting, Khai Le-Duc, Weiyue Wang, Ralf Schluter, Hermann Ney | | 13 5 0 | 24 Oct 2022 |
| Speech Detection For Child-Clinician Conversations In Danish For Low-Resource In-The-Wild Conditions: A Case Study | Sneha Das, N. Lønfeldt, A. Pagsberg, Line H. Clemmensen | | 14 3 0 | 25 Apr 2022 |
| PriMock57: A Dataset Of Primary Care Mock Consultations | Alex Papadopoulos Korfiatis, Francesco Moramarco, Radmila Sarac, Aleksandar Savkov | | 19 24 0 | 01 Apr 2022 |
| Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey | Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Imran Razzak, Kevin Lee, Chetan Arora, Ali Hassani, A. Zaslavsky | AAML | 23 6 0 | 22 Feb 2022 |
| Natural Language Processing for Smart Healthcare | Binggui Zhou, Guanghua Yang, Zheng Shi, Shaodan Ma | AI4MH, LM&MA | 27 111 0 | 19 Oct 2021 |
| Translatotron 2: High-quality direct speech-to-speech translation with voice preservation | Ye Jia, Michelle Tadmor Ramanovich, Tal Remez, Roi Pomerantz | | 26 67 0 | 19 Jul 2021 |
| Understanding Medical Conversations: Rich Transcription, Confidence Scores & Information Extraction | H. Soltau, Mingqiu Wang, Izhak Shafran, Laurent El Shafey | MedIm, LM&MA | 15 12 0 | 06 Apr 2021 |
| A Review of Speaker Diarization: Recent Advances with Deep Learning | Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu Jeong Han, Shinji Watanabe, Shrikanth Narayanan | VLM | 271 327 0 | 24 Jan 2021 |
| Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling | Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu | | 19 112 0 | 08 Oct 2020 |
| Towards an Automated SOAP Note: Classifying Utterances from Medical Conversations | Benjamin Schloss, Sandeep Konam | | 8 22 0 | 17 Jul 2020 |
| Robust Prediction of Punctuation and Truecasing for Medical ASR | Monica Sunkara, S. Ronanki, Kalpit Dixit, S. Bodapati, Katrin Kirchhoff | | 9 33 0 | 04 Jul 2020 |
| SpecAugment on Large Scale Datasets | Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo-wen Li, William Chan, Quoc V. Le, Yonghui Wu | | 18 136 0 | 11 Dec 2019 |
| Advances in Online Audio-Visual Meeting Transcription | Takuya Yoshioka, Igor Abramovski, Cem Aksoylar, Zhuo Chen, Moshe David, ..., Huaming Wang, Zhenghao Wang, Jun Zhang, Yong Zhao, Tianyan Zhou | | 23 73 0 | 10 Dec 2019 |
| Extracting Symptoms and their Status from Clinical Conversations | Nan Du, Kai Chen, Anjuli Kannan, Linh Tran, Yuhui Chen, Izhak Shafran | | 12 68 0 | 05 Jun 2019 |
| Audio De-identification: A New Entity Recognition Task | Ido Cohn, Itay Laish, Genady Beryozkin, Gang Li, Izhak Shafran, Idan Szpektor, Tzvika Hartman, Avinatan Hassidim, Yossi Matias | | 14 29 0 | 17 Mar 2019 |
| Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling | Jonathan Shen, Patrick Nguyen, Yonghui Wu, Z. Chen, M. Chen, ..., William Chan, Shubham Toshniwal, Baohua Liao, M. Nirschl, Pat Rondon | VLM | 25 209 0 | 21 Feb 2019 |
| Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks | Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, F. Alleva | BDL | 22 81 0 | 08 Oct 2018 |
| Automatic Documentation of ICD Codes with Far-Field Speech Recognition | Albert Haque, Corinna Fukushima | | 11 0 0 | 30 Apr 2018 |
| End-to-End Multimodal Speech Recognition | Shruti Palaskar, Ramon Sanabria, Florian Metze | | 17 41 0 | 25 Apr 2018 |