Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder

3 March 2016

Papers citing "Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder"

50 / 94 papers shown

Visually Grounded Speech Models have a Mutual Exclusivity Bias Leanne Nortje Dan Oneaţă Yevgen Matusevych Herman Kamper SSL 47 0 0 20 Mar 2024
Acoustic models of Brazilian Portuguese Speech based on Neural Transformers M. Gauy Marcelo Finger 22 2 0 14 Dec 2023
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors Shuyue Stella Li Beining Xu Xiangyu Zhang Hexin Liu Wen-Han Chao Leibny Paola García SSL 37 4 0 27 Nov 2023
Spoken Word2Vec: Learning Skipgram Embeddings from Speech Mohammad Amaan Sayeed Hanan Aldarmaki 22 0 0 15 Nov 2023
Matching Latent Encoding for Audio-Text based Keyword Spotting K. Nishu Minsik Cho Devang Naik 9 14 0 08 Jun 2023
Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili C. Jacobs Nathanaël Carraz Rakotonirina E. Chimoto Bruce A. Bassett Herman Kamper 27 5 0 01 Jun 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss Hiroshi Sato Ryo Masumura Tsubasa Ochiai Marc Delcroix Takafumi Moriya ... Kentaro Shinayama Saki Mizuno Mana Ihori Tomohiro Tanaka Nobukatsu Hojo 37 5 0 24 May 2023
Exploring How Generative Adversarial Networks Learn Phonological Representations Jing Chen Micha Elsner GAN 19 3 0 21 May 2023
A Survey on Time-Series Pre-Trained Models Qianli Ma Ziqiang Liu Zhenjing Zheng Ziyang Huang Siying Zhu Zhongzhong Yu James T. Kwok AI4TS 31 50 0 18 May 2023
End-to-End Speech Recognition: A Survey Rohit Prabhavalkar Takaaki Hori Tara N. Sainath Ralf Schlüter Shinji Watanabe VLM 26 150 0 03 Mar 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages Sreepratha Ram Hanan Aldarmaki SSL 24 3 0 03 Jan 2023
TESSP: Text-Enhanced Self-Supervised Speech Pre-training Zhuoyuan Yao Shuo Ren Sanyuan Chen Ziyang Ma Pengcheng Guo Linfu Xie 24 5 0 24 Nov 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach Xulong Zhang Jianzong Wang Ning Cheng Kexin Zhu Jing Xiao 21 0 0 25 Oct 2022
TVLT: Textless Vision-Language Transformer Zineng Tang Jaemin Cho Yixin Nie Joey Tianyi Zhou VLM 51 28 0 28 Sep 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network Da-Rong Liu Po-Chun Hsu Yi-Chen Chen Sung-Feng Huang Shun-Po Chuang Da-Yi Wu Hung-yi Lee GAN 15 7 0 29 Jul 2022
Towards Proper Contrastive Self-supervised Learning Strategies For Music Audio Representation Jeong-Eun Choi Seongwon Jang Hyunsouk Cho Sehee Chung SSL 16 6 0 10 Jul 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 137 350 0 21 May 2022
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning Robin Algayres Adel Nabli Benoît Sagot Emmanuel Dupoux SSL 23 8 0 11 Apr 2022
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition N. J. Wang Zongfeng Quan Shaojun Wang Jing Xiao 23 1 0 08 Apr 2022
Asymmetric Proxy Loss for Multi-View Acoustic Word Embeddings Myunghun Jung Hoirin Kim 19 3 0 30 Mar 2022
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data Gašper Beguš Alan Zhou SSL 27 4 0 22 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 19 11 0 01 Mar 2022
On Training Targets and Activation Functions for Deep Representation Learning in Text-Dependent Speaker Verification A. Sarkar Zheng-Hua Tan 16 2 0 17 Jan 2022
Deep Spoken Keyword Spotting: An Overview Iván López-Espejo Zheng-Hua Tan John H. L. Hansen Jesper Jensen 21 100 0 20 Nov 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training Ankur Bapna Yu-An Chung Na Wu Anmol Gulati Ye Jia J. Clark Melvin Johnson Jason Riesa Alexis Conneau Yu Zhang VLM 61 94 0 20 Oct 2021
Interpreting intermediate convolutional layers in unsupervised acoustic word classification Gašper Beguš Alan Zhou FAtt SSL 33 5 0 05 Oct 2021
Modeling Dynamics of Facial Behavior for Mental Health Assessment Minh Tran Ellen R. Bradley Michelle Matvey J. Woolley M. Soleymani CVBM 17 3 0 23 Aug 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation Jian Luo Jianzong Wang Ning Cheng Jing Xiao SSL 27 6 0 09 Jul 2021
Unsupervised Automatic Speech Recognition: A Review Hanan Aldarmaki Asad Ullah Nazar Zaki VLM SSL 39 56 0 09 Jun 2021
A Novel Semi-supervised Framework for Call Center Agent Malpractice Detection via Neural Feature Learning Şükrü Ozan Leonardo O. Iheme 12 4 0 04 Jun 2021
Unsupervised Discriminative Learning of Sounds for Audio Event Classification Sascha Hornauer Ke Li Stella X. Yu Shabnam Ghaffarzadegan Liu Ren SSL 26 5 0 19 May 2021
Interpreting intermediate convolutional layers of generative CNNs trained on waveforms Gašper Beguš Alan Zhou 27 7 0 19 Apr 2021
Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales Jacob Andreas Gašper Beguš M. Bronstein R. Diamant Denley Delaney ... D. Tchernov P. Tønnesen Antonio Torralba Daniel M. Vogt Robert J. Wood 43 10 0 17 Apr 2021
Utilizing Self-supervised Representations for MOS Prediction Wei-Cheng Tseng Chien-yu Huang Wei-Tsung Kao Yist Y. Lin Hung-yi Lee SSL 27 63 0 07 Apr 2021
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines Jingsong Wang Yuxuan He Chunyu Zhao Qijie Shao Wei-Wei Tu Tom Ko Hung-yi Lee Lei Xie 26 4 0 31 Mar 2021
Broad-UNet: Multi-scale feature learning for nowcasting tasks Jesús García Fernández S. Mehrkanoon 27 66 0 12 Feb 2021
A comparison of self-supervised speech representations as input features for unsupervised acoustic word embeddings Lisa van Staden Herman Kamper SSL 31 16 0 14 Dec 2020
Acoustic span embeddings for multilingual query-by-example search Yushi Hu Shane Settle Karen Livescu RALM 17 8 0 24 Nov 2020
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training Sung-Feng Huang Shun-Po Chuang Da-Rong Liu Yi-Chen Chen Gene-Ping Yang Hung-yi Lee SSL 39 14 0 29 Oct 2020
Probing Acoustic Representations for Phonetic Properties Danni Ma Neville Ryant M. Liberman 25 45 0 25 Oct 2020
Contrastive Learning of General-Purpose Audio Representations Aaqib Saeed David Grangier Neil Zeghidour VLM SSL 24 262 0 21 Oct 2020
Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication Gašper Beguš GAN SSL 8 15 0 13 Sep 2020
Automatic Detection of Phonological Errors in Child Speech Using Siamese Recurrent Autoencoder Si-Ioi Ng Tan Lee 9 7 0 07 Aug 2020
Evaluating computational models of infant phonetic learning across languages Yevgen Matusevych Thomas Schatz Herman Kamper Naomi H Feldman Sharon Goldwater 19 14 0 06 Aug 2020
Evaluating the reliability of acoustic speech embeddings Robin Algayres Mohamed Salah Zaiem Benoît Sagot Emmanuel Dupoux 38 29 0 27 Jul 2020
Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings Bowen Shi Shane Settle Karen Livescu 22 4 0 01 Jul 2020
CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks Gašper Beguš GAN 6 33 0 04 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder Kazi Nazmul Haque R. Rana Björn W Schuller DRL 26 12 0 01 Jun 2020
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding Yu-An Chung James R. Glass SSL 15 56 0 11 Apr 2020
Analyzing autoencoder-based acoustic word embeddings Yevgen Matusevych Herman Kamper Sharon Goldwater 30 12 0 03 Apr 2020