Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder

3 March 2016

Papers citing "Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder"

44 / 94 papers shown

Disentangled Speech Embeddings using Cross-modal Self-supervision Arsha Nagrani Joon Son Chung Samuel Albanie Andrew Zisserman SSL 21 88 0 20 Feb 2020
Improving automated segmentation of radio shows with audio embeddings Oberon Berlage Klaus-Michael Lux David Graus 12 5 0 12 Feb 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends S. Latif R. Rana Sara Khalifa Raja Jurdak Junaid Qadir Björn W. Schuller AI4TS 32 81 0 02 Jan 2020
Effectiveness of self-supervised pre-training for speech recognition Alexei Baevski Michael Auli Abdel-rahman Mohamed SSL 27 147 0 10 Nov 2019
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning Alexander H. Liu Tao Tu Hung-yi Lee Lin-Shan Lee SSL 35 50 0 28 Oct 2019
Learning audio representations via phase prediction Félix de Chaumont Quitry Marco Tagliasacchi Dominik Roblek SSL AI4TS 9 10 0 25 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders Andy T. Liu Shu-Wen Yang Po-Han Chi Po-Chun Hsu Hung-yi Lee SSL 28 372 0 25 Oct 2019
Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings Myunghun Jung Hyungjun Lim Jahyun Goo Youngmoon Jung Hoirin Kim 14 14 0 01 Oct 2019
Representation Learning for Electronic Health Records W. Weng Peter Szolovits 33 19 0 19 Sep 2019
Learning Joint Acoustic-Phonetic Word Embeddings Mohamed El-Geish DRL SSL 10 2 0 01 Aug 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition Shaoshi Ling Julian Salazar Yuzong Liu Katrin Kirchhoff SSL 30 28 0 30 Jun 2019
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion Andy T. Liu Po-Chun Hsu Hung-yi Lee SSL 14 28 0 28 May 2019
Self-supervised audio representation learning for mobile devices Marco Tagliasacchi Beat Gfeller Félix de Chaumont Quitry Dominik Roblek SSL AI4TS 4 46 0 24 May 2019
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings Yi-Chen Chen Sung-Feng Huang Hung-yi Lee Lin-Shan Lee SSL 14 0 0 10 Apr 2019
An Unsupervised Autoregressive Model for Speech Representation Learning Yu-An Chung Wei-Ning Hsu Hao Tang James R. Glass SSL 24 406 0 05 Apr 2019
Modeling Acoustic-Prosodic Cues for Word Importance Prediction in Spoken Dialogues Sushant Kafle Cecilia Ovesdotter Alm Matt Huenerfauth 13 3 0 28 Mar 2019
Confusion2Vec: Towards Enriching Vector Space Word Representations with Representational Ambiguities Prashanth Gurunath Shivakumar Panayiotis Georgiou 14 23 0 08 Nov 2018
Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection Sung-Feng Huang Yi-Chen Chen Hung-yi Lee Lin-Shan Lee AI4TS 19 5 0 07 Nov 2018
Towards Unsupervised Speech-to-Text Translation Yu-An Chung W. Weng S. Tong James R. Glass 34 42 0 04 Nov 2018
Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data Yi-Chen Chen Chia-Hao Shen Sung-Feng Huang Hung-yi Lee Lin-Shan Lee 17 17 0 30 Oct 2018
Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection Yu-Hsuan Wang Hung-yi Lee Lin-Shan Lee 27 54 0 07 Aug 2018
Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval Yi-Chen Chen Sung-Feng Huang Chia-Hao Shen Hung-yi Lee Lin-Shan Lee 46 37 0 21 Jul 2018
Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis Timothy Wong Zhiyuan Luo 11 12 0 10 Jul 2018
Fast ASR-free and almost zero-resource keyword spotting using DTW and CNNs for humanitarian monitoring Raghav Menon Herman Kamper John Quinn T. Niesler 16 28 0 25 Jun 2018
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces Yu-An Chung W. Weng S. Tong James R. Glass 17 99 0 18 May 2018
Towards a universal neural network encoder for time series Joan Serrà Santiago Pascual Alexandros Karatzoglou AI4TS 32 119 0 10 May 2018
Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders F. Bianchi L. Livi Karl Øyvind Mikalsen Michael C. Kampffmeyer Robert Jenssen AI4TS 25 11 0 09 May 2018
Unspeech: Unsupervised Speech Context Embeddings Benjamin Milde Chris Biemann SSL 19 28 0 18 Apr 2018
Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings Da-Rong Liu Kuan-Yu Chen Hung-yi Lee Lin-Shan Lee SSL 21 48 0 01 Apr 2018
Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only Yi-Chen Chen Chia-Hao Shen Sung-Feng Huang Hung-yi Lee 12 19 0 29 Mar 2018
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech Yu-An Chung James R. Glass 3DV 34 184 0 23 Mar 2018
Supervised and Unsupervised Transfer Learning for Question Answering Yu-An Chung Hung-yi Lee James R. Glass 27 83 0 14 Nov 2017
Learning Word Embeddings from Speech Yu-An Chung James R. Glass SSL 28 19 0 05 Nov 2017
A Tutorial on Deep Learning for Music Information Retrieval Keunwoo Choi Gyorgy Fazekas Kyunghyun Cho Mark Sandler VLM 17 91 0 13 Sep 2017
Deep Learning Techniques for Music Generation -- A Survey Jean-Pierre Briot Gaëtan Hadjeres F. Pachet MGen 37 297 0 05 Sep 2017
Query-by-example Spoken Term Detection using Attention-based Multi-hop Networks Chia-Wei Ao Hung-yi Lee 11 21 0 01 Sep 2017
Learning audio sequence representations for acoustic event classification Zixing Zhang Ding Liu Jing Han Kun Qian Björn Schuller SSL AI4TS 40 14 0 27 Jul 2017
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data Chia-Hao Shen Janet Y. Sung Hung-yi Lee 24 5 0 19 Jul 2017
TimeNet: Pre-trained deep recurrent neural network for time series classification Pankaj Malhotra T. Vishnu L. Vig Puneet Agarwal Gautam M. Shroff AI4TS 17 170 0 23 Jun 2017
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries Yu-Hsuan Wang Cheng-Tao Chung Hung-yi Lee 13 40 0 22 Mar 2017
Sound-Word2Vec: Learning Word Representations Grounded in Sounds Ashwin K. Vijayakumar Ramakrishna Vedantam Devi Parikh 26 22 0 06 Mar 2017
End-to-End ASR-free Keyword Search from Speech Kartik Audhkhasi Andrew Rosenberg A. Sethy Bhuvana Ramabhadran Brian Kingsbury 18 111 0 13 Jan 2017
Unsupervised neural and Bayesian models for zero-resource speech processing Herman Kamper SSL BDL 16 8 0 03 Jan 2017
Discriminative Acoustic Word Embeddings: Recurrent Neural Network-Based Approaches Shane Settle Karen Livescu 22 87 0 08 Nov 2016