Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech

23 March 2018

Papers citing "Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech"


Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition Olga Iakovenko Ivan Bondarenko 27 0 0 03 Oct 2024
Efficiency-oriented approaches for self-supervised speech representation learning Luis Lugo Valentin Vielzeuf SSL 29 1 0 18 Dec 2023
Leveraging multilingual transfer for unsupervised semantic acoustic word embeddings C. Jacobs Herman Kamper 32 1 0 05 Jul 2023
Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person L. Gris R. Marcacini Arnaldo Cândido Júnior Edresson Casanova A. S. Soares S. Aluísio 21 7 0 23 May 2023
Transformers in Speech Processing: A Survey S. Latif Aun Zaidi Heriberto Cuayáhuitl Fahad Shamshad Moazzam Shoukat Junaid Qadir 42 47 0 21 Mar 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages Sreepratha Ram Hanan Aldarmaki SSL 24 3 0 03 Jan 2023
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge Ewan Dunbar Nicolas Hamilakis Emmanuel Dupoux SSL 32 30 0 27 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR Zhehuai Chen Ankur Bapna Andrew Rosenberg Yu Zhang Bhuvana Ramabhadran Pedro J. Moreno Nanxin Chen 38 17 0 18 Oct 2022
TVLT: Textless Vision-Language Transformer Zineng Tang Jaemin Cho Yixin Nie Joey Tianyi Zhou VLM 51 28 0 28 Sep 2022
Homophone Reveals the Truth: A Reality Check for Speech2Vec Guangyu Chen 10 0 0 22 Sep 2022
Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings Badr M. Abdullah Bernd Möbius Dietrich Klakow 11 3 0 14 Sep 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 131 350 0 21 May 2022
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition N. J. Wang Zongfeng Quan Shaojun Wang Jing Xiao 13 1 0 08 Apr 2022
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT Rui Wang Qibing Bai Junyi Ao Long Zhou Zhixiang Xiong Zhihua Wei Yu Zhang Tom Ko Haizhou Li 34 61 0 29 Mar 2022
Audio Self-supervised Learning: A Survey Shuo Liu Adria Mallol-Ragolta Emilia Parada-Cabeleiro Kun Qian Xingshuo Jing Alexander Kathan Bin Hu Bjoern W. Schuller SSL 35 106 0 02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 19 11 0 01 Mar 2022
Capitalization and Punctuation Restoration: a Survey V. Pais D. Tufis 19 19 0 21 Nov 2021
Attention is All You Need? Good Embeddings with Statistics are enough: Large Scale Audio Understanding without Transformers/ Convolutions/ BERTs/ Mixers/ Attention/ RNNs or .... Prateek Verma AI4TS 32 2 0 07 Oct 2021
Layer-wise Analysis of a Self-supervised Speech Representation Model Ankita Pasad Ju-Chieh Chou Karen Livescu SSL 26 288 0 10 Jul 2021
Improved Language Identification Through Cross-Lingual Self-Supervised Learning Andros Tjandra Diptanu Gon Choudhury Frank Zhang Kritika Singh Alexis Conneau Alexei Baevski Assaf Sela Yatharth Saraf Michael Auli VLM SSL 24 35 0 08 Jul 2021
Multilingual transfer of acoustic word embeddings improves when training on languages related to the target zero-resource language C. Jacobs Herman Kamper 32 10 0 24 Jun 2021
Unsupervised Automatic Speech Recognition: A Review Hanan Aldarmaki Asad Ullah Nazar Zaki VLM SSL 39 56 0 09 Jun 2021
Speech Emotion Recognition using Semantic Information Panagiotis Tzirakis Anh-Tuan Nguyen S. Zafeiriou Björn W. Schuller 15 19 0 04 Mar 2021
A comparison of self-supervised speech representations as input features for unsupervised acoustic word embeddings Lisa van Staden Herman Kamper SSL 28 16 0 14 Dec 2020
A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings Puyuan Peng Herman Kamper Karen Livescu DRL SSL 14 14 0 03 Dec 2020
Utterance-level Intent Recognition from Keywords Wenda Chen Jonathan Huang M. Hasegawa-Johnson 11 1 0 17 Sep 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition Alexis Conneau Alexei Baevski R. Collobert Abdel-rahman Mohamed Michael Auli SSL 70 754 0 24 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder Kazi Nazmul Haque R. Rana Björn W Schuller DRL 26 12 0 01 Jun 2020
Identification of primary and collateral tracks in stuttered speech Rachid Riad Anne-Catherine Bachoud-Lévi Frank Rudzicz Emmanuel Dupoux 8 18 0 02 Mar 2020
Multilingual acoustic word embedding models for processing zero-resource languages Herman Kamper Yevgen Matusevych Sharon Goldwater 25 24 0 06 Feb 2020
Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion Wen-Chin Huang Hao Luo Hsin-Te Hwang Chen-Chou Lo Yu-Huai Peng Yu Tsao Hsin-Min Wang DRL 17 42 0 22 Jan 2020
Effectiveness of self-supervised pre-training for speech recognition Alexei Baevski Michael Auli Abdel-rahman Mohamed SSL 27 147 0 10 Nov 2019
SpeechBERT: An Audio-and-text Jointly Learned Language Model for End-to-end Spoken Question Answering Yung-Sung Chuang Chi-Liang Liu Hung-yi Lee Lin-shan Lee AuLLM 22 39 0 25 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders Andy T. Liu Shu-Wen Yang Po-Han Chi Po-Chun Hsu Hung-yi Lee SSL 26 372 0 25 Oct 2019
Generative Pre-Training for Speech with Autoregressive Predictive Coding Yu-An Chung James R. Glass SSL 17 173 0 23 Oct 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations Alexei Baevski Steffen Schneider Michael Auli SSL 11 660 0 12 Oct 2019
Audio-Linguistic Embeddings for Spoken Sentences Albert Haque Michelle Guo Prateek Verma Li Fei-Fei 25 51 0 20 Feb 2019
Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval Yi-Chen Chen Sung-Feng Huang Chia-Hao Shen Hung-yi Lee Lin-Shan Lee 46 37 0 21 Jul 2018
Effective Approaches to Attention-based Neural Machine Translation Thang Luong Hieu H. Pham Christopher D. Manning 218 7,925 0 17 Aug 2015