Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder

3 March 2016
Yu-An Chung
Chao-Chung Wu
Chia-Hao Shen
Hung-yi Lee
Lin-Shan Lee
    AI4TS

Papers citing "Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder"

44 / 94 papers shown
Disentangled Speech Embeddings using Cross-modal Self-supervision
Arsha Nagrani
Joon Son Chung
Samuel Albanie
Andrew Zisserman
SSL
21
88
0
20 Feb 2020
Improving automated segmentation of radio shows with audio embeddings
Oberon Berlage
Klaus-Michael Lux
David Graus
12
5
0
12 Feb 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
32
81
0
02 Jan 2020
Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski
Michael Auli
Abdel-rahman Mohamed
SSL
27
147
0
10 Nov 2019
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
Alexander H. Liu
Tao Tu
Hung-yi Lee
Lin-Shan Lee
SSL
35
50
0
28 Oct 2019
Learning audio representations via phase prediction
Félix de Chaumont Quitry
Marco Tagliasacchi
Dominik Roblek
SSL
AI4TS
9
10
0
25 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
28
372
0
25 Oct 2019
Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings
Myunghun Jung
Hyungjun Lim
Jahyun Goo
Youngmoon Jung
Hoirin Kim
14
14
0
01 Oct 2019
Representation Learning for Electronic Health Records
W. Weng
Peter Szolovits
33
19
0
19 Sep 2019
Learning Joint Acoustic-Phonetic Word Embeddings
Mohamed El-Geish
DRL
SSL
10
2
0
01 Aug 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
30
28
0
30 Jun 2019
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion
Andy T. Liu
Po-Chun Hsu
Hung-yi Lee
SSL
14
28
0
28 May 2019
Self-supervised audio representation learning for mobile devices
Marco Tagliasacchi
Beat Gfeller
Félix de Chaumont Quitry
Dominik Roblek
SSL
AI4TS
4
46
0
24 May 2019
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings
Yi-Chen Chen
Sung-Feng Huang
Hung-yi Lee
Lin-Shan Lee
SSL
14
0
0
10 Apr 2019
An Unsupervised Autoregressive Model for Speech Representation Learning
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
SSL
24
406
0
05 Apr 2019
Modeling Acoustic-Prosodic Cues for Word Importance Prediction in Spoken Dialogues
Sushant Kafle
Cecilia Ovesdotter Alm
Matt Huenerfauth
13
3
0
28 Mar 2019
Confusion2Vec: Towards Enriching Vector Space Word Representations with Representational Ambiguities
Prashanth Gurunath Shivakumar
Panayiotis Georgiou
14
23
0
08 Nov 2018
Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection
Sung-Feng Huang
Yi-Chen Chen
Hung-yi Lee
Lin-Shan Lee
AI4TS
19
5
0
07 Nov 2018
Towards Unsupervised Speech-to-Text Translation
Yu-An Chung
W. Weng
S. Tong
James R. Glass
34
42
0
04 Nov 2018
Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data
Yi-Chen Chen
Chia-Hao Shen
Sung-Feng Huang
Hung-yi Lee
Lin-Shan Lee
17
17
0
30 Oct 2018
Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection
Yu-Hsuan Wang
Hung-yi Lee
Lin-Shan Lee
27
54
0
07 Aug 2018
Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval
Yi-Chen Chen
Sung-Feng Huang
Chia-Hao Shen
Hung-yi Lee
Lin-Shan Lee
46
37
0
21 Jul 2018
Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis
Timothy Wong
Zhiyuan Luo
11
12
0
10 Jul 2018
Fast ASR-free and almost zero-resource keyword spotting using DTW and CNNs for humanitarian monitoring
Raghav Menon
Herman Kamper
John Quinn
T. Niesler
16
28
0
25 Jun 2018
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces
Yu-An Chung
W. Weng
S. Tong
James R. Glass
17
99
0
18 May 2018
Towards a universal neural network encoder for time series
Joan Serrà
Santiago Pascual
Alexandros Karatzoglou
AI4TS
32
119
0
10 May 2018
Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders
F. Bianchi
L. Livi
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
AI4TS
25
11
0
09 May 2018
Unspeech: Unsupervised Speech Context Embeddings
Benjamin Milde
Chris Biemann
SSL
19
28
0
18 Apr 2018
Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings
Da-Rong Liu
Kuan-Yu Chen
Hung-yi Lee
Lin-Shan Lee
SSL
21
48
0
01 Apr 2018
Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only
Yi-Chen Chen
Chia-Hao Shen
Sung-Feng Huang
Hung-yi Lee
12
19
0
29 Mar 2018
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech
Yu-An Chung
James R. Glass
3DV
34
184
0
23 Mar 2018
Supervised and Unsupervised Transfer Learning for Question Answering
Yu-An Chung
Hung-yi Lee
James R. Glass
27
83
0
14 Nov 2017
Learning Word Embeddings from Speech
Yu-An Chung
James R. Glass
SSL
28
19
0
05 Nov 2017
A Tutorial on Deep Learning for Music Information Retrieval
Keunwoo Choi
Gyorgy Fazekas
Kyunghyun Cho
Mark Sandler
VLM
17
91
0
13 Sep 2017
Deep Learning Techniques for Music Generation -- A Survey
Jean-Pierre Briot
Gaëtan Hadjeres
F. Pachet
MGen
37
297
0
05 Sep 2017
Query-by-example Spoken Term Detection using Attention-based Multi-hop Networks
Chia-Wei Ao
Hung-yi Lee
11
21
0
01 Sep 2017
Learning audio sequence representations for acoustic event classification
Zixing Zhang
Ding Liu
Jing Han
Kun Qian
Björn Schuller
SSL
AI4TS
40
14
0
27 Jul 2017
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data
Chia-Hao Shen
Janet Y. Sung
Hung-yi Lee
24
5
0
19 Jul 2017
TimeNet: Pre-trained deep recurrent neural network for time series classification
Pankaj Malhotra
T. Vishnu
L. Vig
Puneet Agarwal
Gautam M. Shroff
AI4TS
17
170
0
23 Jun 2017
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries
Yu-Hsuan Wang
Cheng-Tao Chung
Hung-yi Lee
13
40
0
22 Mar 2017
Sound-Word2Vec: Learning Word Representations Grounded in Sounds
Ashwin K. Vijayakumar
Ramakrishna Vedantam
Devi Parikh
26
22
0
06 Mar 2017
End-to-End ASR-free Keyword Search from Speech
Kartik Audhkhasi
Andrew Rosenberg
A. Sethy
Bhuvana Ramabhadran
Brian Kingsbury
18
111
0
13 Jan 2017
Unsupervised neural and Bayesian models for zero-resource speech processing
Herman Kamper
SSL
BDL
16
8
0
03 Jan 2017
Discriminative Acoustic Word Embeddings: Recurrent Neural Network-Based Approaches
Shane Settle
Karen Livescu
22
87
0
08 Nov 2016