1603.00982
Cited By
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder
3 March 2016
Yu-An Chung
Chao-Chung Wu
Chia-Hao Shen
Hung-yi Lee
Lin-Shan Lee
AI4TS
Papers citing
"Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder"
44 / 94 papers shown
Disentangled Speech Embeddings using Cross-modal Self-supervision
Arsha Nagrani
Joon Son Chung
Samuel Albanie
Andrew Zisserman
SSL
21
88
0
20 Feb 2020
Improving automated segmentation of radio shows with audio embeddings
Oberon Berlage
Klaus-Michael Lux
David Graus
12
5
0
12 Feb 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
32
81
0
02 Jan 2020
Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski
Michael Auli
Abdel-rahman Mohamed
SSL
27
147
0
10 Nov 2019
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
Alexander H. Liu
Tao Tu
Hung-yi Lee
Lin-Shan Lee
SSL
35
50
0
28 Oct 2019
Learning audio representations via phase prediction
Félix de Chaumont Quitry
Marco Tagliasacchi
Dominik Roblek
SSL
AI4TS
9
10
0
25 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
28
372
0
25 Oct 2019
Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings
Myunghun Jung
Hyungjun Lim
Jahyun Goo
Youngmoon Jung
Hoirin Kim
14
14
0
01 Oct 2019
Representation Learning for Electronic Health Records
W. Weng
Peter Szolovits
33
19
0
19 Sep 2019
Learning Joint Acoustic-Phonetic Word Embeddings
Mohamed El-Geish
DRL
SSL
10
2
0
01 Aug 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
30
28
0
30 Jun 2019
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion
Andy T. Liu
Po-Chun Hsu
Hung-yi Lee
SSL
14
28
0
28 May 2019
Self-supervised audio representation learning for mobile devices
Marco Tagliasacchi
Beat Gfeller
Félix de Chaumont Quitry
Dominik Roblek
SSL
AI4TS
4
46
0
24 May 2019
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings
Yi-Chen Chen
Sung-Feng Huang
Hung-yi Lee
Lin-Shan Lee
SSL
14
0
0
10 Apr 2019
An Unsupervised Autoregressive Model for Speech Representation Learning
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
SSL
24
406
0
05 Apr 2019
Modeling Acoustic-Prosodic Cues for Word Importance Prediction in Spoken Dialogues
Sushant Kafle
Cecilia Ovesdotter Alm
Matt Huenerfauth
13
3
0
28 Mar 2019
Confusion2Vec: Towards Enriching Vector Space Word Representations with Representational Ambiguities
K. K. Thekumparampil
Zinan Lin
14
23
0
08 Nov 2018
Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection
Sung-Feng Huang
Yi-Chen Chen
Hung-yi Lee
Lin-Shan Lee
AI4TS
19
5
0
07 Nov 2018
Towards Unsupervised Speech-to-Text Translation
Yu-An Chung
W. Weng
S. Tong
James R. Glass
34
42
0
04 Nov 2018
Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data
Yi-Chen Chen
Chia-Hao Shen
Sung-Feng Huang
Hung-yi Lee
Lin-Shan Lee
17
17
0
30 Oct 2018
Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection
Yu-Hsuan Wang
Hung-yi Lee
Lin-Shan Lee
27
54
0
07 Aug 2018
Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval
Yi-Chen Chen
Sung-Feng Huang
Chia-Hao Shen
Hung-yi Lee
Lin-Shan Lee
46
37
0
21 Jul 2018
Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis
Timothy Wong
Zhiyuan Luo
11
12
0
10 Jul 2018
Fast ASR-free and almost zero-resource keyword spotting using DTW and CNNs for humanitarian monitoring
Raghav Menon
Herman Kamper
John Quinn
T. Niesler
16
28
0
25 Jun 2018
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces
Yu-An Chung
W. Weng
S. Tong
James R. Glass
17
99
0
18 May 2018
Towards a universal neural network encoder for time series
Joan Serrà
Santiago Pascual
Alexandros Karatzoglou
AI4TS
32
119
0
10 May 2018
Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders
F. Bianchi
L. Livi
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
AI4TS
25
11
0
09 May 2018
Unspeech: Unsupervised Speech Context Embeddings
Benjamin Milde
Chris Biemann
SSL
19
28
0
18 Apr 2018
Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings
Da-Rong Liu
Kuan-Yu Chen
Hung-yi Lee
Lin-Shan Lee
SSL
21
48
0
01 Apr 2018
Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only
Yi-Chen Chen
Chia-Hao Shen
Sung-Feng Huang
Hung-yi Lee
12
19
0
29 Mar 2018
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech
Yu-An Chung
James R. Glass
3DV
34
184
0
23 Mar 2018
Supervised and Unsupervised Transfer Learning for Question Answering
Yu-An Chung
Hung-yi Lee
James R. Glass
27
83
0
14 Nov 2017
Learning Word Embeddings from Speech
Yu-An Chung
James R. Glass
SSL
28
19
0
05 Nov 2017
A Tutorial on Deep Learning for Music Information Retrieval
Keunwoo Choi
Gyorgy Fazekas
Kyunghyun Cho
Mark Sandler
VLM
17
91
0
13 Sep 2017
Deep Learning Techniques for Music Generation -- A Survey
Jean-Pierre Briot
Gaëtan Hadjeres
F. Pachet
MGen
37
297
0
05 Sep 2017
Query-by-example Spoken Term Detection using Attention-based Multi-hop Networks
Chia-Wei Ao
Hung-yi Lee
11
21
0
01 Sep 2017
Learning audio sequence representations for acoustic event classification
Zixing Zhang
Ding Liu
Jing Han
Kun Qian
Björn Schuller
SSL
AI4TS
40
14
0
27 Jul 2017
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data
Chia-Hao Shen
Janet Y. Sung
Hung-yi Lee
24
5
0
19 Jul 2017
TimeNet: Pre-trained deep recurrent neural network for time series classification
Pankaj Malhotra
T. Vishnu
L. Vig
Puneet Agarwal
Gautam M. Shroff
AI4TS
17
170
0
23 Jun 2017
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries
Yu-Hsuan Wang
Cheng-Tao Chung
Hung-yi Lee
13
40
0
22 Mar 2017
Sound-Word2Vec: Learning Word Representations Grounded in Sounds
Ashwin K. Vijayakumar
Ramakrishna Vedantam
Devi Parikh
26
22
0
06 Mar 2017
End-to-End ASR-free Keyword Search from Speech
Kartik Audhkhasi
Andrew Rosenberg
A. Sethy
Bhuvana Ramabhadran
Brian Kingsbury
18
111
0
13 Jan 2017
Unsupervised neural and Bayesian models for zero-resource speech processing
Herman Kamper
SSL
BDL
16
8
0
03 Jan 2017
Discriminative Acoustic Word Embeddings: Recurrent Neural Network-Based Approaches
Shane Settle
Karen Livescu
22
87
0
08 Nov 2016