Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces

18 May 2018
Yu-An Chung
W. Weng
S. Tong
James R. Glass
arXiv:1805.07467

Papers citing "Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces"

24 / 24 papers shown
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
P. Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
51
0
0
10 Oct 2023
Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search
Yuanmin Tang
Daling Wang
Keke Gai
Wenfang Wu
Yifei Zhang
Gang Xiong
Qi Wu
31
4
0
28 Sep 2023
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling
Zheng Nan
T. Dang
V. Sethu
Beena Ahmed
BDL
27
2
0
21 Sep 2023
Information Theory-Guided Heuristic Progressive Multi-View Coding
Jiangmeng Li
Hang Gao
Wenwen Qiang
Changwen Zheng
22
2
0
21 Aug 2023
Homophone Reveals the Truth: A Reality Check for Speech2Vec
Guangyu Chen
16
0
0
22 Sep 2022
Distilling a Pretrained Language Model to a Multilingual ASR Model
Kwanghee Choi
Hyung-Min Park
VLM
31
11
0
25 Jun 2022
Non-Parametric Domain Adaptation for End-to-End Speech Translation
Yichao Du
Weizhi Wang
Zhirui Zhang
Boxing Chen
Tong Xu
Jun Xie
Enhong Chen
53
18
0
23 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
137
352
0
21 May 2022
Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization
Wei Wei
Hengguan Huang
Xiangming Gu
Hao Wang
Ye Wang
BDL
27
0
0
05 May 2022
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning
Robin Algayres
Adel Nabli
Benoît Sagot
Emmanuel Dupoux
SSL
23
8
0
11 Apr 2022
An Analysis of Semantically-Aligned Speech-Text Embeddings
M. Huzaifah
Ivan Kukanov
35
7
0
04 Apr 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
40
106
0
02 Mar 2022
Unsupervised Automatic Speech Recognition: A Review
Hanan Aldarmaki
Asad Ullah
Nazar Zaki
VLM
SSL
39
57
0
09 Jun 2021
Speech Emotion Recognition using Semantic Information
Panagiotis Tzirakis
Anh-Tuan Nguyen
S. Zafeiriou
Björn W. Schuller
17
19
0
04 Mar 2021
Deep Partial Multi-View Learning
Changqing Zhang
Yajie Cui
Zongbo Han
Qiufeng Wang
Huazhu Fu
Q. Hu
32
220
0
12 Nov 2020
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
22
7
0
11 Nov 2020
SpeechBERT: An Audio-and-text Jointly Learned Language Model for End-to-end Spoken Question Answering
Yung-Sung Chuang
Chi-Liang Liu
Hung-yi Lee
Lin-shan Lee
AuLLM
24
39
0
25 Oct 2019
Representation Learning for Electronic Health Records
W. Weng
Peter Szolovits
36
19
0
19 Sep 2019
Universal Adversarial Audio Perturbations
Sajjad Abdoli
L. G. Hafemann
Jérôme Rony
Ismail Ben Ayed
P. Cardinal
Alessandro Lameiras Koerich
AAML
25
51
0
08 Aug 2019
Self-supervised audio representation learning for mobile devices
Marco Tagliasacchi
Beat Gfeller
Félix de Chaumont Quitry
Dominik Roblek
SSL
AI4TS
4
46
0
24 May 2019
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings
Yi-Chen Chen
Sung-Feng Huang
Hung-yi Lee
Lin-Shan Lee
SSL
16
0
0
10 Apr 2019
Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching
Chih-Kuan Yeh
Jianshu Chen
Chengzhu Yu
Dong Yu
13
40
0
23 Dec 2018
Unsupervised Multimodal Representation Learning across Medical Images and Reports
T. Hsu
W. Weng
Willie Boag
Matthew B. A. McDermott
Peter Szolovits
SSL
25
35
0
21 Nov 2018
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
Marc'Aurelio Ranzato
Ludovic Denoyer
Hervé Jégou
189
1,639
0
11 Oct 2017