arXiv:2205.08180
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
17 May 2022
Sameer Khurana
Antoine Laurent
James R. Glass
Papers citing
"SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation"
23 / 23 papers shown
Title
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation
J. Duret
Yannick Esteve
Titouan Parcollet
36
0
0
08 Jul 2024
Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect
Salima Mdhaffar
Haroun Elleuch
Fethi Bougares
Yannick Esteve
49
0
0
05 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation
Rao Ma
Yassir Fathullah
Mengjie Qian
Siyuan Tang
Mark J. F. Gales
Kate Knill
20
1
0
01 Jul 2024
A dual task learning approach to fine-tune a multilingual semantic speech encoder for Spoken Language Understanding
G. Laperriere
Sahar Ghannay
Bassam Jabaian
Yannick Esteve
21
0
0
17 Jun 2024
Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems
Frank Palma Gomez
Ramon Sanabria
Yun-hsuan Sung
Daniel Matthew Cer
Siddharth Dalmia
Gustavo Hernández Ábrego
VLM
33
3
0
02 Apr 2024
New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark
Nadege Alavoine
G. Laperriere
Christophe Servan
Sahar Ghannay
Sophie Rosset
VLM
22
0
0
28 Mar 2024
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
Jian Zhu
Changbing Yang
Farhan Samir
Jahurul Islam
25
4
0
14 Nov 2023
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer
Paul-Ambroise Duquenne
Holger Schwenk
Benoît Sagot
29
3
0
05 Oct 2023
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning
William Chen
Jiatong Shi
Brian Yan
Dan Berrebbi
Wangyou Zhang
Yifan Peng
Xuankai Chang
Soumi Maiti
Shinji Watanabe
24
8
0
26 Sep 2023
Cross-modal Alignment with Optimal Transport for CTC-based ASR
Xugang Lu
Peng Shen
Yu Tsao
Hisashi Kawai
17
4
0
24 Sep 2023
Direct Text to Speech Translation System using Acoustic Units
Victoria Mingote
Pablo Gimeno
Luis Vicente
Sameer Khurana
Antoine Laurent
J. Duret
17
3
0
14 Sep 2023
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations
Paul-Ambroise Duquenne
Holger Schwenk
Benoît Sagot
AI4TS
VLM
16
58
0
22 Aug 2023
Semantic enrichment towards efficient speech representations
G. Laperriere
H. Nguyen
Sahar Ghannay
Bassam Jabaian
Yannick Esteve
43
2
0
03 Jul 2023
NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track
Edward Gow-Smith
Alexandre Berard
Marcely Zanon Boito
Ioan Calapodescu
10
12
0
13 Jun 2023
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation
Sameer Khurana
Nauman Dawalatabad
Antoine Laurent
Luis Vicente
Pablo Gimeno
Victoria Mingote
James R. Glass
VLM
14
1
0
01 Jun 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
77
250
0
02 Mar 2023
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
Mingda Chen
Paul-Ambroise Duquenne
Pierre Yves Andrews
Justine T. Kao
Alexandre Mourachko
Holger Schwenk
Marta R. Costa-jussá
14
17
0
16 Dec 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
21
34
0
08 Nov 2022
token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text
Xianghu Yue
Junyi Ao
Xiaoxue Gao
Haizhou Li
SSL
26
8
0
30 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
30
2
0
23 Oct 2022
On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding
G. Laperriere
Valentin Pelloin
Mickael Rouvier
Themos Stafylakis
Yannick Esteve
27
9
0
11 Oct 2022
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model
Yi-Jen Shih
Hsuan-Fu Wang
Heng-Jui Chang
Layne Berry
Hung-yi Lee
David F. Harwath
VLM
CLIP
38
32
0
03 Oct 2022
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,724
0
26 Sep 2016