SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation

17 May 2022

Papers citing "SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation"


Title
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation J. Duret Yannick Esteve Titouan Parcollet 36 0 0 08 Jul 2024
Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect Salima Mdhaffar Haroun Elleuch Fethi Bougares Yannick Esteve 49 0 0 05 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation Rao Ma Yassir Fathullah Mengjie Qian Siyuan Tang Mark J. F. Gales Kate Knill 20 1 0 01 Jul 2024
A dual task learning approach to fine-tune a multilingual semantic speech encoder for Spoken Language Understanding G. Laperriere Sahar Ghannay Bassam Jabaian Yannick Esteve 21 0 0 17 Jun 2024
Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems Frank Palma Gomez Ramon Sanabria Yun-hsuan Sung Daniel Matthew Cer Siddharth Dalmia Gustavo Hernández Ábrego VLM 33 3 0 02 Apr 2024
New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark Nadege Alavoine G. Laperriere Christophe Servan Sahar Ghannay Sophie Rosset VLM 22 0 0 28 Mar 2024
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language Jian Zhu Changbing Yang Farhan Samir Jahurul Islam 25 4 0 14 Nov 2023
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer Paul-Ambroise Duquenne Holger Schwenk Benoît Sagot 29 3 0 05 Oct 2023
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning William Chen Jiatong Shi Brian Yan Dan Berrebbi Wangyou Zhang Yifan Peng Xuankai Chang Soumi Maiti Shinji Watanabe 24 8 0 26 Sep 2023
Cross-modal Alignment with Optimal Transport for CTC-based ASR Xugang Lu Peng Shen Yu Tsao Hisashi Kawai 17 4 0 24 Sep 2023
Direct Text to Speech Translation System using Acoustic Units Victoria Mingote Pablo Gimeno Luis Vicente Sameer Khurana Antoine Laurent J. Duret 17 3 0 14 Sep 2023
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations Paul-Ambroise Duquenne Holger Schwenk Benoît Sagot AI4TS VLM 16 58 0 22 Aug 2023
Semantic enrichment towards efficient speech representations G. Laperriere H. Nguyen Sahar Ghannay Bassam Jabaian Yannick Esteve 43 2 0 03 Jul 2023
NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track Edward Gow-Smith Alexandre Berard Marcely Zanon Boito Ioan Calapodescu 10 12 0 13 Jun 2023
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation Sameer Khurana Nauman Dawalatabad Antoine Laurent Luis Vicente Pablo Gimeno Victoria Mingote James R. Glass VLM 14 1 0 01 Jun 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages Yu Zhang Wei Han James Qin Yongqiang Wang Ankur Bapna ... Pedro J. Moreno Chung-Cheng Chiu J. Schalkwyk Françoise Beaufays Yonghui Wu VLM 77 250 0 02 Mar 2023
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric Mingda Chen Paul-Ambroise Duquenne Pierre Yves Andrews Justine T. Kao Alexandre Mourachko Holger Schwenk Marta R. Costa-jussá 14 17 0 16 Dec 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations Paul-Ambroise Duquenne Hongyu Gong Ning Dong Jingfei Du Ann Lee Vedanuj Goswami Changhan Wang J. Pino Benoît Sagot Holger Schwenk 21 34 0 08 Nov 2022
token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text Xianghu Yue Junyi Ao Xiaoxue Gao Haizhou Li SSL 26 8 0 30 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings Jian Zhu Zuoyu Tian Yadong Liu Cong Zhang Chia-wen Lo SSL 30 2 0 23 Oct 2022
On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding G. Laperriere Valentin Pelloin Mickael Rouvier Themos Stafylakis Yannick Esteve 27 9 0 11 Oct 2022
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model Yi-Jen Shih Hsuan-Fu Wang Heng-Jui Chang Layne Berry Hung-yi Lee David F. Harwath VLM CLIP 38 32 0 03 Oct 2022
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Z. Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 716 6,724 0 26 Sep 2016