2308.11466
Cited By
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations
22 August 2023
Paul-Ambroise Duquenne
Holger Schwenk
Benoît Sagot
AI4TS
VLM
Papers citing
"SONAR: Sentence-Level Multimodal and Language-Agnostic Representations"
39 / 39 papers shown
Title
Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data
Maxime Bouthors
Josep Crego
François Yvon
RALM
LRM
44
0
0
30 Apr 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM
OffRL
LRM
80
12
0
27 Mar 2025
Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation
A. Zebaze
Benoît Sagot
Rachel Bawden
70
0
0
06 Mar 2025
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
Heeseung Kim
Che Hyun Lee
S. Park
Jiheum Yeom
Nohil Park
Sangwon Yu
Sungroh Yoon
64
0
0
27 Feb 2025
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations
Hyunji Lee
Danni Liu
Supriti Sinhamahapatra
Jan Niehues
106
0
0
21 Feb 2025
BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation
Omnilingual MT Team
Pierre Yves Andrews
Mikel Artetxe
Mariano Coria Meglioli
Marta R. Costa-jussá
...
Eduardo Sánchez
Ioannis Tsiamas
Arina Turkatenko
Albert Ventayol-Boada
Shireen Yates
98
0
0
06 Feb 2025
News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation
Andreea Iana
Fabian David Schmidt
Goran Glavas
Heiko Paulheim
63
3
0
20 Jan 2025
Y-NQ: English-Yorùbá Evaluation dataset for Open-Book Reading Comprehension and Text Generation
Marta R. Costa-jussá
Joy Chen
Ifeoluwanimi Adebara
Joe Chuang
C. Ropers
Eduardo Sánchez
83
0
0
11 Dec 2024
BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages
Sparsh Jain
Ashwin Sankar
Devilal Choudhary
Dhairya Suman
Nikhil Narasimhan
Mohammed Safi Ur Rahman Khan
Anoop Kunchukuttan
Mitesh M. Khapra
Raj Dabre
37
2
0
07 Nov 2024
PersianRAG: A Retrieval-Augmented Generation System for Persian Language
Hossein Hosseini
Mohammad Sobhan Zare
Amir Hossein Mohammadi
Arefeh Kazemi
Zahra Zojaji
Mohammad Ali Nematbakhsh
VLM
RALM
34
0
0
05 Nov 2024
SpeechQE: Estimating the Quality of Direct Speech Translation
HyoJung Han
Kevin Duh
Marine Carpuat
34
0
0
28 Oct 2024
IsoChronoMeter: A simple and effective isochronic translation evaluation metric
Nikolai Rozanov
Vikentiy Pankov
Dmitrii Mukhutdinov
Dima Vypirailenko
24
1
0
14 Oct 2024
Multi-Target Cross-Lingual Summarization: a novel task and a language-neutral approach
Diogo Pernes
Gonçalo M. Correia
Afonso Mendes
16
1
0
01 Oct 2024
Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach
Siqi Li
Danni Liu
Jan Niehues
21
0
0
13 Sep 2024
Advancing Topic Segmentation of Broadcasted Speech with Multilingual Semantic Embeddings
Sakshi Deo Shukla
Pavel Denisov
Tuğtekin Turan
16
0
0
10 Sep 2024
Exploring Retrieval Augmented Generation in Arabic
S. El-Beltagy
Mohamed A. Abdallah
RALM
45
3
0
14 Aug 2024
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages
Carlos Mullov
Ngoc-Quan Pham
Alexander Waibel
27
1
0
05 Aug 2024
In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation
Joel Witzke
Benoît Sagot
Rachel Bawden
36
7
0
01 Aug 2024
Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models
Kenza Benkirane
Laura Gongas
Shahar Pelles
Naomi Fuchs
Joshua Darmon
Pontus Stenetorp
David Ifeoluwa Adelani
Eduardo Sánchez
HILM
38
4
0
23 Jul 2024
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment
Yongxin Huang
Kexin Wang
Goran Glavaš
Iryna Gurevych
44
0
0
20 Jul 2024
Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect
Salima Mdhaffar
Haroun Elleuch
Fethi Bougares
Yannick Esteve
49
0
0
05 Jul 2024
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation
Tiia Sildam
Andra Velve
Tanel Alumäe
33
0
0
04 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation
Rao Ma
Yassir Fathullah
Mengjie Qian
Siyuan Tang
Mark J. F. Gales
Kate Knill
20
1
0
01 Jul 2024
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes
Abdul Waheed
Karima Kadaoui
Bhiksha Raj
Muhammad Abdul-Mageed
32
1
0
01 Jul 2024
Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment
Joseph Liu
Mahesh Kumar Nandwana
Janne Pylkkönen
Hannes Heikinheimo
Morgan McGuire
37
0
0
14 Jun 2024
Diffusion Synthesizer for Efficient Multilingual Speech to Speech Translation
Nameer Hirschkind
Xiao Yu
Mahesh Kumar Nandwana
Joseph Liu
Eloi DuBois
...
Colin Sinclair
Kyle Spence
Charles Shang
Zoë Abrams
Morgan McGuire
30
0
0
14 Jun 2024
Bridging Language Gaps in Audio-Text Retrieval
Zhiyong Yan
Heinrich Dinkel
Yongqing Wang
Jizhong Liu
Junbo Zhang
Yujun Wang
Bin Wang
VLM
27
4
0
11 Jun 2024
OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection
Chenyang Huang
Abbas Ghaddar
I. Kobyzev
Mehdi Rezagholizadeh
Osmar R. Zaiane
Boxing Chen
39
0
0
04 Jun 2024
Improving Multi-lingual Alignment Through Soft Contrastive Learning
Minsu Park
Seyeon Choi
Chanyeol Choi
Junseong Kim
Jy-yong Sohn
16
2
0
25 May 2024
SBAAM! Eliminating Transcript Dependency in Automatic Subtitling
Marco Gaido
Sara Papi
Matteo Negri
Mauro Cettolo
L. Bentivogli
35
1
0
17 May 2024
Language-Independent Representations Improve Zero-Shot Summarization
V. Solovyev
Danni Liu
Jan Niehues
27
0
0
08 Apr 2024
Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems
Frank Palma Gomez
Ramon Sanabria
Yun-hsuan Sung
Daniel Matthew Cer
Siddharth Dalmia
Gustavo Hernández Ábrego
VLM
33
3
0
02 Apr 2024
A New Benchmark for Evaluating Automatic Speech Recognition in the Arabic Call Domain
Qusai Abo Obaidah
Muhy Eddin Za'ter
Adnan Jaljuli
Ali Mahboub
Asma Hakouz
Bashar Alfrou
Yazan Estaitia
18
1
0
07 Mar 2024
Towards Red Teaming in Multimodal and Multilingual Translation
C. Ropers
David Dale
Prangthip Hansanti
Gabriel Mejia Gonzalez
Ivan Evtimov
...
Kristina Pereyra
Seohyun Sonia Kim
Cristian Canton Ferrer
Pierre Yves Andrews
Marta R. Costa-jussá
LRM
28
2
0
29 Jan 2024
The Faiss library
Matthijs Douze
Alexandr Guzhva
Chengqi Deng
Jeff Johnson
Gergely Szilvasy
Pierre-Emmanuel Mazaré
Maria Lomeli
Lucas Hosseini
Hervé Jégou
30
145
0
16 Jan 2024
SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT
Cheol Jun Cho
Abdelrahman Mohamed
Shang-Wen Li
Alan W. Black
Gopala K. Anumanchipalli
29
8
0
16 Oct 2023
Multimodal Modeling For Spoken Language Identification
Shikhar Bharadwaj
Min Ma
Shikhar Vashishth
Ankur Bapna
Sriram Ganapathy
...
Yu Zhang
D. Esch
Sandy Ritchie
Partha P. Talukdar
Jason Riesa
24
0
0
19 Sep 2023
HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation
David Dale
Elena Voita
Janice Lam
Prangthip Hansanti
C. Ropers
Elahe Kalbassi
Cynthia Gao
Loïc Barrault
Marta R. Costa-jussá
HILM
32
27
0
19 May 2023
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
78
282
0
25 May 2022