Language-agnostic BERT Sentence Embedding

3 July 2020

Papers citing "Language-agnostic BERT Sentence Embedding"

50 / 484 papers shown

M3DR: Towards Universal Multilingual Multimodal Document Retrieval

Adithya S Kolavi

Vyoman Jain

270

03 Dec 2025

A Hybrid Classical-Quantum Fine Tuned BERT for Text Classification

Abu Kaisar Mohammad Masum

Naveed Mahmud

M. H. Najafi

Sercan Aygün

145

21 Nov 2025

How Good is BLI as an Alignment Measure: A Study in Word Embedding Paradigm

Kasun Wickramasinghe

Nisansa de Silva

188

17 Nov 2025

Wikipedia-based Datasets in Russian Information Retrieval Benchmark RusBEIR

Grigory Kovalev

Natalia Loukachevitch

M. Tikhomirov

Olga Babina

Pavel Mamaev

159

07 Nov 2025

How to Evaluate Speech Translation with Source-Aware Neural MT Metrics

234

05 Nov 2025

The Riddle of Reflection: Evaluating Reasoning and Self-Awareness in Multilingual LLMs using Indian Riddles

Abhinav P M

Ojasva Saxena

Oswald Christopher

Parameswari Krishnamurthy

LRM

137

02 Nov 2025

FLORA: Unsupervised Knowledge Graph Alignment by Fuzzy Logic

Yiwen Peng

Thomas Bonald

Fabian M. Suchanek

163

23 Oct 2025

Assessing the Political Fairness of Multilingual LLMs: A Case Study based on a 21-way Multiparallel EuroParl Dataset

Paul Lerner

François Yvon

160

23 Oct 2025

Large language models for folktale type automation based on motifs: Cinderella case study

Tjaša Arčon

Marko Robnik-Šikonja

Polona Tratnik

21 Oct 2025

Lingua Custodi's participation at the WMT 2025 Terminology shared task

177

20 Oct 2025

AFRICAPTION: Establishing a New Paradigm for Image Captioning in African Languages

157

20 Oct 2025

BiMax: Bidirectional MaxSim Score for Document-Level Alignment

Xiaotian Wang

T. Utsuro

Masaaki Nagata

146

17 Oct 2025

When Embedding Models Meet: Procrustes Bounds and Applications

Lucas Maystre

Alvaro Ortega Gonzalez

198

15 Oct 2025

A fully automated and scalable Parallel Data Augmentation for Low Resource Languages using Image and Text Analytics

ACM Symposium on Applied Computing (SAC), 2023

101

15 Oct 2025

ACADATA: Parallel Dataset of Academic Data for Machine Translation

Iñaki Lacunza

Javier García Gilabert

Francesca de Luca Fornaciari

Javier Aula-Blasco

Aitor Gonzalez-Agirre

Maite Melero

Marta Villegas

154

14 Oct 2025

Bridging the Semantic Gap: Contrastive Rewards for Multilingual Text-to-SQL with GRPO

183

10 Oct 2025

Mapping Semantic & Syntactic Relationships with Geometric Rotation

Michael Freenor

Lauren Alvarez

LLMSV

244

10 Oct 2025

Multilingual Generative Retrieval via Cross-lingual Semantic Compression

166

09 Oct 2025

Revisiting Metric Reliability for Fine-grained Evaluation of Machine Translation and Summarization in Indian Languages

146

08 Oct 2025

Milco: Learned Sparse Retrieval Across Languages via a Multilingual Connector

179

01 Oct 2025

Aligning LLMs for Multilingual Consistency in Enterprise Applications

Amit Agarwal

Hansa Meghwani

Hitesh Laxmichand Patel

Tao Sheng

Sujith Ravi

Dan Roth

474

28 Sep 2025

ADAM: A Diverse Archive of Mankind for Evaluating and Enhancing LLMs in Biographical Reasoning

134

26 Sep 2025

One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning

194

25 Sep 2025

Align Where the Words Look: Cross-Attention-Guided Patch Alignment with Contrastive and Transport Regularization for Bengali Captioning

Riad Ahmed Anonto

Sardar Md. Saffat Zabin

M. Saifur Rahman

VLM

191

22 Sep 2025

Speech Vecalign: an Embedding-based Method for Aligning Parallel Speech Documents

Chutong Meng

Philipp Koehn

149

22 Sep 2025

The Role of Vocabularies in Learning Sparse Representations for Ranking

Hiun Kim

Tae Kwan Lee

Taeryun Won

231

20 Sep 2025

Efficient and Versatile Model for Multilingual Information Retrieval of Islamic Text: Development and Deployment in Real-World Scenarios

Vera Pavlova

Mohammed Makhlouf

197

18 Sep 2025

Case-Based Decision-Theoretic Decoding with Quality Memories

Hiroyuki Deguchi

Masaaki Nagata

237

16 Sep 2025

A comparison of pipelines for the translation of a low resource language based on transformers

133

15 Sep 2025

SENSE models: an open source solution for multilingual and multimodal semantic-based tasks

199

15 Sep 2025

MTEB-NL and E5-NL: Embedding Benchmark and Models for Dutch

168

15 Sep 2025

A Survey of Long-Document Retrieval in the PLM and LLM Era

260

09 Sep 2025

L3Cube-MahaSTS: A Marathi Sentence Similarity Dataset and Models

Aishwarya Mirashi

Ananya Joshi

Raviraj Joshi

125

29 Aug 2025

Overview of BioASQ 2025: The Thirteenth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering

Conference and Labs of the Evaluation Forum (CLEF), 2025

Miguel Rodríguez-Ortega

...

217

28 Aug 2025

OpenWHO: A Document-Level Parallel Corpus for Health Translation in Low-Resource Languages

348

22 Aug 2025

Filling the Gap for Uzbek: Creating Translation Resources for Southern Uzbek

Mukhammadsaid Mamasaidov

Azizullah Aral

Abror Shopulatov

Mironshoh Inomjonov

140

20 Aug 2025

From SALAMANDRA to SALAMANDRATA: BSC Submission for WMT25 General Machine Translation Shared Task

Javier García Gilabert

...

Miguel Claramunt Argote

Carlos Escolano

Maite Melero

192

18 Aug 2025

SEA-BED: How Do Embedding Models Represent Southeast Asian Languages?

Wuttikorn Ponwitayarat

...

Ekapol Chuangsuwanich

Sarana Nutanong

Peerat Limkonchotiwat

FedML

229

17 Aug 2025

Cross-lingual Aspect-Based Sentiment Analysis: A Survey on Tasks, Approaches, and Challenges

Information Fusion (Inf. Fusion), 2025

Jakub Šmíd

Pavel Král

215

13 Aug 2025

VN-MTEB: Vietnamese Massive Text Embedding Benchmark

349

29 Jul 2025

Annotation-Assisted Learning of Treatment Policies From Multimodal Electronic Health Records

Henri Arno

Thomas Demeester

CML

345

28 Jul 2025

VIBE: Video-Input Brain Encoder for fMRI Response Modeling

Daniel Carlstrom Schad

278

23 Jul 2025

Evaluating Text Style Transfer: A Nine-Language Benchmark for Text Detoxification

161

21 Jul 2025

Causal Language Control in Multilingual Transformers via Sparse Feature Steering

263

17 Jul 2025

Factorized RVQ-GAN For Disentangled Speech Tokenization

...

263

18 Jun 2025

CMU's IWSLT 2025 Simultaneous Speech Translation System

International Workshop on Spoken Language Translation (IWSLT), 2025

286

16 Jun 2025

Factors affecting the in-context learning abilities of LLMs for dialogue state tracking

204

10 Jun 2025

Static Word Embeddings for Sentence Semantic Representation

397

05 Jun 2025

MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Dávid Javorský

Ondrej Bojar

François Yvon

329

05 Jun 2025

Building a Few-Shot Cross-Domain Multilingual NLU Model for Customer Care

European Conference on Artificial Intelligence (ECAI), 2025

188

04 Jun 2025