Language-agnostic BERT Sentence Embedding

3 July 2020
Fangxiaoyu Feng
Yinfei Yang
Daniel Matthew Cer
N. Arivazhagan
Wei Wang

Papers citing "Language-agnostic BERT Sentence Embedding"

50 / 75 papers shown
Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data
Maxime Bouthors
Josep Crego
François Yvon
RALM
LRM
44
0
0
30 Apr 2025
ALF: Advertiser Large Foundation Model for Multi-Modal Advertiser Understanding
Santosh Rajagopalan
Jonathan Vronsky
Songbai Yan
S. Alireza Golestaneh
Shubhra Chandra
Min Zhou
66
0
0
26 Apr 2025
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training
Zhijun Wang
Jiahuan Li
Hao Zhou
Rongxiang Weng
J. Wang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Shujian Huang
LRM
48
1
0
02 Apr 2025
High-Dimensional Interlingual Representations of Large Language Models
Bryan Wilie
Samuel Cahyawijaya
Junxian He
Pascale Fung
50
0
0
14 Mar 2025
Large Engagement Networks for Classifying Coordinated Campaigns and Organic Twitter Trends
Atul Anand Gopalakrishnan
Jakir Hossain
Tugrulcan Elmas
Ahmet Erdem Sariyuce
GNN
36
0
0
01 Mar 2025
Unsupervised Entity Alignment Based on Personalized Discriminative Rooted Tree
Yaming Yang
Zhe Wang
Ziyu Guan
Wei Zhao
Xinyan Huang
Xiaofei He
59
0
0
14 Feb 2025
News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation
Andreea Iana
Fabian David Schmidt
Goran Glavaš
Heiko Paulheim
63
3
0
20 Jan 2025
AIMA at SemEval-2024 Task 10: History-Based Emotion Recognition in Hindi-English Code-Mixed Conversations
Mohammad Mahdi Abootorabi
Nona Ghazizadeh
Seyed Arshan Dalili
Alireza Ghahramani Kure
Mahshid Dehghani
Ehsaneddin Asgari
33
2
0
19 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
11
0
31 Dec 2024
Cogs in a Machine, Doing What They're Meant to Do -- The AMI Submission to the WMT24 General Translation Task
Atli Jasonarson
Hinrik Hafsteinsson
Bjarki Ármannsson
Steinþór Steingrímsson
SyDa
27
2
0
04 Oct 2024
Mitigating Semantic Leakage in Cross-lingual Embeddings via Orthogonality Constraint
Dayeon Ki
Cheonbok Park
H. Kim
FedML
21
0
0
24 Sep 2024
Leveraging Entailment Judgements in Cross-Lingual Summarisation
Huajian Zhang
Laura Perez-Beltrachini
HILM
29
0
0
01 Aug 2024
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment
Yongxin Huang
Kexin Wang
Goran Glavaš
Iryna Gurevych
44
0
0
20 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation
Rao Ma
Yassir Fathullah
Mengjie Qian
Siyuan Tang
Mark J. F. Gales
Kate Knill
18
1
0
01 Jul 2024
Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs
Tamzeed Mahfuz
Satak Kumar Dey
Ruwad Naswan
Hasnaen Adil
Khondker Salman Sayeed
Haz Sameen Shahgir
29
0
0
29 Jun 2024
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
K. Enevoldsen
Márton Kardos
Niklas Muennighoff
Kristoffer Laigaard Nielbo
24
9
0
04 Jun 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
29
4
0
29 May 2024
Neural Semantic Parsing with Extremely Rich Symbolic Meaning Representations
Xiao Zhang
Gosse Bouma
Johan Bos
NAI
17
0
0
19 Apr 2024
High-Dimension Human Value Representation in Large Language Models
Samuel Cahyawijaya
Delong Chen
Yejin Bang
Leila Khalatbari
Bryan Wilie
Ziwei Ji
Etsuko Ishii
Pascale Fung
63
5
0
11 Apr 2024
Deep Learning-based Computational Job Market Analysis: A Survey on Skill Extraction and Classification from Job Postings
Elena Senger
Mike Zhang
Rob van der Goot
Barbara Plank
21
7
0
08 Feb 2024
SMUTF: Schema Matching Using Generative Tags and Hybrid Features
Yu Zhang
Mei Di
Haozheng Luo
Chenwei Xu
Richard Tzong-Han Tsai
52
7
0
22 Jan 2024
APE-then-QE: Correcting then Filtering Pseudo Parallel Corpora for MT Training Data Creation
Akshay Batheja
S. Deoghare
Diptesh Kanojia
Pushpak Bhattacharyya
4
0
0
18 Dec 2023
From Generalized Laughter to Personalized Chuckles: Unleashing the Power of Data Fusion in Subjective Humor Detection
Julita Bielaniewicz
Przemyslaw Kazienko
FedML
17
0
0
18 Dec 2023
KhabarChin: Automatic Detection of Important News in the Persian Language
Hamed Hematian Hemati
Arash Lagzian
M. S. Sartakhti
Hamid Beigy
Ehsaneddin Asgari
13
1
0
06 Dec 2023
RETSim: Resilient and Efficient Text Similarity
Marina Zhang
Owen Vallis
Aysegul Bumin
Tanay Vakharia
Elie Bursztein
10
1
0
28 Nov 2023
SentAlign: Accurate and Scalable Sentence Alignment
Steinþór Steingrímsson
H. Loftsson
Andy Way
10
7
0
15 Nov 2023
Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval
Nandan Thakur
Jianmo Ni
Gustavo Hernández Ábrego
John Wieting
Jimmy J. Lin
Daniel Matthew Cer
RALM
21
12
0
10 Nov 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
P. Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
41
0
0
10 Oct 2023
LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation
Zixi Zhang
Greg Chadwick
Hugo McNally
Yiren Zhao
Robert D. Mullins
Jianyi Cheng
24
18
0
06 Oct 2023
FRASIMED: a Clinical French Annotated Resource Produced through Crosslingual BERT-Based Annotation Projection
Jamil Zaghir
Mina Bjelogrlic
J. Goldman
Soukaina Aananou
C. Gaudet-Blavignac
Christian Lovis
11
0
0
19 Sep 2023
Unsupervised Deep Cross-Language Entity Alignment
Chuanyu Jiang
Yiming Qian
Lijun Chen
Yang Gu
Xia Xie
8
5
0
19 Sep 2023
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Songbo Hu
Han Zhou
Mete Hergul
Milan Gritta
Guchun Zhang
Ignacio Iacobacci
Ivan Vulić
Anna Korhonen
24
10
0
26 Jul 2023
HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation
Cihan Xiao
Henry Li Xinyuan
Jinyi Yang
Dongji Gao
Matthew Wiesner
Kevin Duh
Sanjeev Khudanpur
27
1
0
20 Jun 2023
A Comprehensive Survey of Sentence Representations: From the BERT Epoch to the ChatGPT Era and Beyond
Abhinav Ramesh Kashyap
Thang-Tung Nguyen
Viktor Schlegel
Stefan Winkler
See-Kiong Ng
Soujanya Poria
AI4TS
3DV
SSL
29
6
0
22 May 2023
ReSeTOX: Re-learning attention weights for toxicity mitigation in machine translation
Javier García Gilabert
Carlos Escolano
Marta R. Costa-jussá
CLL
MU
14
2
0
19 May 2023
A Fused Gromov-Wasserstein Framework for Unsupervised Knowledge Graph Entity Alignment
Jianheng Tang
Kangfei Zhao
Jia Li
27
26
0
11 May 2023
Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?
Sonal Sannigrahi
Josef van Genabith
C. España-Bonet
AILaw
18
4
0
28 Apr 2023
Hallucinations in Large Multilingual Translation Models
Nuno M. Guerreiro
Duarte M. Alves
Jonas Waldendorf
Barry Haddow
Alexandra Birch
Pierre Colombo
André F.T. Martins
VLM
HILM
LRM
13
139
0
28 Mar 2023
Rediscovering Hashed Random Projections for Efficient Quantization of Contextualized Sentence Embeddings
Ulf A. Hamster
Ji-Ung Lee
Alexander Geyken
Iryna Gurevych
16
0
0
13 Mar 2023
Few-shot Multimodal Multitask Multilingual Learning
Aman Chadha
Vinija Jain
34
0
0
19 Feb 2023
Zero and Few-Shot Localization of Task-Oriented Dialogue Agents with a Distilled Representation
M. Moradshahi
Sina J. Semnani
M. Lam
21
7
0
18 Feb 2023
Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval
Shunyu Zhang
Yaobo Liang
Ming Gong
Daxin Jiang
Nan Duan
16
4
0
03 Feb 2023
Improving Machine Translation with Phrase Pair Injection and Corpus Filtering
Akshay Batheja
P. Bhattacharyya
17
15
0
19 Jan 2023
MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue
Nikita Moghe
E. Razumovskaia
Liane Guillou
Ivan Vulić
Anna Korhonen
Alexandra Birch
19
13
0
20 Dec 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
21
34
0
08 Nov 2022
Very Low Resource Sentence Alignment: Luhya and Swahili
E. Chimoto
Bruce A. Bassett
CVBM
11
10
0
31 Oct 2022
The University of Edinburgh's Submission to the WMT22 Code-Mixing Shared Task (MixMT)
Faheem Kirefu
Vivek Iyer
Pinzhen Chen
Laurie Burchell
MoE
21
1
0
20 Oct 2022
AugCSE: Contrastive Sentence Embedding with Diverse Augmentations
Zilu Tang
Muhammed Yusuf Kocyigit
Derry Wijaya
18
8
0
20 Oct 2022
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval
Andrew Rouditchenko
Yung-Sung Chuang
Nina Shvetsova
Samuel Thomas
Rogerio Feris
Brian Kingsbury
Leonid Karlinsky
David F. Harwath
Hilde Kuehne
James R. Glass
VLM
21
4
0
07 Oct 2022
The first neural machine translation system for the Erzya language
David Dale
52
7
0
19 Sep 2022