Effective Parallel Corpus Mining using Bilingual Sentence Embeddings

31 July 2018
Mandy Guo
Qinlan Shen
Yinfei Yang
Heming Ge
Daniel Matthew Cer
Gustavo Hernández Ábrego
K. Stevens
Noah Constant
Yun-hsuan Sung
B. Strope
R. Kurzweil

Papers citing "Effective Parallel Corpus Mining using Bilingual Sentence Embeddings"

50 / 64 papers shown
MEXMA: Token-level objectives improve sentence representations
Joao Maria Janeiro
Benjamin Piwowarski
Patrick Gallinari
Loïc Barrault
26
1
0
19 Sep 2024
Learning Job Title Representation from Job Description Aggregation Network
Napat Laosaengpha
Thanit Tativannarat
Chawan Piansaddhayanon
Attapol Rutherford
E. Chuangsuwanich
27
1
0
12 Jun 2024
MINERS: Multilingual Language Models as Semantic Retrievers
Genta Indra Winata
Ruochen Zhang
David Ifeoluwa Adelani
RALM
47
5
0
11 Jun 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
39
4
0
29 May 2024
Enhancing Cross-lingual Sentence Embedding for Low-resource Languages with Word Alignment
Zhongtao Miao
Qiyu Wu
Kaiyan Zhao
Zilong Wu
Yoshimasa Tsuruoka
28
9
0
03 Apr 2024
Does Negative Sampling Matter? A Review with Insights into its Theory and Applications
Zhen Yang
Ming Ding
Tinglin Huang
Yukuo Cen
Junshuai Song
Bin Xu
Yuxiao Dong
Jie Tang
30
9
0
27 Feb 2024
EMMA-X: An EM-like Multilingual Pre-training Algorithm for Cross-lingual Representation Learning
Ping Guo
Xiangpeng Wei
Yue Hu
Baosong Yang
Dayiheng Liu
Fei Huang
Jun Xie
21
2
0
26 Oct 2023
The Effect of Alignment Objectives on Code-Switching Translation
Mohamed Anwar
13
1
0
10 Sep 2023
Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization
Pengzhi Gao
Liwen Zhang
Zhongjun He
Hua-Hong Wu
Haifeng Wang
21
5
0
12 Jun 2023
Dual-Alignment Pre-training for Cross-lingual Sentence Embedding
Ziheng Li
Shaohan Huang
Zi-qiang Zhang
Zhi-Hong Deng
Qiang Lou
Haizhen Huang
Jian Jiao
Furu Wei
Weiwei Deng
Qi Zhang
28
10
0
16 May 2023
LEALLA: Learning Lightweight Language-agnostic Sentence Embeddings with Knowledge Distillation
Zhuoyuan Mao
Tetsuji Nakagawa
FedML
16
19
0
16 Feb 2023
Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval
John Wieting
J. Clark
William W. Cohen
Graham Neubig
Taylor Berg-Kirkpatrick
16
6
0
21 Dec 2022
Very Low Resource Sentence Alignment: Luhya and Swahili
E. Chimoto
Bruce A. Bassett
CVBM
11
10
0
31 Oct 2022
Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation
Deng Cai
Xin Li
Jackie Chun-Sing Ho
Lidong Bing
W. Lam
18
6
0
18 Oct 2022
EMS: Efficient and Effective Massively Multilingual Sentence Embedding Learning
Zhuoyuan Mao
Chenhui Chu
Sadao Kurohashi
37
1
0
31 May 2022
Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages
Kevin Heffernan
Onur Çelebi
Holger Schwenk
25
53
0
25 May 2022
All You May Need for VQA are Image Captions
Soravit Changpinyo
Doron Kukliansky
Idan Szpektor
Xi Chen
Nan Ding
Radu Soricut
32
70
0
04 May 2022
How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language
Shiyue Zhang
B. Frey
Mohit Bansal
9
35
0
25 Apr 2022
MuCoT: Multilingual Contrastive Training for Question-Answering in Low-resource Languages
Gokul Karthik Kumar
Abhishek Singh Gehlot
Sahal Shaji Mullappilly
Karthik Nandakumar
26
13
0
12 Apr 2022
Quality Controlled Paraphrase Generation
Elron Bandel
R. Aharonov
Michal Shmueli-Scheuer
Ilya Shnayderman
Noam Slonim
L. Ein-Dor
19
38
0
21 Mar 2022
USCORE: An Effective Approach to Fully Unsupervised Evaluation Metrics for Machine Translation
Jonas Belouadi
Steffen Eger
31
20
0
21 Feb 2022
Improve Sentence Alignment by Divide-and-conquer
Wu Zhang
11
0
0
18 Jan 2022
On Cross-Lingual Retrieval with Multilingual Text Encoders
Robert Litschko
Ivan Vulić
Simone Paolo Ponzetto
Goran Glavaš
11
38
0
21 Dec 2021
CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs
Abhik Bhattacharjee
Tahmid Hasan
Wasi Uddin Ahmad
Yuan-Fang Li
Yong-Bin Kang
Rifat Shahriyar
RALM
ELM
34
37
0
16 Dec 2021
Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction
Shubhanshu Mishra
A. Haghighi
VLM
18
4
0
20 Oct 2021
Secoco: Self-Correcting Encoding for Neural Machine Translation
Tao Wang
Chengqi Zhao
Mingxuan Wang
Lei Li
Hang Li
Deyi Xiong
VLM
17
3
0
27 Aug 2021
Neural Machine Translation for Low-Resource Languages: A Survey
Surangika Ranathunga
E. Lee
Marjana Prifti Skenduli
Ravi Shekhar
Mehreen Alam
Rishemjit Kaur
27
234
0
29 Jun 2021
LAWDR: Language-Agnostic Weighted Document Representations from Pre-trained Models
Hongyu Gong
Vishrav Chaudhary
Yuqing Tang
Francisco Guzmán
19
3
0
07 Jun 2021
Lightweight Cross-Lingual Sentence Representation Learning
Zhuoyuan Mao
Prakhar Gupta
Pei Wang
Chenhui Chu
Martin Jaggi
Sadao Kurohashi
VLM
16
8
0
28 May 2021
Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining
Ivana Kvapilíková
Mikel Artetxe
Gorka Labaka
Eneko Agirre
Ondrej Bojar
SSL
11
36
0
21 May 2021
Paraphrastic Representations at Scale
John Wieting
Kevin Gimpel
Graham Neubig
Taylor Berg-Kirkpatrick
19
19
0
30 Apr 2021
"Wikily" Supervised Neural Translation Tailored to Cross-Lingual Tasks
Mohammad Sadegh Rasooli
Chris Callison-Burch
Derry Wijaya
CLIP
13
5
0
16 Apr 2021
Low-Resource Machine Translation Training Curriculum Fit for Low-Resource Languages
Garry Kuwanto
Afra Feyza Akyürek
Isidora Chara Tourni
Siyang Li
Alex Jones
Derry Wijaya
11
5
0
24 Mar 2021
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval
Gregor Geigle
Jonas Pfeiffer
Nils Reimers
Ivan Vulić
Iryna Gurevych
27
59
0
22 Mar 2021
Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval
Robert Litschko
Ivan Vulić
Simone Paolo Ponzetto
Goran Glavaš
13
23
0
21 Jan 2021
Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
Freda Shi
Luke Zettlemoyer
Sida I. Wang
SSL
24
32
0
01 Jan 2021
Neural Passage Retrieval with Improved Negative Contrast
Jing Lu
Gustavo Hernández Ábrego
Ji Ma
Jianmo Ni
Yinfei Yang
21
25
0
23 Oct 2020
Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings
Phillip Keung
Julian Salazar
Y. Lu
Noah A. Smith
SSL
25
25
0
15 Oct 2020
Semantic Label Smoothing for Sequence to Sequence Problems
Michal Lukasik
Himanshu Jain
A. Menon
Seungyeon Kim
Srinadh Bhojanapalli
Felix X. Yu
Sanjiv Kumar
AI4TS
12
18
0
15 Oct 2020
ComStreamClust: a communicative multi-agent approach to text clustering in streaming data
Ali Najafi
Araz Gholipour-Shilabin
Rahim Dehkharghani
Ali Mohammadpur-Fard
M. Asgari-Chenaghlu
8
1
0
11 Oct 2020
Multilevel Text Alignment with Cross-Document Attention
Xuhui Zhou
Nikolaos Pappas
Noah A. Smith
22
18
0
03 Oct 2020
Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation
Yinfei Yang
Ning Jin
Kuo Lin
Mandy Guo
Daniel Matthew Cer
11
31
0
29 Sep 2020
Language-agnostic BERT Sentence Embedding
Fangxiaoyu Feng
Yinfei Yang
Daniel Matthew Cer
N. Arivazhagan
Wei Wang
14
869
0
03 Jul 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
28
72
0
16 Jun 2020
MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models
Mandy Guo
Yinfei Yang
Daniel Matthew Cer
Qinlan Shen
Noah Constant
LRM
20
46
0
05 May 2020
Exploiting Sentence Order in Document Alignment
Brian Thompson
Philipp Koehn
14
19
0
30 Apr 2020
Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
Nils Reimers
Iryna Gurevych
20
994
0
21 Apr 2020
Contextual Lensing of Universal Sentence Representations
J. Kiros
8
5
0
20 Feb 2020
Massively Multilingual Document Alignment with Cross-lingual Sentence-Mover's Distance
Ahmed El-Kishky
Francisco Guzmán
13
15
0
31 Jan 2020
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
Holger Schwenk
Guillaume Wenzek
Sergey Edunov
Edouard Grave
Armand Joulin
25
254
0
10 Nov 2019