2107.10821
Cited By
To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation
22 July 2021
Tom Kocmi
C. Federmann
Roman Grundkiewicz
Marcin Junczys-Dowmunt
Hitokazu Matsushita
Arul Menezes
Papers citing
"To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation"
50 / 127 papers shown
Title
xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection
Nuno M. Guerreiro
Ricardo Rei
Daan van Stigt
Luísa Coheur
Pierre Colombo
André F.T. Martins
40
111
0
16 Oct 2023
SLIDE: Reference-free Evaluation for Machine Translation using a Sliding Document Window
Vikas Raunak
Tom Kocmi
Matt Post
25
6
0
16 Sep 2023
Training and Meta-Evaluating Machine Translation Evaluation Metrics at the Paragraph Level
Daniel Deutsch
Juraj Juraska
M. Finkelstein
Markus Freitag
41
11
0
25 Aug 2023
Efficient Benchmarking of Language Models
Yotam Perlitz
Elron Bandel
Ariel Gera
Ofir Arviv
L. Ein-Dor
Eyal Shnarch
Noam Slonim
Michal Shmueli-Scheuer
Leshem Choshen
ALM
11
24
0
22 Aug 2023
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
Patrick Fernandes
Daniel Deutsch
M. Finkelstein
Parker Riley
André F. T. Martins
Graham Neubig
Ankush Garg
J. Clark
Markus Freitag
Orhan Firat
LRM
34
66
0
14 Aug 2023
Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation
Xianfeng Zeng
Yanjun Liu
Fandong Meng
Jie Zhou
19
0
0
06 Aug 2023
BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training
Yiming Yan
Tao Wang
Chengqi Zhao
Shujian Huang
Jiajun Chen
Mingxuan Wang
19
22
0
06 Jul 2023
Benchmarking Large Language Model Capabilities for Conditional Generation
Joshua Maynez
Priyanka Agrawal
Sebastian Gehrmann
ELM
LM&MA
25
28
0
29 Jun 2023
xSIM++: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages
Mingda Chen
Kevin Heffernan
Onur Çelebi
Alexandre Mourachko
Holger Schwenk
18
3
0
22 Jun 2023
Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment
Hao-Yu Yang
Min Zhang
Shimin Tao
Minghan Wang
Daimeng Wei
Yanfei Jiang
LRM
15
10
0
13 Jun 2023
Good, but not always Fair: An Evaluation of Gender Bias for three commercial Machine Translation Systems
Silvia Alma Piazzolla
Beatrice Savoldi
L. Bentivogli
28
2
0
09 Jun 2023
Correction of Errors in Preference Ratings from Automated Metrics for Text Generation
Jan Deriu
Pius von Däniken
Don Tuggener
Mark Cieliebak
19
2
0
06 Jun 2023
Breeding Machine Translations: Evolutionary approach to survive and thrive in the world of automated evaluation
Josef Jon
Ondřej Bojar
15
10
0
30 May 2023
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References
Tianyi Tang
Hongyuan Lu
Yuchen Eleanor Jiang
Haoyang Huang
Dongdong Zhang
Wayne Xin Zhao
Tom Kocmi
Furu Wei
15
4
0
24 May 2023
Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration
Daniel Deutsch
George F. Foster
Markus Freitag
19
41
0
23 May 2023
When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale
Christos Baziotis
Biao Zhang
Alexandra Birch
Barry Haddow
30
2
0
23 May 2023
Syntactic Knowledge via Graph Attention with BERT in Machine Translation
Yuqian Dai
S. Sharoff
M. Kamps
16
1
0
22 May 2023
SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation
Elizabeth Clark
Shruti Rijhwani
Sebastian Gehrmann
Joshua Maynez
Roee Aharoni
Vitaly Nikolaev
Thibault Sellam
Aditya Siddhant
Dipanjan Das
Ankur P. Parikh
14
38
0
22 May 2023
Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation
Wen Lai
Alexandra Chronopoulou
Alexander M. Fraser
30
4
0
22 May 2023
Pseudo-Label Training and Model Inertia in Neural Machine Translation
B. Hsu
Anna Currey
Xing Niu
Maria Nădejde
Georgiana Dinu
ODL
23
2
0
19 May 2023
The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics
Ricardo Rei
Nuno M. Guerreiro
Marcos Vinícius Treviso
Luísa Coheur
A. Lavie
André F.T. Martins
27
15
0
19 May 2023
What's the Meaning of Superhuman Performance in Today's NLU?
Simone Tedeschi
Johan Bos
T. Declerck
Jan Hajic
Daniel Hershcovich
...
Simon Krek
Steven Schockaert
Rico Sennrich
Ekaterina Shutova
Roberto Navigli
ELM
LM&MA
VLM
ReLM
LRM
24
26
0
15 May 2023
Exploring Human-Like Translation Strategy with Large Language Models
Zhiwei He
Tian Liang
Wenxiang Jiao
Zhuosheng Zhang
Yujiu Yang
Rui Wang
Zhaopeng Tu
Shuming Shi
Xing Wang
24
39
0
06 May 2023
SLTUNET: A Simple Unified Model for Sign Language Translation
Biao Zhang
Mathias Müller
Rico Sennrich
SLR
40
33
0
02 May 2023
ICE-Score: Instructing Large Language Models to Evaluate Code
Terry Yue Zhuo
ELM
ALM
39
38
0
27 Apr 2023
Multidimensional Evaluation for Text Style Transfer Using ChatGPT
Huiyuan Lai
Antonio Toral
Malvina Nissim
18
17
0
26 Apr 2023
Summary of ChatGPT-Related Research and Perspective Towards the Future of Large Language Models
Yi-Hsien Liu
Tianle Han
Siyuan Ma
Jia-Yu Zhang
Yuanyu Yang
...
Xiang Li
Ning Qiang
Dingang Shen
Tianming Liu
Bao Ge
ALM
ELM
AI4CE
LM&MA
LLMAG
26
458
0
04 Apr 2023
Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation
Alex Jones
Isaac Caswell
Ishan Saxena
Orhan Firat
21
8
0
27 Mar 2023
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models
Qingyu Lu
Baopu Qiu
Liang Ding
Liping Xie
Tom Kocmi
Dacheng Tao
LRM
ALM
ELM
19
106
0
24 Mar 2023
Large Language Models Are State-of-the-Art Evaluators of Translation Quality
Tom Kocmi
C. Federmann
ELM
37
331
0
28 Feb 2023
Toward a Theory of Causation for Interpreting Neural Code Models
David Nader-Palacio
Alejandro Velasco
Nathan Cooper
Á. Rodríguez
Kevin Moran
Denys Poshyvanyk
13
16
0
07 Feb 2023
Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages
Simeng Sun
Maha Elbayad
Anna Y. Sun
James Cross
CLL
LRM
24
3
0
07 Feb 2023
The unreasonable effectiveness of few-shot learning for machine translation
Xavier Garcia
Yamini Bansal
Colin Cherry
George F. Foster
M. Krikun
Fan Feng
Melvin Johnson
Orhan Firat
27
102
0
02 Feb 2023
Poor Man's Quality Estimation: Predicting Reference-Based MT Metrics Without the Reference
Vilém Zouhar
S. Dhuliawala
Wangchunshu Zhou
Nico Daheim
Tom Kocmi
Yuchen Eleanor Jiang
Mrinmaya Sachan
16
9
0
21 Jan 2023
Extrinsic Evaluation of Machine Translation Metrics
Nikita Moghe
Tom Sherborne
Mark Steedman
Alexandra Birch
ELM
11
17
0
20 Dec 2022
IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages
Ananya B. Sai
Vignesh Nagarajan
Tanay Dixit
Raj Dabre
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
39
21
0
20 Dec 2022
LENS: A Learnable Evaluation Metric for Text Simplification
Mounica Maddela
Yao Dou
David Heineman
Wei-ping Xu
27
62
0
19 Dec 2022
Detecting and Mitigating Hallucinations in Machine Translation: Model Internal Workings Alone Do Well, Sentence Similarity Even Better
David Dale
Elena Voita
Loïc Barrault
Marta R. Costa-jussà
HILM
27
67
0
16 Dec 2022
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
Mingda Chen
Paul-Ambroise Duquenne
Pierre Yves Andrews
Justine T. Kao
Alexandre Mourachko
Holger Schwenk
Marta R. Costa-jussà
14
17
0
16 Dec 2022
DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding
Jianhao Yan
Jin Xu
Fandong Meng
Jie Zhou
Yue Zhang
16
3
0
08 Dec 2022
Considerations for meaningful sign language machine translation based on glosses
Mathias Müller
Zifan Jiang
Amit Moryossef
Annette Rios Gonzales
Sarah Ebling
SLR
22
37
0
28 Nov 2022
Prompting PaLM for Translation: Assessing Strategies and Performance
David Vilar
Markus Freitag
Colin Cherry
Jiaming Luo
Viresh Ratnakar
George F. Foster
LRM
19
152
0
16 Nov 2022
ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics
Chantal Amrhein
Nikita Moghe
Liane Guillou
ELM
23
22
0
27 Oct 2022
DEMETR: Diagnosing Evaluation Metrics for Translation
Marzena Karpinska
N. Raj
Katherine Thai
Yixiao Song
Ankita Gupta
Mohit Iyyer
21
36
0
25 Oct 2022
Searching for a higher power in the human evaluation of MT
Johnny Tian-Zheng Wei
Tom Kocmi
C. Federmann
6
6
0
20 Oct 2022
Exploring Segmentation Approaches for Neural Machine Translation of Code-Switched Egyptian Arabic-English Text
Marwa Gaser
Manuel Mager
Injy Hamed
Nizar Habash
Slim Abdennadher
Ngoc Thang Vu
23
6
0
11 Oct 2022
From Zero to Production: Baltic-Ukrainian Machine Translation Systems to Aid Refugees
Toms Bergmanis
Marcis Pinnis
12
1
0
28 Sep 2022
Embarrassingly Easy Document-Level MT Metrics: How to Convert Any Pretrained Metric Into a Document-Level Metric
Giorgos Vernikos
Brian Thompson
Prashant Mathur
Marcello Federico
36
40
0
27 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
14
2
0
16 Sep 2022
Rethinking Round-Trip Translation for Machine Translation Evaluation
Terry Yue Zhuo
Qiongkai Xu
Xuanli He
Trevor Cohn
LRM
22
2
0
15 Sep 2022