2303.13809
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models
24 March 2023
Qingyu Lu
Baopu Qiu
Liang Ding
Liping Xie
Tom Kocmi
Dacheng Tao
LRM
ALM
ELM
Papers citing
"Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models"
50 / 69 papers shown
Title
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation
Baban Gain
Dibyanayan Bandyopadhyay
Asif Ekbal
LM&MA
52
0
0
02 Apr 2025
GRP: Goal-Reversed Prompting for Zero-Shot Evaluation with LLMs
Mingyang Song
Mao Zheng
Xuan Luo
LRM
58
0
0
08 Mar 2025
LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts
Helia Hashemi
J. Eisner
Corby Rosset
Benjamin Van Durme
Chris Kedzie
61
1
0
03 Jan 2025
SpeechQE: Estimating the Quality of Direct Speech Translation
HyoJung Han
Kevin Duh
Marine Carpuat
18
0
0
28 Oct 2024
Findings of the WMT 2024 Shared Task on Chat Translation
Wafaa Mohammed
Sweta Agrawal
M. Amin Farajian
Vera Cabarrão
Bryan Eikema
Ana C. Farinha
José G. C. de Souza
19
3
0
15 Oct 2024
Realizing Video Summarization from the Path of Language-based Semantic Understanding
Kuan-Chen Mu
Zhi-Yi Chin
Wei-Chen Chiu
13
0
0
06 Oct 2024
The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests
Lior Madmoni
Amir Zait
Ilia Labzovsky
Danny Karmon
ELM
23
0
0
22 Sep 2024
MegaAgent: A Practical Framework for Autonomous Cooperation in Large-Scale LLM Agent Systems
Qian Wang
Tianyu Wang
Qinbin Li
Jingsheng Liang
Bingsheng He
LLMAG
AIFin
32
6
0
19 Aug 2024
Questionnaires for Everyone: Streamlining Cross-Cultural Questionnaire Adaptation with GPT-Based Translation Quality Evaluation
Otso Haavisto
Robin Welsch
18
0
0
30 Jul 2024
Are Generative Language Models Multicultural? A Study on Hausa Culture and Emotions using ChatGPT
Ibrahim Said Ahmad
Shiran Dudy
R. Ramachandranpillai
Kenneth Church
13
4
0
27 Jun 2024
Themis: Towards Flexible and Interpretable NLG Evaluation
Xinyu Hu
Li Lin
Mingqi Gao
Xunjian Yin
Xiaojun Wan
ELM
25
6
0
26 Jun 2024
MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language
Shun Wang
Ge Zhang
Han Wu
Tyler Loakman
Wenhao Huang
Chenghua Lin
35
2
0
19 Jun 2024
Uncertainty Aware Learning for Language Model Alignment
Yikun Wang
Rui Zheng
Liang Ding
Qi Zhang
Dahua Lin
Dacheng Tao
37
3
0
07 Jun 2024
Revisiting Catastrophic Forgetting in Large Language Model Tuning
Hongyu Li
Liang Ding
Meng Fang
Dacheng Tao
CLL
KELM
32
15
0
07 Jun 2024
Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning
Tianle Xia
Liang Ding
Guojia Wan
Yibing Zhan
Bo Du
Dacheng Tao
LRM
21
0
0
02 May 2024
Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning
Changtong Zan
Liang Ding
Li Shen
Yibing Zhen
Weifeng Liu
Dacheng Tao
36
9
0
21 Mar 2024
From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation
Haofei Zhao
Yilun Liu
Shimin Tao
Weibin Meng
Yimeng Chen
Xiang Geng
Chang Su
Min Zhang
Hao Yang
16
1
0
21 Mar 2024
Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction
Ziyang Xu
Keqin Peng
Liang Ding
Dacheng Tao
Xiliang Lu
25
9
0
15 Mar 2024
Is Context Helpful for Chat Translation Evaluation?
Sweta Agrawal
Amin Farajian
Patrick Fernandes
Ricardo Rei
André F.T. Martins
43
6
0
13 Mar 2024
TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement
Zhaopeng Feng
Yan Zhang
Hao Li
Bei Wu
Jiayu Liao
Wenqiang Liu
Jun Lang
Yang Feng
Jian Wu
Zuozhu Liu
LRM
25
9
0
26 Feb 2024
HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition
Yuxuan Liu
Tianchi Yang
Shaohan Huang
Zihan Zhang
Haizhen Huang
Furu Wei
Weiwei Deng
Feng Sun
Qi Zhang
18
12
0
24 Feb 2024
Leveraging Large Language Models for Concept Graph Recovery and Question Answering in NLP Education
Rui Yang
Boming Yang
Sixun Ouyang
Tianwei She
Aosong Feng
Yuang Jiang
Freddy Lecue
Jinghui Lu
Irene Z Li
AI4Ed
16
5
0
22 Feb 2024
Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation
Zhiyao Ren
Yibing Zhan
Baosheng Yu
Liang Ding
Dacheng Tao
LM&MA
24
12
0
20 Feb 2024
Revisiting Knowledge Distillation for Autoregressive Language Models
Qihuang Zhong
Liang Ding
Li Shen
Juhua Liu
Bo Du
Dacheng Tao
KELM
26
15
0
19 Feb 2024
ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
LM&MA
26
22
0
19 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
16
152
0
06 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
50
28
0
02 Feb 2024
Revisiting Demonstration Selection Strategies in In-Context Learning
Keqin Peng
Liang Ding
Yancheng Yuan
Xuebo Liu
Min Zhang
Y. Ouyang
Dacheng Tao
17
20
0
22 Jan 2024
Gender Bias in Machine Translation and The Era of Large Language Models
Eva Vanmassenhove
AILaw
11
1
0
18 Jan 2024
Gradable ChatGPT Translation Evaluation
Hui Jiao
Bei Peng
Lu Zong
Xiaojun Zhang
Xinwei Li
20
1
0
18 Jan 2024
Machine Translation with Large Language Models: Prompt Engineering for Persian, English, and Russian Directions
Nooshin Pourkamali
Shler Ebrahim Sharifi
LRM
39
9
0
16 Jan 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA
ELM
23
9
0
13 Jan 2024
OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models
Shuai Wang
Liang Ding
Li Shen
Yong Luo
Bo Du
Dacheng Tao
ELM
ALM
23
2
0
12 Jan 2024
Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation
Xu Huang
Zhirui Zhang
Xiang Geng
Yichao Du
Jiajun Chen
Shujian Huang
32
7
0
12 Jan 2024
Convergences and Divergences between Automatic Assessment and Human Evaluation: Insights from Comparing ChatGPT-Generated Translation and Neural Machine Translation
Zhaokun Jiang
Ziyin Zhang
EGVM
8
3
0
10 Jan 2024
Can ChatGPT be Your Personal Medical Assistant?
Md. Rafiul Biswas
Ashhadul Islam
Zubair Shah
Wajdi Zaghouani
S. Belhaouari
AI4MH
LM&MA
ELM
8
3
0
19 Dec 2023
Distinguishing Translations by Human, NMT, and ChatGPT: A Linguistic and Statistical Approach
Zhaokun Jiang
Qianxi Lv
Ziyin Zhang
6
1
0
17 Dec 2023
ACES: Translation Accuracy Challenge Sets at WMT 2023
Chantal Amrhein
Nikita Moghe
Liane Guillou
ELM
9
3
0
02 Nov 2023
The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics
Christoph Leiter
Juri Opitz
Daniel Deutsch
Yang Gao
Rotem Dror
Steffen Eger
ALM
LRM
ELM
19
31
0
30 Oct 2023
GEMBA-MQM: Detecting Translation Quality Error Spans with GPT-4
Tom Kocmi
C. Federmann
13
72
0
21 Oct 2023
Well Begun is Half Done: Generator-agnostic Knowledge Pre-Selection for Knowledge-Grounded Dialogue
Lang Qin
Yao Zhang
Hongru Liang
Jun Wang
Zhenglu Yang
6
2
0
11 Oct 2023
TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks
Dongfu Jiang
Yishan Li
Ge Zhang
Wenhao Huang
Bill Yuchen Lin
Wenhu Chen
ALM
29
57
0
01 Oct 2023
SocREval: Large Language Models with the Socratic Method for Reference-Free Reasoning Evaluation
Hangfeng He
Hongming Zhang
Dan Roth
LRM
ELM
ReLM
23
5
0
29 Sep 2023
Calibrating LLM-Based Evaluator
Yuxuan Liu
Tianchi Yang
Shaohan Huang
Zihan Zhang
Haizhen Huang
Furu Wei
Weiwei Deng
Feng Sun
Qi Zhang
25
31
0
23 Sep 2023
HANS, are you clever? Clever Hans Effect Analysis of Neural Systems
Leonardo Ranaldi
Fabio Massimo Zanzotto
13
1
0
21 Sep 2023
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Qingyue Wang
Y. Fu
Yanan Cao
Zhiliang Tian
Shi Wang
Dacheng Tao
LLMAG
KELM
RALM
39
22
0
29 Aug 2023
Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?
Fei-Yue Wang
Liang Ding
Jun Rao
Ye Liu
Li Shen
Changxing Ding
12
15
0
24 Aug 2023
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
Patrick Fernandes
Daniel Deutsch
M. Finkelstein
Parker Riley
André F. T. Martins
Graham Neubig
Ankush Garg
J. Clark
Markus Freitag
Orhan Firat
LRM
29
39
0
14 Aug 2023
AutoPCF: Efficient Product Carbon Footprint Accounting with Large Language Models
Z. Deng
Jinjie Liu
Biao Luo
Can Yuan
Qingrun Yang
Lei Xiao
Wenwen Zhou
Zhuiguo Liu
6
2
0
08 Aug 2023
TARJAMAT: Evaluation of Bard and ChatGPT on Machine Translation of Ten Arabic Varieties
Karima Kadaoui
Samar Magdy
Abdul Waheed
Md. Tawkat Islam Khondaker
Ahmed Oumar El-Shangiti
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
14
20
0
06 Aug 2023