Large Language Models Are State-of-the-Art Evaluators of Translation Quality

28 February 2023
Tom Kocmi
C. Federmann
    ELM
arXiv: 2302.14520

Papers citing "Large Language Models Are State-of-the-Art Evaluators of Translation Quality"

41 / 41 papers shown
Same evaluation, more tokens: On the effect of input length for machine translation evaluation using Large Language Models
Tobias Domhan
Dawei Zhu
24
0
0
03 May 2025
LLM Sensitivity Evaluation Framework for Clinical Diagnosis
Chenwei Yan
Xiangling Fu
Yuxuan Xiong
Tianyi Wang
Siu Cheung Hui
Ji Wu
Xien Liu
LM&MA
ELM
32
0
0
18 Apr 2025
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs
Jiliang Ni
Jiachen Pu
Zhongyi Yang
Kun Zhou
Hui Wang
Xiaoliang Xiao
Dakui Wang
Xin Li
Jingfeng Luo
Conggang Hu
32
0
0
18 Apr 2025
LLM-as-a-Judge: Reassessing the Performance of LLMs in Extractive QA
Xanh Ho
Jiahao Huang
Florian Boudin
Akiko Aizawa
ELM
29
0
0
16 Apr 2025
Regional Tiny Stories: Using Small Models to Compare Language Learning and Tokenizer Performance
Nirvan Patil
Malhar Abhay Inamdar
Agnivo Gosai
Guruprasad Pathak
Anish Joshi
Aryan Sagavekar
Anish Joshirao
Raj Abhijit Dandekar
Rajat Dandekar
Sreedath Panat
33
0
0
07 Apr 2025
Is Your Video Language Model a Reliable Judge?
M. Liu
Wensheng Zhang
56
1
0
07 Mar 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
SeongYeub Chu
JongWoo Kim
MunYong Yi
53
1
0
21 Feb 2025
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Zhaopeng Feng
Jiayuan Su
Jiamei Zheng
Jiahan Ren
Yan Zhang
Jian Wu
Hongwei Wang
Zuozhu Liu
ELM
198
0
0
21 Feb 2025
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
Xu Huang
Wenhao Zhu
Hanxu Hu
Conghui He
Lei Li
Shujian Huang
Fei Yuan
ELM
49
3
0
11 Feb 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
28
1
0
28 Jan 2025
Speech Translation Refinement using Large Language Models
Huaixia Dou
Xinyu Tian
Xinglin Lyu
Jie Zhu
Junhui Li
Lifan Guo
54
0
0
28 Jan 2025
Personalizing Education through an Adaptive LMS with Integrated LLMs
Kyle Spriggs
Meng Cheng Lau
Kalpdrum Passi
AI4Ed
48
0
0
24 Jan 2025
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators
Yinhong Liu
Han Zhou
Zhijiang Guo
Ehsan Shareghi
Ivan Vulić
Anna Korhonen
Nigel Collier
ALM
128
64
0
20 Jan 2025
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
108
61
0
25 Nov 2024
UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts
Bo Yang
Qingping Yang
Runtao Liu
LRM
ReLM
ELM
AIMat
62
1
0
11 Nov 2024
HEALTH-PARIKSHA: Assessing RAG Models for Health Chatbots in Real-World Multilingual Settings
Varun Gumma
Anandhita Raghunath
Mohit Jain
Sunayana Sitaram
LM&MA
32
1
0
17 Oct 2024
MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback
Zonghai Yao
Aditya Parashar
Huixue Zhou
Won Seok Jang
Feiyun Ouyang
Zhichao Yang
Hong-ye Yu
ELM
37
2
0
17 Oct 2024
Data Processing for the OpenGPT-X Model Family
Nicolò Brandizzi
Hammam Abdelwahab
Anirban Bhowmick
Lennard Helmer
Benny Jörg Stein
...
Georg Rehm
Dennis Wegener
Nicolas Flores-Herr
Joachim Kohler
Johannes Leveling
VLM
79
2
0
11 Oct 2024
What do Large Language Models Need for Machine Translation Evaluation?
Shenbin Qian
Archchana Sindhujan
Minnie Kabra
Diptesh Kanojia
Constantin Orasan
Tharindu Ranasinghe
Frédéric Blain
ELM
LRM
ALM
LM&MA
18
0
0
04 Oct 2024
A Survey on Failure Analysis and Fault Injection in AI Systems
Guangba Yu
Gou Tan
Haojia Huang
Zhenyu Zhang
Pengfei Chen
Roberto Natella
Zibin Zheng
29
3
0
28 Jun 2024
Large Language Models as Evaluators for Recommendation Explanations
Xiaoyu Zhang
Yishan Li
Jiayin Wang
Bowen Sun
Weizhi Ma
Peijie Sun
Min Zhang
LRM
ELM
29
12
0
05 Jun 2024
SLIDE: A Framework Integrating Small and Large Language Models for Open-Domain Dialogues Evaluation
Kun Zhao
Bohao Yang
Chen Tang
Chenghua Lin
Liang Zhan
28
5
0
24 May 2024
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models
Zhangyue Yin
Qiushi Sun
Qipeng Guo
Zhiyuan Zeng
Xiaonan Li
...
Qinyuan Cheng
Ding Wang
Xiaofeng Mou
Xipeng Qiu
XuanJing Huang
LRM
41
3
0
21 May 2024
What Have We Achieved on Non-autoregressive Translation?
Yafu Li
Huajian Zhang
Jianhao Yan
Yongjing Yin
Yue Zhang
23
0
0
21 May 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
54
17
0
28 Feb 2024
The Lay Person's Guide to Biomedicine: Orchestrating Large Language Models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
33
0
0
21 Feb 2024
Gradable ChatGPT Translation Evaluation
Hui Jiao
Bei Peng
Lu Zong
Xiaojun Zhang
Xinwei Li
28
1
0
18 Jan 2024
Convergences and Divergences between Automatic Assessment and Human Evaluation: Insights from Comparing ChatGPT-Generated Translation and Neural Machine Translation
Zhaokun Jiang
Ziyin Zhang
EGVM
19
3
0
10 Jan 2024
Distinguishing Translations by Human, NMT, and ChatGPT: A Linguistic and Statistical Approach
Zhaokun Jiang
Qianxi Lv
Ziyin Zhang
8
1
0
17 Dec 2023
Word Definitions from Large Language Models
Yunting Yin
Steven Skiena
Samuel Kim
AILaw
30
0
0
10 Nov 2023
Frustrated with Code Quality Issues? LLMs can Help!
Nalin Wadhwa
Jui Pradhan
Atharv Sonwane
Surya Prakash Sahu
Nagarajan Natarajan
Aditya Kanade
Suresh Parthasarathy
S. Rajamani
25
2
0
22 Sep 2023
Automatic Answerability Evaluation for Question Generation
Zifan Wang
Kotaro Funakoshi
Manabu Okumura
11
2
0
22 Sep 2023
Towards Effective Disambiguation for Machine Translation with Large Language Models
Vivek Iyer
Pinzhen Chen
Alexandra Birch
9
10
0
20 Sep 2023
Three Ways of Using Large Language Models to Evaluate Chat
Ondřej Plátek
Vojtěch Hudeček
Patrícia Schmidtová
Mateusz Lango
Ondřej Dušek
ALM
13
5
0
12 Aug 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
34
3
0
08 Aug 2023
Large language models effectively leverage document-level context for literary translation, but critical errors persist
Marzena Karpinska
Mohit Iyyer
31
81
0
06 Apr 2023
Hallucinations in Large Multilingual Translation Models
Nuno M. Guerreiro
Duarte M. Alves
Jonas Waldendorf
Barry Haddow
Alexandra Birch
Pierre Colombo
André F.T. Martins
VLM
HILM
LRM
13
139
0
28 Mar 2023
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models
Qingyu Lu
Baopu Qiu
Liang Ding
Liping Xie
Tom Kocmi
Dacheng Tao
LRM
ALM
ELM
19
102
0
24 Mar 2023
Embarrassingly Easy Document-Level MT Metrics: How to Convert Any Pretrained Metric Into a Document-Level Metric
Giorgos Vernikos
Brian Thompson
Prashant Mathur
Marcello Federico
34
40
0
27 Sep 2022
Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond
Masato Mita
Keisuke Sakaguchi
Masato Hagiwara
Tomoya Mizumoto
Jun Suzuki
Kentaro Inui
39
13
0
23 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022