arXiv:2412.20127
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
21 February 2025
Zhaopeng Feng
Jiayuan Su
Jiamei Zheng
Jiahan Ren
Yan Zhang
Jian Wu
Hongwei Wang
Zuozhu Liu
ELM
Papers citing
"M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation"
25 / 25 papers shown
MAD-Fact: A Multi-Agent Debate Framework for Long-Form Factuality Evaluation in LLMs
Yucheng Ning
Xixun Lin
Fang Fang
Yanan Cao
HILM
269
0
0
27 Oct 2025
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation
Baban Gain
Dibyanayan Bandyopadhyay
Asif Ekbal
LM&MA
345
6
0
02 Apr 2025
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
1.0K
247
0
25 Nov 2024
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu
Weizhe Yuan
O. Yu. Golovneva
Jing Xu
Yuandong Tian
Jiantao Jiao
Jason Weston
Sainbayar Sukhbaatar
ALM
KELM
LRM
294
148
0
28 Jul 2024
PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation
Christoph Leiter
Steffen Eger
201
16
0
26 Jun 2024
Can Automatic Metrics Assess High-Quality Translations?
Sweta Agrawal
António Farinhas
Ricardo Rei
André F. T. Martins
151
14
0
28 May 2024
The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation
Maja Pavlovic
Massimo Poesio
334
30
0
02 May 2024
From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation
Haofei Zhao
Yilun Liu
Shimin Tao
Weibin Meng
Yimeng Chen
Xiang Geng
Yan Yu
Min Zhang
Hao Yang
110
15
0
21 Mar 2024
LLMaAA: Making Large Language Models as Active Annotators
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ruoyu Zhang
Yanzeng Li
Yongliang Ma
Ming Zhou
Lei Zou
309
106
0
30 Oct 2023
GEMBA-MQM: Detecting Translation Quality Error Spans with GPT-4
Conference on Machine Translation (WMT), 2023
Tom Kocmi
C. Federmann
275
116
0
21 Oct 2023
xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection
Transactions of the Association for Computational Linguistics (TACL), 2023
Nuno M. Guerreiro
Ricardo Rei
Daan van Stigt
Luísa Coheur
Pierre Colombo
André F.T. Martins
357
228
0
16 Oct 2023
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
Conference on Machine Translation (WMT), 2023
Patrick Fernandes
Daniel Deutsch
M. Finkelstein
Parker Riley
André F. T. Martins
Graham Neubig
Ankush Garg
J. Clark
Markus Freitag
Orhan Firat
LRM
206
90
0
14 Aug 2023
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Chi-Min Chan
Weize Chen
Yusheng Su
Jianxuan Yu
Wei Xue
Shan Zhang
Jie Fu
Zhiyuan Liu
ELM
LLMAG
ALM
245
702
0
14 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
5.5K
14,960
0
18 Jul 2023
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tian Liang
Zhiwei He
Wenxiang Jiao
Xing Wang
Rui Wang
Yujiu Yang
Zhaopeng Tu
Shuming Shi
LLMAG
LRM
332
763
0
30 May 2023
Improving Factuality and Reasoning in Language Models through Multiagent Debate
International Conference on Machine Learning (ICML), 2023
Yilun Du
Shuang Li
Antonio Torralba
J. Tenenbaum
Igor Mordatch
LLMAG
LRM
318
1,129
0
23 May 2023
Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Daniel Deutsch
George F. Foster
Markus Freitag
258
66
0
23 May 2023
Can Large Language Models Be an Alternative to Human Evaluations?
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Cheng-Han Chiang
Hung-yi Lee
ALM
LM&MA
537
825
0
03 May 2023
Large Language Models Are State-of-the-Art Evaluators of Translation Quality
European Association for Machine Translation Conferences/Workshops (EAMT), 2023
Tom Kocmi
C. Federmann
ELM
263
444
0
28 Feb 2023
CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task
Conference on Machine Translation (WMT), 2022
Ricardo Rei
Marcos Vinícius Treviso
Nuno M. Guerreiro
Chrysoula Zerva
Ana C. Farinha
...
T. Glushkova
Duarte M. Alves
A. Lavie
Luísa Coheur
Marcely Zanon Boito
746
205
0
13 Sep 2022
The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation
Marzena Karpinska
Nader Akoury
Mohit Iyyer
606
120
0
14 Sep 2021
Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation
Transactions of the Association for Computational Linguistics (TACL), 2021
Markus Freitag
George F. Foster
David Grangier
Viresh Ratnakar
Qijun Tan
Wolfgang Macherey
295
459
0
29 Apr 2021
COMET: A Neural Framework for MT Evaluation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Ricardo Rei
Craig Alan Stewart
Ana C. Farinha
A. Lavie
420
1,331
0
18 Sep 2020
BLEURT: Learning Robust Metrics for Text Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
560
1,715
0
09 Apr 2020
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
1.5K
7,241
0
21 Apr 2019