M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
21 February 2025
Zhaopeng Feng, Jiayuan Su, Jiamei Zheng, Jiahan Ren, Yan Zhang, Jian Wu, Hongwei Wang, Zuozhu Liu
ELM
arXiv: 2412.20127

Papers citing "M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation"

25 / 25 papers shown

MAD-Fact: A Multi-Agent Debate Framework for Long-Form Factuality Evaluation in LLMs
Yucheng Ning, Xixun Lin, Fang Fang, Yanan Cao
HILM · 0 citations · 27 Oct 2025

Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation
Baban Gain, Dibyanayan Bandyopadhyay, Asif Ekbal
LM&MA · 6 citations · 02 Apr 2025

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li, Bohan Jiang, Liangjie Huang, Alimohammad Beigi, Chengshuai Zhao, ..., Canyu Chen, Tianhao Wu, Kai Shu, Lu Cheng, Huan Liu
ELM, AILaw · 247 citations · 25 Nov 2024

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu, Weizhe Yuan, O. Yu. Golovneva, Jing Xu, Yuandong Tian, Jiantao Jiao, Jason Weston, Sainbayar Sukhbaatar
ALM, KELM, LRM · 148 citations · 28 Jul 2024

PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation
Christoph Leiter, Steffen Eger
16 citations · 26 Jun 2024

Can Automatic Metrics Assess High-Quality Translations?
Sweta Agrawal, António Farinhas, Ricardo Rei, André F. T. Martins
14 citations · 28 May 2024

The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation
Maja Pavlovic, Massimo Poesio
30 citations · 02 May 2024

From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation
Haofei Zhao, Yilun Liu, Shimin Tao, Weibin Meng, Yimeng Chen, Xiang Geng, Yan Yu, Min Zhang, Hao Yang
15 citations · 21 Mar 2024

LLMaAA: Making Large Language Models as Active Annotators
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ruoyu Zhang, Yanzeng Li, Yongliang Ma, Ming Zhou, Lei Zou
106 citations · 30 Oct 2023

GEMBA-MQM: Detecting Translation Quality Error Spans with GPT-4
Conference on Machine Translation (WMT), 2023
Tom Kocmi, C. Federmann
116 citations · 21 Oct 2023

xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection
Transactions of the Association for Computational Linguistics (TACL), 2024
Nuno M. Guerreiro, Ricardo Rei, Daan van Stigt, Luísa Coheur, Pierre Colombo, André F. T. Martins
228 citations · 16 Oct 2023

The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
Conference on Machine Translation (WMT), 2023
Patrick Fernandes, Daniel Deutsch, M. Finkelstein, Parker Riley, André F. T. Martins, Graham Neubig, Ankush Garg, J. Clark, Markus Freitag, Orhan Firat
LRM · 90 citations · 14 Aug 2023

ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu
ELM, LLMAG, ALM · 702 citations · 14 Aug 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
AI4MH, ALM · 14,960 citations · 18 Jul 2023

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi
LLMAG, LRM · 763 citations · 30 May 2023

Improving Factuality and Reasoning in Language Models through Multiagent Debate
International Conference on Machine Learning (ICML), 2024
Yilun Du, Shuang Li, Antonio Torralba, J. Tenenbaum, Igor Mordatch
LLMAG, LRM · 1,129 citations · 23 May 2023

Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Daniel Deutsch, George F. Foster, Markus Freitag
66 citations · 23 May 2023

Can Large Language Models Be an Alternative to Human Evaluations?
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Cheng-Han Chiang, Hung-yi Lee
ALM, LM&MA · 825 citations · 03 May 2023

Large Language Models Are State-of-the-Art Evaluators of Translation Quality
Conference of the European Association for Machine Translation (EAMT), 2023
Tom Kocmi, C. Federmann
ELM · 444 citations · 28 Feb 2023

CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task
Conference on Machine Translation (WMT), 2022
Ricardo Rei, Marcos Vinícius Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana C. Farinha, ..., T. Glushkova, Duarte M. Alves, A. Lavie, Luísa Coheur, Marcely Zanon Boito
205 citations · 13 Sep 2022

The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation
Marzena Karpinska, Nader Akoury, Mohit Iyyer
120 citations · 14 Sep 2021

Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation
Transactions of the Association for Computational Linguistics (TACL), 2021
Markus Freitag, George F. Foster, David Grangier, Viresh Ratnakar, Qijun Tan, Wolfgang Macherey
459 citations · 29 Apr 2021

COMET: A Neural Framework for MT Evaluation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Ricardo Rei, Craig Alan Stewart, Ana C. Farinha, A. Lavie
1,331 citations · 18 Sep 2020

BLEURT: Learning Robust Metrics for Text Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Thibault Sellam, Dipanjan Das, Ankur P. Parikh
1,715 citations · 09 Apr 2020

BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
7,241 citations · 21 Apr 2019