Automatic Evaluation and Moderation of Open-domain Dialogue Systems (arXiv:2111.02110)

3 November 2021
Chen Zhang, João Sedoc, L. F. D’Haro, Rafael E. Banchs, Alexander I. Rudnicky

Papers citing "Automatic Evaluation and Moderation of Open-domain Dialogue Systems"

26 papers shown
Toxicity in Online Platforms and AI Systems: A Survey of Needs, Challenges, Mitigations, and Future Directions
Expert Systems with Applications (ESWA), 2025
Smita Khapre, Melkamu Mersha, Hassan Shakil, Jonali Baruah, Jugal Kalita
29 Sep 2025
Overview of Dialog System Evaluation Track: Dimensionality, Language, Culture and Safety at DSTC 12
John Mendonça, Lining Zhang, Rahul Mallidi, A. Lavie, Isabel Trancoso, L. F. D’Haro, João Sedoc
16 Sep 2025
MEDAL: A Framework for Benchmarking LLMs as Multilingual Open-Domain Dialogue Evaluators
John Mendonça, A. Lavie, Isabel Trancoso
28 May 2025
Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
John Mendonça, Isabel Trancoso, A. Lavie
20 Aug 2024
On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation
John Mendonça, A. Lavie, Isabel Trancoso
04 Jul 2024
Themis: Towards Flexible and Interpretable NLG Evaluation
Xinyu Hu, Li Lin, Mingqi Gao, Xunjian Yin, Xiaojun Wan
26 Jun 2024
An Analysis of User Behaviors for Objectively Evaluating Spoken Dialogue Systems
K. Inoue, Divesh Lala, Keiko Ochi, Tatsuya Kawahara, Gabriel Skantze
10 Jan 2024
The DSA Transparency Database: Auditing Self-reported Moderation Actions by Social Media
Amaury Trujillo, T. Fagni, S. Cresci
16 Dec 2023
Dialogue Quality and Emotion Annotations for Customer Support Conversations
IEEE Games Entertainment Media Conference (IEEE GEM), 2023
John Mendonça, Patrícia Pereira, Miguel Menezes, Vera Cabarrão, Ana C. Farinha, Helena Moniz, João Paulo Carvalho, A. Lavie, Isabel Trancoso
23 Nov 2023
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Chen Zhang, L. F. D’Haro, Chengguang Tang, Ke Shi, Guohua Tang, Haizhou Li
13 Oct 2023
Towards Multilingual Automatic Dialogue Evaluation
SIGDIAL, 2023
John Mendonça, A. Lavie, Isabel Trancoso
31 Aug 2023
Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors
K. Inoue, Divesh Lala, Keiko Ochi, Tatsuya Kawahara, Gabriel Skantze
21 Aug 2023
Overview of Robust and Multilingual Automatic Evaluation Metrics for Open-Domain Dialogue Systems at DSTC 11 Track 4
Mario Rodríguez-Cantelar, Chen Zhang, Chengguang Tang, Ke Shi, Sarik Ghazarian, João Sedoc, L. F. D’Haro, Alexander I. Rudnicky
22 Jun 2023
Psychological Metrics for Dialog System Evaluation
Salvatore Giorgi, Shreya Havaldar, Farhan S. Ahmed, Zuhaib Akhtar, Shalaka Vaidya, Gary Pan, Pallavi V. Kulkarni, H. Andrew Schwartz, João Sedoc
24 May 2023
Evaluate What You Can't Evaluate: Unassessable Quality for Generated Response
Yongkang Liu, Shi Feng, Daling Wang, Yifei Zhang, Hinrich Schütze
24 May 2023
How to Choose How to Choose Your Chatbot: A Massively Multi-System MultiReference Data Set for Dialog Metric Evaluation
Huda Khayrallah, Zuhaib Akhtar, Edward Cohen, João Sedoc
23 May 2023
LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models
Yen-Ting Lin, Yun-Nung Chen
23 May 2023
Complex QA and language models hybrid architectures, Survey
Xavier Daull, P. Bellot, Emmanuel Bruno, Vincent Martin, Elisabeth Murisasco
17 Feb 2023
Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation
Jessica Huynh, Cathy Jiao, Prakhar Gupta, Shikib Mehri, Payal Bajaj, Vishrav Chaudhary, M. Eskénazi
27 Jan 2023
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2022
Chen Zhang, L. F. D’Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li
18 Dec 2022
FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Chen Zhang, L. F. D’Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li
25 Oct 2022
MME-CRS: Multi-Metric Evaluation Based on Correlation Re-Scaling for Evaluating Open-Domain Dialogue
Pengfei Zhang, Xiao-fei Hu, Kaidong Yu, Jian Wang, Song-Bo Han, Cao Liu, C. Yuan
19 Jun 2022
Relevance in Dialogue: Is Less More? An Empirical Comparison of Existing Metrics, and a Novel Simple Metric
Ian Berlot-Attwell, Frank Rudzicz
03 Jun 2022
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Prakhar Gupta, Cathy Jiao, Yi-Ting Yeh, Shikib Mehri, M. Eskénazi, Jeffrey P. Bigham
25 May 2022
Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges
Shikib Mehri, Jinho Choi, L. F. D’Haro, Jan Deriu, M. Eskénazi, ..., David Traum, Yi-Ting Yeh, Zhou Yu, Yizhe Zhang, Chen Zhang
18 Mar 2022
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation
Chen Zhang, L. F. D’Haro, Thomas Friedrichs, Haizhou Li
14 Dec 2021