A Comprehensive Assessment of Dialog Evaluation Metrics

7 June 2021
Yi-Ting Yeh
M. Eskénazi
Shikib Mehri
arXiv: 2106.03706

Papers citing "A Comprehensive Assessment of Dialog Evaluation Metrics"

31 / 81 papers shown
Distribution Aware Metrics for Conditional Natural Language Generation
David M. Chan
Yiming Ni
David A. Ross
Sudheendra Vijayanarasimhan
Austin Myers
John F. Canny
35
4
0
15 Sep 2022
Open-Domain Dialog Evaluation using Follow-Ups Likelihood
Maxime De Bruyn
Ehsan Lotfi
Jeska Buhmann
Walter Daelemans
26
9
0
12 Sep 2022
SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation
Longxuan Ma
Ziyu Zhuang
Weinan Zhang
Mingda Li
Ting Liu
19
4
0
17 Aug 2022
Relevance in Dialogue: Is Less More? An Empirical Comparison of Existing Metrics, and a Novel Simple Metric
Ian Berlot-Attwell
Frank Rudzicz
15
1
0
03 Jun 2022
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Prakhar Gupta
Cathy Jiao
Yi-Ting Yeh
Shikib Mehri
M. Eskénazi
Jeffrey P. Bigham
ALM
36
47
0
25 May 2022
What should I Ask: A Knowledge-driven Approach for Follow-up Questions Generation in Conversational Surveys
Yubin Ge
Ziang Xiao
Jana Diesner
Heng Ji
Karrie Karahalios
Hari Sundaram
43
12
0
23 May 2022
CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models
Bishal Santra
Ravi Ghadia
Manish Gupta
Pawan Goyal
OffRL
15
0
0
21 May 2022
Empathetic Conversational Systems: A Review of Current Advances, Gaps, and Opportunities
Aravind Sesagiri Raamkumar
Yinping Yang
12
28
0
09 May 2022
Spurious Correlations in Reference-Free Evaluation of Text Generation
Esin Durmus
Faisal Ladhak
Tatsunori Hashimoto
14
30
0
21 Apr 2022
TRUE: Re-evaluating Factual Consistency Evaluation
Or Honovich
Roee Aharoni
Jonathan Herzig
Hagai Taitelbaum
Doron Kukliansy
Vered Cohen
Thomas Scialom
Idan Szpektor
Avinatan Hassidim
Yossi Matias
HILM
27
3
0
11 Apr 2022
Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice
Markus Borg
J. Bengtsson
Harald Österling
Alexander Hagelborn
Isabella Gagner
Piotr Tomaszewski
4
1
0
29 Mar 2022
What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation
Sarik Ghazarian
Behnam Hedayatnia
Alexandros Papangelis
Yang Liu
Dilek Z. Hakkani-Tür
22
19
0
25 Mar 2022
Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges
Shikib Mehri
Jinho Choi
L. F. D’Haro
Jan Deriu
M. Eskénazi
...
David Traum
Yi-Ting Yeh
Zhou Yu
Yizhe Zhang
Chen Zhang
28
21
0
18 Mar 2022
DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations
Sarik Ghazarian
Nuan Wen
Aram Galstyan
Nanyun Peng
11
34
0
18 Mar 2022
Probing the Robustness of Trained Metrics for Conversational Dialogue Systems
Jan Deriu
Don Tuggener
Pius von Daniken
Mark Cieliebak
AAML
8
9
0
28 Feb 2022
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows
Jianqiao Zhao
Yanyang Li
Wanyu Du
Yangfeng Ji
Dong Yu
M. Lyu
Liwei Wang
12
4
0
14 Feb 2022
Mental Health Assessment for the Chatbots
Yong Shan
Jinchao Zhang
Zekang Li
Yang Feng
Jie Zhou
AI4MH
11
3
0
14 Jan 2022
Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents
Eric Michael Smith
Orion Hsu
Rebecca Qian
Stephen Roller
Y-Lan Boureau
Jason Weston
19
66
0
12 Jan 2022
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation
Chen Zhang
L. F. D’Haro
Thomas Friedrichs
Haizhou Li
ELM
9
18
0
14 Dec 2021
User Response and Sentiment Prediction for Automatic Dialogue Evaluation
Sarik Ghazarian
Behnam Hedayatnia
Alexandros Papangelis
Yang Liu
Dilek Z. Hakkani-Tür
9
3
0
16 Nov 2021
Automatic Evaluation and Moderation of Open-domain Dialogue Systems
Chen Zhang
João Sedoc
L. F. D’Haro
Rafael E. Banchs
Alexander I. Rudnicky
20
36
0
03 Nov 2021
Modeling Performance in Open-Domain Dialogue with PARADISE
M. Walker
Colin Harmon
James Graupera
Davan Harrison
S. Whittaker
12
7
0
21 Oct 2021
Think Before You Speak: Explicitly Generating Implicit Commonsense Knowledge for Response Generation
Pei Zhou
Karthik Gopalakrishnan
Behnam Hedayatnia
Seokhwan Kim
Jay Pujara
Xiang Ren
Yang Liu
Dilek Z. Hakkani-Tür
34
40
0
16 Oct 2021
Jurassic is (almost) All You Need: Few-Shot Meaning-to-Text Generation for Open-Domain Dialogue
Lena Reed
Cecilia Li
Angela Ramirez
Liren Wu
M. Walker
21
7
0
15 Oct 2021
Exploring Dense Retrieval for Dialogue Response Selection
Tian Lan
Deng Cai
Yan Wang
Yixuan Su
Heyan Huang
Xian-Ling Mao
109
16
0
13 Oct 2021
Investigating the Impact of Pre-trained Language Models on Dialog Evaluation
Chen Zhang
L. F. D’Haro
Yiming Chen
Thomas Friedrichs
Haizhou Li
13
5
0
05 Oct 2021
Improving Stack Overflow question title generation with copying enhanced CodeBERT model and bi-modal information
Fengji Zhang
Xiao Yu
J. Keung
Fuyang Li
Zhiwen Xie
Zhen Yang
Caoyuan Ma
Zhimin Zhang
35
25
0
27 Sep 2021
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark
Nouha Dziri
Hannah Rashkin
Tal Linzen
David Reitter
ALM
185
79
0
30 Apr 2021
Dialogue-adaptive Language Model Pre-training From Quality Estimation
Junlong Li
Zhuosheng Zhang
Hai Zhao
OffRL
19
12
0
10 Sep 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,244
0
16 Jan 2013