DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations

18 March 2022

Papers citing "DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations"


Clinical Insights: A Comprehensive Review of Language Models in Medicine | Nikita Neveditsin, Pawan Lingras, V. Mago | LM&MA | 52 3 0 | 08 Jan 2025
ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues | John Mendonça, Isabel Trancoso, A. Lavie | - | 34 3 0 | 16 Jul 2024
Interaction Matters: An Evaluation Framework for Interactive Dialogue Assessment on English Second Language Conversations | Rena Gao, Carsten Roever, Jey Han Lau | - | 20 4 0 | 09 Jul 2024
CausalScore: An Automatic Reference-Free Metric for Assessing Response Relevance in Open-Domain Dialogue Systems | Tao Feng, Lizhen Qu, Xiaoxi Kang, Gholamreza Haffari | - | 21 1 0 | 25 Jun 2024
Confabulation: The Surprising Value of Large Language Model Hallucinations | Peiqi Sui, Eamon Duede, Sophie Wu, Richard Jean So | HILM LLMAG | 27 18 0 | 06 Jun 2024
Recent Trends in Personalized Dialogue Generation: A Review of Datasets, Methodologies, and Evaluations | Yi-Pei Chen, Noriki Nishida, Hideki Nakayama, Yuji Matsumoto | LLMAG | 41 10 0 | 28 May 2024
Evaluating Very Long-Term Conversational Memory of LLM Agents | A. Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang | LLMAG | 22 66 0 | 27 Feb 2024
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators | Chen Zhang, L. F. D’Haro, Yiming Chen, Malu Zhang, Haizhou Li | ELM | 19 29 0 | 24 Dec 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation | Haoyi Qiu, Kung-Hsiang Huang, Jingnong Qu, Nanyun Peng | HILM | 26 6 0 | 16 Nov 2023
Post Turing: Mapping the landscape of LLM Evaluation | Alexey Tikhonov, Ivan P. Yamshchikov | ELM | 46 4 0 | 03 Nov 2023
DiQAD: A Benchmark Dataset for End-to-End Open-domain Dialogue Assessment | Yukun Zhao, Lingyong Yan, Weiwei Sun, Chong Meng, Shuaiqiang Wang, Zhicong Cheng, Zhaochun Ren, Dawei Yin | ELM | 14 0 0 | 25 Oct 2023
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark | Chen Zhang, L. F. D’Haro, Chengguang Tang, Ke Shi, Guohua Tang, Haizhou Li | ELM | 36 9 0 | 13 Oct 2023
Leveraging Large Language Models for Automated Dialogue Analysis | Sarah E. Finch, Ellie S. Paek, Jinho D. Choi | LLMAG | 29 20 0 | 12 Sep 2023
Open-Domain Text Evaluation via Contrastive Distribution Methods | Sidi Lu, Hongyi Liu, Asli Celikyilmaz, Tianlu Wang, Nanyun Peng | - | 23 0 0 | 20 Jun 2023
Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs | A. Komma, Nagesh Panyam Chandrasekarasastry, Timothy Leffel, Anuj Kumar Goyal, A. Metallinou, Spyros Matsoukas, Aram Galstyan | - | 25 3 0 | 06 Jun 2023
SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation | Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng | - | 19 8 0 | 18 May 2023
ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems | Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, Nanyun Peng | - | 18 7 0 | 12 May 2023
Approximating Online Human Evaluation of Social Chatbots with Prompting | Ekaterina Svikhnushina, Pearl Pu | ELM | 10 13 0 | 11 Apr 2023
GPTScore: Evaluate as You Desire | Jinlan Fu, See-Kiong Ng, Zhengbao Jiang, Pengfei Liu | LM&MA ALM ELM | 15 264 0 | 08 Feb 2023
MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation | Jiazhan Feng, Qingfeng Sun, Can Xu, Pu Zhao, Yaming Yang, Chongyang Tao, Dongyan Zhao, Qingwei Lin | - | 24 52 0 | 10 Nov 2022
FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation | Chen Zhang, L. F. D’Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li | - | 21 15 0 | 25 Oct 2022
EnDex: Evaluation of Dialogue Engagingness at Scale | Guangxuan Xu, Ruibo Liu, Fabrice Harel-Canada, Nischal Reddy Chandra, Nanyun Peng | - | 13 5 0 | 22 Oct 2022
Open-Domain Dialog Evaluation using Follow-Ups Likelihood | Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans | - | 29 9 0 | 12 Sep 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text | Sebastian Gehrmann, Elizabeth Clark, Thibault Sellam | ELM AI4CE | 58 183 0 | 14 Feb 2022