2203.09711
DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations
18 March 2022
Sarik Ghazarian
Nuan Wen
Aram Galstyan
Nanyun Peng
Papers citing "DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations"
24 / 24 papers shown
Clinical Insights: A Comprehensive Review of Language Models in Medicine
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
52
3
0
08 Jan 2025
ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues
John Mendonça
Isabel Trancoso
A. Lavie
34
3
0
16 Jul 2024
Interaction Matters: An Evaluation Framework for Interactive Dialogue Assessment on English Second Language Conversations
Rena Gao
Carsten Roever
Jey Han Lau
20
4
0
09 Jul 2024
CausalScore: An Automatic Reference-Free Metric for Assessing Response Relevance in Open-Domain Dialogue Systems
Tao Feng
Lizhen Qu
Xiaoxi Kang
Gholamreza Haffari
21
1
0
25 Jun 2024
Confabulation: The Surprising Value of Large Language Model Hallucinations
Peiqi Sui
Eamon Duede
Sophie Wu
Richard Jean So
HILM
LLMAG
27
18
0
06 Jun 2024
Recent Trends in Personalized Dialogue Generation: A Review of Datasets, Methodologies, and Evaluations
Yi-Pei Chen
Noriki Nishida
Hideki Nakayama
Yuji Matsumoto
LLMAG
41
10
0
28 May 2024
Evaluating Very Long-Term Conversational Memory of LLM Agents
A. Maharana
Dong-Ho Lee
Sergey Tulyakov
Mohit Bansal
Francesco Barbieri
Yuwei Fang
LLMAG
22
66
0
27 Feb 2024
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators
Chen Zhang
L. F. D’Haro
Yiming Chen
Malu Zhang
Haizhou Li
ELM
19
29
0
24 Dec 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
26
6
0
16 Nov 2023
Post Turing: Mapping the landscape of LLM Evaluation
Alexey Tikhonov
Ivan P. Yamshchikov
ELM
46
4
0
03 Nov 2023
DiQAD: A Benchmark Dataset for End-to-End Open-domain Dialogue Assessment
Yukun Zhao
Lingyong Yan
Weiwei Sun
Chong Meng
Shuaiqiang Wang
Zhicong Cheng
Zhaochun Ren
Dawei Yin
ELM
14
0
0
25 Oct 2023
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark
Chen Zhang
L. F. D’Haro
Chengguang Tang
Ke Shi
Guohua Tang
Haizhou Li
ELM
36
9
0
13 Oct 2023
Leveraging Large Language Models for Automated Dialogue Analysis
Sarah E. Finch
Ellie S. Paek
Jinho D. Choi
LLMAG
29
20
0
12 Sep 2023
Open-Domain Text Evaluation via Contrastive Distribution Methods
Sidi Lu
Hongyi Liu
Asli Celikyilmaz
Tianlu Wang
Nanyun Peng
23
0
0
20 Jun 2023
Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs
A. Komma
Nagesh Panyam Chandrasekarasastry
Timothy Leffel
Anuj Kumar Goyal
A. Metallinou
Spyros Matsoukas
Aram Galstyan
25
3
0
06 Jun 2023
SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation
Junkai Zhou
Liang Pang
Huawei Shen
Xueqi Cheng
19
8
0
18 May 2023
ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems
Sarik Ghazarian
Yijia Shao
Rujun Han
Aram Galstyan
Nanyun Peng
18
7
0
12 May 2023
Approximating Online Human Evaluation of Social Chatbots with Prompting
Ekaterina Svikhnushina
Pearl Pu
ELM
10
13
0
11 Apr 2023
GPTScore: Evaluate as You Desire
Jinlan Fu
See-Kiong Ng
Zhengbao Jiang
Pengfei Liu
LM&MA
ALM
ELM
15
264
0
08 Feb 2023
MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation
Jiazhan Feng
Qingfeng Sun
Can Xu
Pu Zhao
Yaming Yang
Chongyang Tao
Dongyan Zhao
Qingwei Lin
24
52
0
10 Nov 2022
FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li
21
15
0
25 Oct 2022
EnDex: Evaluation of Dialogue Engagingness at Scale
Guangxuan Xu
Ruibo Liu
Fabrice Harel-Canada
Nischal Reddy Chandra
Nanyun Peng
13
5
0
22 Oct 2022
Open-Domain Dialog Evaluation using Follow-Ups Likelihood
Maxime De Bruyn
Ehsan Lotfi
Jeska Buhmann
Walter Daelemans
29
9
0
12 Sep 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text
Sebastian Gehrmann
Elizabeth Clark
Thibault Sellam
ELM
AI4CE
58
183
0
14 Feb 2022