1603.08023
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
25 March 2016
Chia-Wei Liu
Ryan J. Lowe
Iulian Serban
Michael Noseworthy
Laurent Charlin
Joelle Pineau
Papers citing
"How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation"
50 / 220 papers shown
Title
Self-Supervised Contrastive Learning for Efficient User Satisfaction Prediction in Conversational Agents
Mohammad Kachuee
Hao Yuan
Young-Bum Kim
Sungjin Lee
19
25
0
21 Oct 2020
PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation
Clément Rebuffel
Laure Soulier
Geoffrey Scoutheeten
Patrick Gallinari
8
9
0
21 Oct 2020
Local Knowledge Powered Conversational Agents
Sashank Santhanam
Ming-Yu Liu
Raul Puri
M. Shoeybi
M. Patwary
Bryan Catanzaro
21
4
0
20 Oct 2020
Cue Me In: Content-Inducing Approaches to Interactive Story Generation
Faeze Brahman
Alexandru Petrusca
Snigdha Chaturvedi
LRM
16
20
0
20 Oct 2020
Reformulating Unsupervised Style Transfer as Paraphrase Generation
Kalpesh Krishna
John Wieting
Mohit Iyyer
19
237
0
12 Oct 2020
Plan ahead: Self-Supervised Text Planning for Paragraph Completion Task
Dongyeop Kang
Eduard H. Hovy
LRM
40
24
0
11 Oct 2020
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions
Bodhisattwa Prasad Majumder
Harsh Jhamtani
Taylor Berg-Kirkpatrick
Julian McAuley
22
85
0
07 Oct 2020
Regularizing Dialogue Generation by Imitating Implicit Scenarios
Shaoxiong Feng
Xuancheng Ren
Hongshen Chen
Bin Sun
Kan Li
Xu Sun
18
20
0
05 Oct 2020
MIME: MIMicking Emotions for Empathetic Response Generation
Navonil Majumder
Pengfei Hong
Shanshan Peng
Jiankun Lu
Deepanway Ghosal
Alexander Gelbukh
Rada Mihalcea
Soujanya Poria
23
200
0
04 Oct 2020
Predicting User Engagement Status for Online Evaluation of Intelligent Assistants
Rui Meng
Zhen Yue
A. Glass
13
2
0
01 Oct 2020
Pchatbot: A Large-Scale Dataset for Personalized Chatbot
Hongjin Qian
Xiaohe Li
Hanxun Zhong
Yu Guo
Yueyuan Ma
Yutao Zhu
Zhanliang Liu
Ji-Rong Wen
38
43
0
28 Sep 2020
Enhancing Dialogue Generation via Multi-Level Contrastive Learning
Xin Li
Piji Li
Yan Wang
Xiaojiang Liu
Wai Lam
26
5
0
19 Sep 2020
GLUCOSE: GeneraLized and COntextualized Story Explanations
N. Mostafazadeh
Aditya Kalyanpur
Lori Moon
David W. Buchanan
Lauren Berkowitz
Or Biran
Jennifer Chu-Carroll
19
121
0
16 Sep 2020
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
Jian-Yu Guan
Minlie Huang
21
69
0
16 Sep 2020
A Survey of Evaluation Metrics Used for NLG Systems
Ananya B. Sai
Akash Kumar Mohankumar
Mitesh M. Khapra
ELM
30
228
0
27 Aug 2020
Opinion-aware Answer Generation for Review-driven Question Answering in E-Commerce
Yang Deng
Wenxuan Zhang
Wai Lam
16
31
0
27 Aug 2020
CoreGen: Contextualized Code Representation Learning for Commit Message Generation
L. Nie
Cuiyun Gao
Zhicong Zhong
Wai Lam
Yang Liu
Zenglin Xu
21
46
0
14 Jul 2020
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
19
376
0
26 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Stephen Roller
Y-Lan Boureau
Jason Weston
Antoine Bordes
Emily Dinan
...
Kurt Shuster
Eric Michael Smith
Arthur Szlam
Jack Urbanek
Mary Williamson
LLMAG
AI4CE
22
51
0
22 Jun 2020
Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols
Sarah E. Finch
Jinho D. Choi
ELM
23
67
0
10 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
36
9
0
10 Jun 2020
Probing Neural Dialog Models for Conversational Understanding
Abdelrhman Saleh
Tovly Deutsch
Stephen Casper
Yonatan Belinkov
Stuart M. Shieber
21
13
0
07 Jun 2020
Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation
Weixin Liang
James Zou
Zhou Yu
ELM
34
33
0
21 May 2020
SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling
F. S. Bao
Hebi Li
Ge Luo
Minghui Qiu
Yinfei Yang
Youbiao He
Cen Chen
16
4
0
13 May 2020
Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation
Zhiliang Tian
Wei Bi
Dongkyu Lee
Lanqing Xue
Yiping Song
Xiaojiang Liu
N. Zhang
27
25
0
13 May 2020
History for Visual Dialog: Do we really need it?
Shubham Agarwal
Trung Bui
Joon-Young Lee
Ioannis Konstas
Verena Rieser
VLM
13
69
0
08 May 2020
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
Esin Durmus
He He
Mona T. Diab
HILM
6
384
0
07 May 2020
Learning an Unreferenced Metric for Online Dialogue Evaluation
Koustuv Sinha
Prasanna Parthasarathi
Jasmine Wang
Ryan J. Lowe
William L. Hamilton
Joelle Pineau
OffRL
21
84
0
01 May 2020
KPQA: A Metric for Generative Question Answering Using Keyphrase Weights
Hwanhee Lee
Seunghyun Yoon
Franck Dernoncourt
Doo Soon Kim
Trung Bui
Joongbo Shin
Kyomin Jung
16
0
0
01 May 2020
Question Rewriting for Conversational Question Answering
Svitlana Vakulenko
Shayne Longpre
Zhucheng Tu
R. Anantha
20
172
0
30 Apr 2020
Learning to Update Natural Language Comments Based on Code Changes
Sheena Panthaplackel
Pengyu Nie
Miloš Gligorić
Junyi Jessy Li
Raymond J. Mooney
27
63
0
25 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
19
350
0
21 Apr 2020
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
46
1,442
0
09 Apr 2020
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
Alex Wang
Kyunghyun Cho
M. Lewis
HILM
10
470
0
08 Apr 2020
A Survey on Conversational Recommender Systems
Dietmar Jannach
A. Manzoor
Wanling Cai
Li Chen
13
403
0
01 Apr 2020
XPersona: Evaluating Multilingual Personalized Chatbot
Zhaojiang Lin
Zihan Liu
Genta Indra Winata
Samuel Cahyawijaya
Andrea Madotto
Yejin Bang
Etsuko Ishii
Pascale Fung
45
57
0
17 Mar 2020
Posterior-GAN: Towards Informative and Coherent Response Generation with Posterior Generative Adversarial Network
Shaoxiong Feng
Hongshen Chen
Kan Li
Dawei Yin
GAN
49
25
0
04 Mar 2020
A Neural Topical Expansion Framework for Unstructured Persona-oriented Dialogue Generation
Minghong Xu
Piji Li
Haoran Yang
Pengjie Ren
Z. Ren
Zhumin Chen
Jun Ma
18
31
0
06 Feb 2020
Towards a Human-like Open-Domain Chatbot
Daniel De Freitas
Minh-Thang Luong
David R. So
Jamie Hall
Noah Fiedel
...
Zi Yang
Apoorv Kulshreshtha
Gaurav Nemade
Yifeng Lu
Quoc V. Le
30
923
0
27 Jan 2020
Paraphrase Generation with Latent Bag of Words
Yao Fu
Yansong Feng
John P. Cunningham
BDL
25
91
0
07 Jan 2020
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
19
9
0
19 Dec 2019
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
Sumanth Dathathri
Andrea Madotto
Janice Lan
Jane Hung
Eric Frank
Piero Molino
J. Yosinski
Rosanne Liu
KELM
26
937
0
04 Dec 2019
Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context
Yichi Zhang
Zhijian Ou
Zhou Yu
19
182
0
24 Nov 2019
Social Bias Frames: Reasoning about Social and Power Implications of Language
Maarten Sap
Saadia Gabriel
Lianhui Qin
Dan Jurafsky
Noah A. Smith
Yejin Choi
28
483
0
10 Nov 2019
Automatic Reminiscence Therapy for Dementia
Mariona Carós
M. Garolera
P. Radeva
Xavier Giró-i-Nieto
21
40
0
25 Oct 2019
Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models
Tianxing He
Jun Liu
Kyunghyun Cho
Myle Ott
Bing-Quan Liu
James R. Glass
Fuchun Peng
CLL
29
9
0
16 Oct 2019
Learning from Fact-checkers: Analysis and Generation of Fact-checking Language
Nguyen Vo
Kyumin Lee
9
68
0
05 Oct 2019
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs
Yi-Lin Tuan
Yun-Nung (Vivian) Chen
Hung-yi Lee
18
71
0
01 Oct 2019
Do Massively Pretrained Language Models Make Better Storytellers?
A. See
Aneesh S. Pappu
Rohun Saxena
Akhila Yerukola
Christopher D. Manning
37
166
0
24 Sep 2019
Counterfactual Story Reasoning and Generation
Lianhui Qin
Antoine Bosselut
Ari Holtzman
Chandra Bhagavatula
Elizabeth Clark
Yejin Choi
LRM
11
140
0
09 Sep 2019