How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

25 March 2016

Papers citing "How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation"


Self-Supervised Contrastive Learning for Efficient User Satisfaction Prediction in Conversational Agents Mohammad Kachuee Hao Yuan Young-Bum Kim Sungjin Lee 19 25 0 21 Oct 2020
PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation Clément Rebuffel Laure Soulier Geoffrey Scoutheeten Patrick Gallinari 8 9 0 21 Oct 2020
Local Knowledge Powered Conversational Agents Sashank Santhanam Ming-Yu Liu Raul Puri M. Shoeybi M. Patwary Bryan Catanzaro 21 4 0 20 Oct 2020
Cue Me In: Content-Inducing Approaches to Interactive Story Generation Faeze Brahman Alexandru Petrusca Snigdha Chaturvedi LRM 16 20 0 20 Oct 2020
Reformulating Unsupervised Style Transfer as Paraphrase Generation Kalpesh Krishna John Wieting Mohit Iyyer 19 237 0 12 Oct 2020
Plan ahead: Self-Supervised Text Planning for Paragraph Completion Task Dongyeop Kang Eduard H. Hovy LRM 40 24 0 11 Oct 2020
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions Bodhisattwa Prasad Majumder Harsh Jhamtani Taylor Berg-Kirkpatrick Julian McAuley 22 85 0 07 Oct 2020
Regularizing Dialogue Generation by Imitating Implicit Scenarios Shaoxiong Feng Xuancheng Ren Hongshen Chen Bin Sun Kan Li Xu Sun 18 20 0 05 Oct 2020
MIME: MIMicking Emotions for Empathetic Response Generation Navonil Majumder Pengfei Hong Shanshan Peng Jiankun Lu Deepanway Ghosal Alexander Gelbukh Rada Mihalcea Soujanya Poria 23 200 0 04 Oct 2020
Predicting User Engagement Status for Online Evaluation of Intelligent Assistants Rui Meng Zhen Yue A. Glass 13 2 0 01 Oct 2020
Pchatbot: A Large-Scale Dataset for Personalized Chatbot Hongjin Qian Xiaohe Li Hanxun Zhong Yu Guo Yueyuan Ma Yutao Zhu Zhanliang Liu Ji-Rong Wen 38 43 0 28 Sep 2020
Enhancing Dialogue Generation via Multi-Level Contrastive Learning Xin Li Piji Li Yan Wang Xiaojiang Liu Wai Lam 26 5 0 19 Sep 2020
GLUCOSE: GeneraLized and COntextualized Story Explanations N. Mostafazadeh Aditya Kalyanpur Lori Moon David W. Buchanan Lauren Berkowitz Or Biran Jennifer Chu-Carroll 19 121 0 16 Sep 2020
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation Jian-Yu Guan Minlie Huang 21 69 0 16 Sep 2020
A Survey of Evaluation Metrics Used for NLG Systems Ananya B. Sai Akash Kumar Mohankumar Mitesh M. Khapra ELM 30 228 0 27 Aug 2020
Opinion-aware Answer Generation for Review-driven Question Answering in E-Commerce Yang Deng Wenxuan Zhang Wai Lam 16 31 0 27 Aug 2020
CoreGen: Contextualized Code Representation Learning for Commit Message Generation L. Nie Cuiyun Gao Zhicong Zhong Wai Lam Yang Liu Zenglin Xu 21 46 0 14 Jul 2020
Evaluation of Text Generation: A Survey Asli Celikyilmaz Elizabeth Clark Jianfeng Gao ELM LM&MA 19 376 0 26 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions Stephen Roller Y-Lan Boureau Jason Weston Antoine Bordes Emily Dinan ... Kurt Shuster Eric Michael Smith Arthur Szlam Jack Urbanek Mary Williamson LLMAG AI4CE 22 51 0 22 Jun 2020
Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols Sarah E. Finch Jinho D. Choi ELM 23 67 0 10 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges M. Eskénazi Tiancheng Zhao LLMAG AI4TS AI4CE 36 9 0 10 Jun 2020
Probing Neural Dialog Models for Conversational Understanding Abdelrhman Saleh Tovly Deutsch Stephen Casper Yonatan Belinkov Stuart M. Shieber 21 13 0 07 Jun 2020
Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation Weixin Liang James Zou Zhou Yu ELM 34 33 0 21 May 2020
SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling F. S. Bao Hebi Li Ge Luo Minghui Qiu Yinfei Yang Youbiao He Cen Chen 16 4 0 13 May 2020
Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation Zhiliang Tian Wei Bi Dongkyu Lee Lanqing Xue Yiping Song Xiaojiang Liu N. Zhang 27 25 0 13 May 2020
History for Visual Dialog: Do we really need it? Shubham Agarwal Trung Bui Joon-Young Lee Ioannis Konstas Verena Rieser VLM 13 69 0 08 May 2020
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization Esin Durmus He He Mona T. Diab HILM 6 384 0 07 May 2020
Learning an Unreferenced Metric for Online Dialogue Evaluation Koustuv Sinha Prasanna Parthasarathi Jasmine Wang Ryan J. Lowe William L. Hamilton Joelle Pineau OffRL 21 84 0 01 May 2020
KPQA: A Metric for Generative Question Answering Using Keyphrase Weights Hwanhee Lee Seunghyun Yoon Franck Dernoncourt Doo Soon Kim Trung Bui Joongbo Shin Kyomin Jung 16 0 0 01 May 2020
Question Rewriting for Conversational Question Answering Svitlana Vakulenko Shayne Longpre Zhucheng Tu R. Anantha 20 172 0 30 Apr 2020
Learning to Update Natural Language Comments Based on Code Changes Sheena Panthaplackel Pengyu Nie Miloš Gligorić Junyi Jessy Li Raymond J. Mooney 27 63 0 25 Apr 2020
Experience Grounds Language Yonatan Bisk Ari Holtzman Jesse Thomason Jacob Andreas Yoshua Bengio ... Angeliki Lazaridou Jonathan May Aleksandr Nisnevich Nicolas Pinto Joseph P. Turian 19 350 0 21 Apr 2020
BLEURT: Learning Robust Metrics for Text Generation Thibault Sellam Dipanjan Das Ankur P. Parikh 46 1,442 0 09 Apr 2020
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries Alex Wang Kyunghyun Cho M. Lewis HILM 10 470 0 08 Apr 2020
A Survey on Conversational Recommender Systems Dietmar Jannach A. Manzoor Wanling Cai Li Chen 13 403 0 01 Apr 2020
XPersona: Evaluating Multilingual Personalized Chatbot Zhaojiang Lin Zihan Liu Genta Indra Winata Samuel Cahyawijaya Andrea Madotto Yejin Bang Etsuko Ishii Pascale Fung 45 57 0 17 Mar 2020
Posterior-GAN: Towards Informative and Coherent Response Generation with Posterior Generative Adversarial Network Shaoxiong Feng Hongshen Chen Kan Li Dawei Yin GAN 49 25 0 04 Mar 2020
A Neural Topical Expansion Framework for Unstructured Persona-oriented Dialogue Generation Minghong Xu Piji Li Haoran Yang Pengjie Ren Z. Ren Zhumin Chen Jun Ma 18 31 0 06 Feb 2020
Towards a Human-like Open-Domain Chatbot Daniel De Freitas Minh-Thang Luong David R. So Jamie Hall Noah Fiedel ... Zi Yang Apoorv Kulshreshtha Gaurav Nemade Yifeng Lu Quoc V. Le 30 923 0 27 Jan 2020
Paraphrase Generation with Latent Bag of Words Yao Fu Yansong Feng John P. Cunningham BDL 25 91 0 07 Jan 2020
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity Huiyuan Xie Tom Sherborne A. Kuhnle Ann A. Copestake DiffM 19 9 0 19 Dec 2019
Plug and Play Language Models: A Simple Approach to Controlled Text Generation Sumanth Dathathri Andrea Madotto Janice Lan Jane Hung Eric Frank Piero Molino J. Yosinski Rosanne Liu KELM 26 937 0 04 Dec 2019
Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context Yichi Zhang Zhijian Ou Zhou Yu 19 182 0 24 Nov 2019
Social Bias Frames: Reasoning about Social and Power Implications of Language Maarten Sap Saadia Gabriel Lianhui Qin Dan Jurafsky Noah A. Smith Yejin Choi 28 483 0 10 Nov 2019
Automatic Reminiscence Therapy for Dementia Mariona Carós M. Garolera P. Radeva Xavier Giró-i-Nieto 21 40 0 25 Oct 2019
Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models Tianxing He Jun Liu Kyunghyun Cho Myle Ott Bing-Quan Liu James R. Glass Fuchun Peng CLL 29 9 0 16 Oct 2019
Learning from Fact-checkers: Analysis and Generation of Fact-checking Language Nguyen Vo Kyumin Lee 9 68 0 05 Oct 2019
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs Yi-Lin Tuan Yun-Nung (Vivian) Chen Hung-yi Lee 18 71 0 01 Oct 2019
Do Massively Pretrained Language Models Make Better Storytellers? A. See Aneesh S. Pappu Rohun Saxena Akhila Yerukola Christopher D. Manning 37 166 0 24 Sep 2019
Counterfactual Story Reasoning and Generation Lianhui Qin Antoine Bosselut Ari Holtzman Chandra Bhagavatula Elizabeth Clark Yejin Choi LRM 11 140 0 09 Sep 2019