1603.08023
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
25 March 2016
Chia-Wei Liu
Ryan J. Lowe
Iulian Serban
Michael Noseworthy
Laurent Charlin
Joelle Pineau
Papers citing
"How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation"
50 / 220 papers shown
Title
Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding
Yifeng Di
Tianyi Zhang
26
0
0
12 May 2025
BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation
Suvodip Dey
M. Desarkar
OffRL
41
0
0
20 Jan 2025
Measuring the Robustness of Reference-Free Dialogue Evaluation Systems
Justin Vasselli
Adam Nohejl
Taro Watanabe
AAML
49
0
0
12 Jan 2025
AutoSAM: Towards Automatic Sampling of User Behaviors for Sequential Recommender Systems
H. Zhang
Mingyue Cheng
Qi Liu
Ziqiang Liu
Junzhe Jiang
Enhong Chen
AI4TS
46
3
0
03 Jan 2025
LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts
Helia Hashemi
J. Eisner
Corby Rosset
Benjamin Van Durme
Chris Kedzie
68
1
0
03 Jan 2025
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
120
67
0
25 Nov 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
63
23
0
10 Sep 2024
ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues
John Mendonça
Isabel Trancoso
A. Lavie
34
3
0
16 Jul 2024
Leveraging LLMs for Dialogue Quality Measurement
Jinghan Jia
A. Komma
Timothy Leffel
Xujun Peng
Ajay Nagesh
Tamer Soliman
Aram Galstyan
Anoop Kumar
34
5
0
25 Jun 2024
Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation
Adam Fisch
Joshua Maynez
R. A. Hofer
Bhuwan Dhingra
Amir Globerson
William W. Cohen
41
8
0
06 Jun 2024
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
Varun Magesh
Faiz Surani
Matthew Dahl
Mirac Suzgun
Christopher D. Manning
Daniel E. Ho
HILM
ELM
AILaw
27
66
0
30 May 2024
Apollonion: Profile-centric Dialog Agent
Shangyu Chen
Zibo Zhao
Yuanyuan Zhao
Xiang Li
LLMAG
40
1
0
10 Apr 2024
A Survey of Personality, Persona, and Profile in Conversational Agents and Chatbots
Richard Sutcliffe
30
3
0
31 Dec 2023
Partially Randomizing Transformer Weights for Dialogue Response Diversity
Jing Yang Lee
Kong Aik Lee
Woon-Seng Gan
23
0
0
18 Nov 2023
Learning Personalized Alignment for Evaluating Open-ended Text Generation
Danqing Wang
Kevin Kaichuang Yang
Hanlin Zhu
Xiaomeng Yang
Andrew Cohen
Lei Li
Yuandong Tian
ALM
LM&MA
17
8
0
05 Oct 2023
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Qingyue Wang
Y. Fu
Yanan Cao
Zhiliang Tian
Shi Wang
Dacheng Tao
LLMAG
KELM
RALM
59
24
0
29 Aug 2023
Three Ways of Using Large Language Models to Evaluate Chat
Ondřej Plátek
Vojtěch Hudeček
Patrícia Schmidtová
Mateusz Lango
Ondřej Dušek
ALM
19
6
0
12 Aug 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
30
53
0
27 Jul 2023
Schema-Guided User Satisfaction Modeling for Task-Oriented Dialogues
Yue Feng
Yunlong Jiao
Animesh Prasad
Nikolaos Aletras
Emine Yilmaz
G. Kazai
22
5
0
26 May 2023
Psychological Metrics for Dialog System Evaluation
Salvatore Giorgi
Shreya Havaldar
Farhan S. Ahmed
Zuhaib Akhtar
Shalaka Vaidya
Gary Pan
Pallavi V. Kulkarni
H. A. Schwartz
Joao Sedoc
22
2
0
24 May 2023
Dialogue Games for Benchmarking Language Understanding: Motivation, Taxonomy, Strategy
David Schlangen
ELM
24
13
0
14 Apr 2023
CTRLStruct: Dialogue Structure Learning for Open-Domain Response Generation
Congchi Yin
Pijian Li
Z. Ren
31
11
0
02 Mar 2023
Improving Open-Domain Dialogue Evaluation with a Causal Inference Model
Cat P. Le
Luke Dai
Michael Johnston
Yang Liu
M. Walker
R. Ghanadan
ELM
19
10
0
31 Jan 2023
Improving a sequence-to-sequence nlp model using a reinforcement learning policy algorithm
Jabri Ismail
Aboulbichr Ahmed
El ouaazizi Aziza
16
2
0
28 Dec 2022
CausalDialogue: Modeling Utterance-level Causality in Conversations
Yi-Lin Tuan
Alon Albalak
Wenda Xu
Michael Stephen Saxon
Connor Pryor
Lise Getoor
William Yang Wang
CML
29
2
0
20 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
58
98
0
19 Dec 2022
PAL: Persona-Augmented Emotional Support Conversation Generation
Jiale Cheng
Sahand Sabour
Hao Sun
Zhuang Chen
Minlie Huang
19
27
0
19 Dec 2022
PVGRU: Generating Diverse and Relevant Dialogue Responses via Pseudo-Variational Mechanism
Yongkang Liu
Shi Feng
Daling Wang
Yifei Zhang
Hinrich Schütze
28
6
0
18 Dec 2022
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li
26
7
0
18 Dec 2022
A Survey on Natural Language Processing for Programming
Qingfu Zhu
Xianzhen Luo
Fang Liu
Cuiyun Gao
Wanxiang Che
23
1
0
12 Dec 2022
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey
Yuxin Wang
Jieru Lin
Zhiwei Yu
Wei Hu
Börje F. Karlsson
20
17
0
09 Dec 2022
Deep Fake Detection, Deterrence and Response: Challenges and Opportunities
Amin Azmoodeh
Ali Dehghantanha
42
2
0
26 Nov 2022
CDialog: A Multi-turn Covid-19 Conversation Dataset for Entity-Aware Dialog Generation
Deeksha Varshney
Aizan Zafar
Niranshu Kumar Behra
Asif Ekbal
21
6
0
16 Nov 2022
Multi-VQG: Generating Engaging Questions for Multiple Images
Min-Hsuan Yeh
Vincent Chen
Ting-Hao Huang
Lun-Wei Ku
CoGe
18
7
0
14 Nov 2022
Empathetic Dialogue Generation via Sensitive Emotion Recognition and Sensible Knowledge Selection
Lanrui Wang
JiangNan Li
Zheng Lin
Fandong Meng
Chenxu Yang
Weiping Wang
Jie Zhou
18
30
0
21 Oct 2022
Controllable Fake Document Infilling for Cyber Deception
Yibo Hu
Yu Lin
Eric Parolin
Latif Khan
Kevin W. Hamlen
32
8
0
18 Oct 2022
Dialogue Evaluation with Offline Reinforcement Learning
Nurul Lubis
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Michael Heck
Shutong Feng
Milica Gašić
OffRL
19
4
0
02 Sep 2022
Towards Boosting the Open-Domain Chatbot with Human Feedback
Hua Lu
Siqi Bao
H. He
Fan Wang
Hua-Hong Wu
Haifeng Wang
ALM
20
18
0
30 Aug 2022
Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation
Cyril Chhun
Pierre Colombo
Chloé Clavel
Fabian M. Suchanek
53
50
0
24 Aug 2022
SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation
Longxuan Ma
Ziyu Zhuang
Weinan Zhang
Mingda Li
Ting Liu
26
4
0
17 Aug 2022
Why is constrained neural language generation particularly challenging?
Cristina Garbacea
Qiaozhu Mei
59
14
0
11 Jun 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
19
50
0
01 Jun 2022
Commonsense and Named Entity Aware Knowledge Grounded Dialogue Generation
Deeksha Varshney
Akshara Prabhakar
Asif Ekbal
27
18
0
27 May 2022
A Question-Answer Driven Approach to Reveal Affirmative Interpretations from Verbal Negations
Md Mosharaf Hossain
L. Holman
Anusha Kakileti
T. Kao
N. Brito
A. Mathews
Eduardo Blanco
26
3
0
23 May 2022
Computational Storytelling and Emotions: A Survey
Yusuke Mori
Hiroaki Yamane
Yusuke Mukuta
Tatsuya Harada
35
2
0
23 May 2022
CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models
Bishal Santra
Ravi Ghadia
Manish Gupta
Pawan Goyal
OffRL
20
0
0
21 May 2022
Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation
Prakhar Gupta
Harsh Jhamtani
Jeffrey P. Bigham
46
12
0
19 May 2022
Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets
Philippe Laban
Chien-Sheng Wu
Wenhao Liu
Caiming Xiong
38
5
0
13 May 2022
Vector Representations of Idioms in Conversational Systems
Tosin P. Adewumi
F. Liwicki
Marcus Liwicki
30
8
0
07 May 2022
Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation
Yujie Xing
Jason (Jinglun) Cai
Nils Barlaug
Peng Liu
J. Gulla
29
4
0
05 May 2022