
DialSim: A Real-Time Simulator for Evaluating Long-Term Multi-Party Dialogue Understanding of Conversation Systems

Abstract

Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of conversation systems, making them applicable to various fields (e.g., education). Despite their progress, the evaluation of the systems often overlooks the complexities of real-world conversations, such as real-time interactions, multi-party dialogues, and extended contextual dependencies. To bridge this gap, we introduce DialSim, a real-time dialogue simulator. In this simulator, a conversation system is assigned the role of a character from popular TV shows, requiring it to respond to spontaneous questions using past dialogue information and to distinguish between known and unknown information. Key features of DialSim include assessing the system's ability to respond within a reasonable time limit, handling long-term multi-party dialogues, and evaluating performance under randomized questioning with LongDialQA, a novel, high-quality question-answering dataset. Our experiments using DialSim reveal the strengths and weaknesses of the latest conversation systems, offering valuable insights for future advancements in conversational AI. DialSim is available at this https URL.

@article{kim2025_2406.13144,
  title={DialSim: A Real-Time Simulator for Evaluating Long-Term Multi-Party Dialogue Understanding of Conversation Systems},
  author={Jiho Kim and Woosog Chay and Hyeonji Hwang and Daeun Kyung and Hyunseung Chung and Eunbyeol Cho and Yohan Jo and Edward Choi},
  journal={arXiv preprint arXiv:2406.13144},
  year={2025}
}