Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models

27 February 2025

Abstract

Recent advancements in multi-turn voice interaction models have improved user-model communication. However, while closed-source models effectively retain and recall past utterances, whether open-source models share this ability remains unexplored. To fill this gap, we systematically evaluate how well open-source interaction models utilize past utterances using ContextDialog, a benchmark we propose for this purpose. Our findings show that speech-based models have more difficulty than text-based ones, especially when recalling information conveyed in speech. Even with retrieval-augmented generation, models still struggle with questions about past utterances. These insights highlight key limitations in open-source models and suggest ways to improve memory retention and retrieval robustness.

@article{kim2025_2502.19759,
  title={Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models},
  author={Heeseung Kim and Che Hyun Lee and Sangkwon Park and Jiheum Yeom and Nohil Park and Sangwon Yu and Sungroh Yoon},
  journal={arXiv preprint arXiv:2502.19759},
  year={2025}
}
