Integrating robots into everyday scenarios such as tutoring or physical training requires robots capable of adaptive, socially engaging, and goal-oriented interaction. While Large Language Models (LLMs) show promise for human-like communication, their standalone use is hindered by memory constraints and contextual incoherence. This work presents a multimodal, cognitively inspired framework that enhances LLM-based autonomous decision-making in social and task-oriented Human-Robot Interaction (HRI). Specifically, we develop an LLM-based agent for a robot trainer that balances social conversation with task guidance and goal-driven motivation. To further enhance autonomy and personalization, we introduce a memory system for selecting, storing, and retrieving experiences, enabling generalized reasoning over knowledge built across different interactions. A preliminary HRI user study and offline experiments on a synthetic dataset validate our approach, demonstrating that the system can manage complex interactions, autonomously drive training tasks, and build and retrieve contextual memories, advancing socially intelligent robotics.
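To make the select/store/retrieve memory cycle concrete, here is a minimal illustrative sketch (not the paper's implementation): selection is a simple informativeness threshold, and retrieval ranks past experience summaries by lexical (Jaccard) overlap with the query. All class and method names are hypothetical; a real system would use LLM-generated summaries and embedding similarity instead.

```python
# Hypothetical sketch of an experience memory: select, store, retrieve.
class ExperienceMemory:
    def __init__(self, min_tokens=3):
        self.min_tokens = min_tokens  # selection threshold: drop trivial notes
        self.entries = []             # list of (summary, token set) pairs

    def store(self, summary):
        """Select: keep only summaries with enough content, then store them."""
        tokens = set(summary.lower().split())
        if len(tokens) < self.min_tokens:
            return False
        self.entries.append((summary, tokens))
        return True

    def retrieve(self, query, k=1):
        """Retrieve the k past experiences most similar to the query."""
        q = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(q & e[1]) / len(q | e[1]),  # Jaccard overlap
            reverse=True,
        )
        return [summary for summary, _ in scored[:k]]


mem = ExperienceMemory()
mem.store("user preferred short strength exercises in the morning")
mem.store("user enjoyed casual chat about football")
print(mem.retrieve("which exercises does the user prefer"))
```

Swapping the token-overlap score for cosine similarity over sentence embeddings keeps the same interface while allowing semantic rather than purely lexical matches.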
@article{garello2025_2504.01588,
  title   = {Building Knowledge from Interactions: An LLM-Based Architecture for Adaptive Tutoring and Social Reasoning},
  author  = {Luca Garello and Giulia Belgiovine and Gabriele Russo and Francesco Rea and Alessandra Sciutti},
  journal = {arXiv preprint arXiv:2504.01588},
  year    = {2025}
}