We introduce Urania, a novel framework for generating insights about LLM chatbot interactions with rigorous differential privacy (DP) guarantees. The framework employs a private clustering mechanism and innovative keyword extraction methods, including frequency-based, TF-IDF-based, and LLM-guided approaches. By leveraging DP tools such as clustering, partition selection, and histogram-based summarization, Urania provides end-to-end privacy protection. Our evaluation assesses lexical and semantic content preservation, pair similarity, and LLM-based metrics, benchmarking against a non-private Clio-inspired pipeline (Tamkin et al., 2024). Moreover, we develop a simple empirical privacy evaluation that demonstrates the enhanced robustness of our DP pipeline. The results show the framework's ability to extract meaningful conversational insights while maintaining stringent user privacy, effectively balancing data utility with privacy preservation.
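To make the partition selection and histogram-based summarization steps concrete, the following is a minimal sketch of a frequency-based keyword histogram released under (ε, δ)-DP. It is an illustration under stated assumptions, not the mechanism used in the paper: the function name `dp_keyword_histogram`, its parameters, the per-user contribution cap, and the choice of Laplace noise with a threshold are all hypothetical choices made here. The privacy unit is assumed to be one user, and each user is assumed to contribute at most `max_per_user` distinct keywords.

```python
import math
import random
from collections import Counter


def dp_keyword_histogram(user_keywords, epsilon, delta, max_per_user, rng=None):
    """Release noisy keyword counts with Laplace noise plus thresholding.

    Hypothetical sketch: one list of keywords per user, user-level privacy.
    Floating-point Laplace sampling is not hardened against precision
    attacks, so this is not suitable for production use.
    """
    rng = rng or random.Random()

    # Each user adds 1 to at most max_per_user counts, so the L1
    # sensitivity of the histogram is max_per_user.
    scale = max_per_user / epsilon

    # Threshold chosen so that a keyword held by a single user is released
    # with probability at most delta / max_per_user (union bound over the
    # keywords that one user can touch).
    threshold = 1.0 + scale * math.log(max_per_user / (2.0 * delta))

    counts = Counter()
    for keywords in user_keywords:
        distinct = sorted(set(keywords))
        if len(distinct) > max_per_user:
            # Bound each user's contribution by subsampling.
            distinct = rng.sample(distinct, max_per_user)
        counts.update(distinct)

    released = {}
    for keyword in sorted(counts):
        # The difference of two i.i.d. exponentials is Laplace(scale).
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy_count = counts[keyword] + noise
        if noisy_count > threshold:
            released[keyword] = noisy_count
    return released
```

Under these assumptions, a keyword shared by many users survives the threshold with a count close to its true value, while a keyword that appears in only one user's conversations is suppressed with high probability. Thresholding on the noisy count is what lets the set of released keywords depend on the data without exposing rare, potentially identifying terms.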
@article{liu2025_2506.04681,
  title={Urania: Differentially Private Insights into AI Use},
  author={Daogao Liu and Edith Cohen and Badih Ghazi and Peter Kairouz and Pritish Kamath and Alexander Knop and Ravi Kumar and Pasin Manurangsi and Adam Sealfon and Da Yu and Chiyuan Zhang},
  journal={arXiv preprint arXiv:2506.04681},
  year={2025}
}