
"Hiding in Plain Sight": Designing Synthetic Dialog Generation for Uncovering Socially Situated Norms

1 October 2024

Chengfei Wu

Dan Goldwasser


Main: 2 Pages

6 Figures

Bibliography: 3 Pages

6 Tables

Appendix: 9 Pages

Abstract

Naturally situated conversations encapsulate the social norms inherent to their context, reflecting both the relationships between interlocutors and the underlying communicative intent. In this paper, we propose a novel, multi-step framework for generating dialogues that automatically uncovers social norms from rich, context-laden interactions through a process of self-assessment and norm discovery, rather than relying on predefined norm labels. Leveraging this framework, we construct NormHint, a comprehensive synthetic dialogue dataset spanning a wide range of interlocutor attributes (e.g., age, profession, personality), relationship types, conversation topics, and conversational trajectories. NormHint is meticulously annotated with turn-level norm violation information, detailed participant descriptions, and remediation suggestions, including alternative trajectories achieved through early intervention. Human validation and automated analysis demonstrate that our dataset captures diverse conversational topics with high naturalness and realism. Moreover, we find that fine-tuning a model on our norm violation data significantly enhances its ability to detect and understand potential norm violations in conversations.
