Dialogs Re-enacted Across Languages

18 November 2022

Abstract

To support machine learning of cross-language prosodic mappings and other ways to improve speech-to-speech translation, we present a protocol for collecting closely matched pairs of utterances across languages, a description of the resulting data collection and its public release, and some observations and musings. This report is intended for people using this corpus, people extending this corpus, and people designing similar collections of bilingual dialog data.
