88

A Style-Based Metric for Quantifying the Synthetic-to-Real Gap in Autonomous Driving Image Datasets

Main:5 Pages
4 Figures
Bibliography:2 Pages
Abstract

Ensuring the reliability of autonomous driving perception systems requires extensive environment-based testing, yet real-world execution is often impractical. Synthetic datasets have therefore emerged as a promising alternative, offering advantages such as cost-effectiveness, bias free labeling, and controllable scenarios. However, the domain gap between synthetic and real-world datasets remains a critical bottleneck for the generalization of AI-based autonomous driving models. Quantifying this synthetic-to-real gap is thus essential for evaluating dataset utility and guiding the design of more effective training pipelines. In this paper, we establish a systematic framework for quantifying the synthetic-to-real gap in autonomous driving systems, and propose Style Embedding Distribution Discrepancy (SEDD) as a novel evaluation metric. Our framework combines Gram matrix-based style extraction with metric learning optimized for intra-class compactness and inter-class separation to extract style embeddings. Furthermore, we establish a benchmark using publicly available datasets. Experiments are conducted on a variety of datasets and sim-to-real methods, and the results show that our method is capable of quantifying the synthetic-to-real gap. This work provides a standardized quality control tool that enables systematic diagnosis and targeted enhancement of synthetic datasets, advancing future development of data-driven autonomous driving systems.

View on arXiv
Comments on this paper