
VideoGAN-based Trajectory Proposal for Automated Vehicles

Main: 10 pages, 8 figures, Bibliography: 5 pages
Abstract

Being able to generate realistic trajectory options is at the core of increasing the degree of automation of road vehicles. While model-driven, rule-based, and classical learning-based methods are widely used to tackle these tasks at present, they can struggle to effectively capture the complex, multimodal distributions of future trajectories. In this paper, we investigate whether a generative adversarial network (GAN) trained on videos of bird's-eye view (BEV) traffic scenarios can generate statistically accurate trajectories that correctly capture spatial relationships between agents. To this end, we propose a pipeline that uses low-resolution BEV occupancy grid videos as training data for a video generative model. From the generated videos of traffic scenarios, we extract abstract trajectory data using single-frame object detection and frame-to-frame object matching. We choose a GAN architecture specifically for its fast training and inference times compared to diffusion models. We obtain our best results within 100 GPU hours of training, with inference times under 20 ms. We demonstrate the physical realism of the proposed trajectories in terms of distribution alignment of spatial and dynamic parameters with respect to the ground truth videos from the Waymo Open Motion Dataset.
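The abstract's trajectory-extraction step (per-frame object detection on generated occupancy grids, followed by frame-to-frame matching) could look roughly like the sketch below. This is a minimal illustration under assumed details, not the authors' implementation: the connected-component detector, the Hungarian matching of centroids, and the distance gate are all illustrative choices.

# Sketch: extract trajectories from a generated BEV occupancy-grid video
# by detecting objects in each frame and linking detections across frames.
# Thresholds, helper names, and the matching strategy are assumptions.

import numpy as np
from scipy import ndimage
from scipy.optimize import linear_sum_assignment


def detect_centroids(frame: np.ndarray, thresh: float = 0.5) -> np.ndarray:
    """Detect objects in one occupancy-grid frame as connected components."""
    labels, n = ndimage.label(frame > thresh)
    if n == 0:
        return np.empty((0, 2))
    # Centroid (row, col) of each connected component in grid coordinates.
    return np.array(ndimage.center_of_mass(frame, labels, range(1, n + 1)))


def link_frames(video: np.ndarray, max_dist: float = 5.0) -> list[list[tuple]]:
    """Link per-frame detections into trajectories via frame-to-frame matching."""
    tracks: list[list[tuple]] = []
    prev_ids: list[int] = []          # track index of each detection in the previous frame
    prev_pts = np.empty((0, 2))

    for t, frame in enumerate(video):
        pts = detect_centroids(frame)
        ids = [-1] * len(pts)

        if len(prev_pts) and len(pts):
            # Pairwise distances between previous and current centroids.
            cost = np.linalg.norm(prev_pts[:, None] - pts[None, :], axis=-1)
            rows, cols = linear_sum_assignment(cost)
            for r, c in zip(rows, cols):
                if cost[r, c] <= max_dist:     # gate implausible jumps
                    ids[c] = prev_ids[r]
                    tracks[ids[c]].append((t, *pts[c]))

        # Unmatched detections start new trajectories.
        for i, track_id in enumerate(ids):
            if track_id == -1:
                ids[i] = len(tracks)
                tracks.append([(t, *pts[i])])

        prev_pts, prev_ids = pts, ids

    return tracks

Each returned track is a list of (frame index, row, col) points, which can then be converted to metric positions and finite-difference velocities for the distribution comparisons mentioned in the abstract.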

@article{mariani2025_2506.16209,
  title={VideoGAN-based Trajectory Proposal for Automated Vehicles},
  author={Annajoyce Mariani and Kira Maag and Hanno Gottschalk},
  journal={arXiv preprint arXiv:2506.16209},
  year={2025}
}