Diversifying Human Pose in Synthetic Data for Aerial-view Human Detection

24 May 2024

Yingzhe Shen

Hyungtae Lee

Heesung Kwon

Shuvra S. Bhattacharyya

Main: 5 Pages

8 Figures

Bibliography: 1 Page

1 Table

Abstract

Synthetic data generation has emerged as a promising solution to the data scarcity issue in aerial-view human detection. However, creating datasets that accurately reflect varying real-world human appearances, particularly diverse poses, remains challenging and labor-intensive. To address this, we propose SynPoseDiv, a novel framework that diversifies human poses within existing synthetic datasets. SynPoseDiv tackles two key challenges: generating realistic, diverse 3D human poses using a diffusion-based pose generator, and producing images of virtual characters in novel poses through a source-to-target image translator. The framework incrementally transitions characters into new poses using optimized pose sequences identified via Dijkstra's algorithm. Experiments demonstrate that SynPoseDiv significantly improves detection accuracy across multiple aerial-view human detection benchmarks, especially in low-shot scenarios, and remains effective regardless of the training approach or dataset size.
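
The abstract states only that pose sequences are identified via Dijkstra's algorithm; it does not say how the underlying graph is built. As a minimal sketch, assume each candidate pose is a graph node, an edge joins two poses whose Euclidean distance is at most a threshold (so each step is a small, incremental change), and the edge weight is that distance. The names `pose_distance`, `build_graph`, `shortest_pose_sequence`, and `max_step` are hypothetical and are not taken from the paper.

```python
import heapq
import math


def pose_distance(a, b):
    """Euclidean distance between two poses given as flat coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_graph(poses, max_step):
    """Link poses that are close enough to count as one incremental step.

    max_step is an assumed threshold, not a parameter from the paper.
    """
    graph = {i: [] for i in range(len(poses))}
    for i in range(len(poses)):
        for j in range(i + 1, len(poses)):
            d = pose_distance(poses[i], poses[j])
            if d <= max_step:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_pose_sequence(graph, source, target):
    """Dijkstra's algorithm: lowest-cost chain of poses from source to target.

    Returns a list of pose indices, or None if the target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == target:
            break
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    # Walk predecessor links back from the target to recover the sequence.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy one-dimensional "poses"; real poses would be 3D joint coordinates.
    poses = [[0.0], [1.0], [2.0], [3.0]]
    graph = build_graph(poses, max_step=1.5)
    print(shortest_pose_sequence(graph, 0, 3))
```

In this toy setup the threshold rules out a direct jump from the first pose to the last, so the returned sequence passes through the intermediate poses. Under the stated assumptions, that mirrors the idea of moving a character into a new pose in small steps rather than in one large translation.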
