Density-based Object Detection: Learning Bounding Boxes without Ground Truth Assignment

IEEE International Conference on Computer Vision (ICCV), 2019

28 November 2019

Abstract

In multi-object detection with neural networks, most methods train the network using ground truth assignment, which makes training heuristic and complicated. In this paper, we reformulate multi-object detection as a problem of density estimation over bounding boxes. Instead of relying on ground truth assignment, we train a network to estimate the probability density of bounding boxes in an input image using a mixture model. For this purpose, we propose a novel object detection network called Mixture Density Object Detector (MDOD), together with the corresponding objective function for density-estimation-based training. Unlike ground-truth-assignment-based methods, our method removes the cumbersome matching between ground truth boxes and their predictions, as well as heuristic anchor design. It is also free from the problem of foreground-background imbalance. We applied MDOD to the MS COCO dataset. Our method not only approaches multi-object detection in a new way, but also improves detection performance. Code will be available.
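
To make the idea of density-estimation-based training concrete, the following is a minimal sketch of a mixture negative log-likelihood over bounding boxes. It is not the MDOD objective itself: the abstract does not specify the component distribution, so diagonal Gaussian components are assumed here, and the names `mixture_nll`, `pi`, `mu`, and `sigma` are hypothetical. In a detector, the mixture parameters would be produced by the network for each image; here they are plain arrays.

```python
import numpy as np


def mixture_nll(boxes, pi, mu, sigma):
    """Mean negative log-likelihood of boxes under a diagonal Gaussian mixture.

    Assumed shapes (illustrative, not taken from the paper):
      boxes: (N, 4) ground-truth boxes, e.g. (x1, y1, x2, y2)
      pi:    (K,)   mixture weights, positive and summing to 1
      mu:    (K, 4) component means
      sigma: (K, 4) per-coordinate standard deviations, positive
    """
    boxes = np.asarray(boxes, dtype=float)
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Standardized residual of every box against every component: (N, K, 4).
    z = (boxes[:, None, :] - mu[None, :, :]) / sigma[None, :, :]

    # Per-coordinate Gaussian log-density, summed over the 4 coordinates: (N, K).
    log_comp = (-0.5 * z**2 - np.log(sigma)[None, :, :]
                - 0.5 * np.log(2.0 * np.pi)).sum(axis=2)

    # Weight each component, then combine with a stable log-sum-exp: (N,).
    log_mix = np.log(pi)[None, :] + log_comp
    m = log_mix.max(axis=1, keepdims=True)
    log_p = m[:, 0] + np.log(np.exp(log_mix - m).sum(axis=1))

    # Every box is scored against the whole mixture; no box-to-prediction
    # matching step is involved.
    return -log_p.mean()
```

Each ground-truth box contributes to the loss through all mixture components at once, which is the sense in which no assignment between ground truth boxes and predictions is needed in this sketch.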
