Relative Camera Pose Estimation Using Convolutional Neural Networks

Advanced Concepts for Intelligent Vision Systems Conference (ACIVS), 2017

5 February 2017

Abstract

We present a method for estimating the relative camera pose between a pair of images. The goal is to accurately estimate the relative orientation, represented by a rotation matrix and a translation vector, of two cameras capturing the same scene. Our approach is based on convolutional neural networks and directly estimates camera motion between two RGB images by solving a regression problem. The proposed network is trained end-to-end, using transfer learning from large-scale classification data. The method is compared to a classical local-feature-based pipeline (SURF, ORB) for relative pose estimation, and we demonstrate cases where our deep model significantly outperforms the traditional approach. Finally, we experiment with a Spatial Pyramid Pooling (SPP) layer, which can produce a fixed-size representation regardless of the size of the input image. The results confirm that SPP further improves the performance of the proposed approach.
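
To illustrate the fixed-size property of SPP mentioned above, the following is a minimal NumPy sketch, not the network layer used in the paper. The function name spatial_pyramid_pool, the pyramid levels (1, 2, 4), and the use of max pooling are assumptions made for illustration; the abstract does not state the configuration.

```python
import math
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over an n x n grid for each n in levels.

    Returns a 1-D vector of length C * sum(n * n for n in levels),
    independent of H and W. The levels are an illustrative assumption.
    """
    c, h, w = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor/ceil bin edges cover the whole map; max() keeps bins non-empty
            r0 = (i * h) // n
            r1 = max(math.ceil((i + 1) * h / n), r0 + 1)
            for j in range(n):
                c0 = (j * w) // n
                c1 = max(math.ceil((j + 1) * w / n), c0 + 1)
                # One C-dimensional vector per spatial bin
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)

# Two feature maps with different spatial sizes give the same output length
small = spatial_pyramid_pool(np.random.rand(8, 13, 17))
large = spatial_pyramid_pool(np.random.rand(8, 32, 20))
print(small.shape, large.shape)
```

With 8 channels and levels (1, 2, 4), both outputs have 8 * (1 + 4 + 16) = 168 entries, so a fully connected regression head of fixed input size can follow regardless of image resolution.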
