Title: Geometric loss functions for camera pose regression with deep learning
Authors: Alex Kendall and Roberto Cipolla



Deep learning has been shown to be effective for robust and real-time monocular image relocalisation. In particular, PoseNet [22] is a deep convolutional neural network which learns to regress the 6-DOF camera pose from a single image. It learns to localize using high-level features and is robust to difficult lighting, motion blur and unknown camera intrinsics, where point-based SIFT registration fails. However, it was trained using a naive loss function, with hyper-parameters which require expensive tuning. In this paper, we give the problem a more fundamental theoretical treatment. We explore a number of novel loss functions for learning camera pose which are based on geometry and scene reprojection error. Additionally, we show how to automatically learn an optimal weighting to simultaneously regress position and orientation. By leveraging geometry, we demonstrate that our technique significantly improves PoseNet's performance across datasets ranging from indoor rooms to a small city.
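To make the contrast between a hand-tuned and a learned weighting concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the form of either loss, so everything here is an assumption: the fixed-weight loss is taken to be a position error plus a scalar `beta` times an orientation error, and the learned weighting is taken to use two trainable scalars `s_x` and `s_q` that scale each error by `exp(-s)` and add `s` as a regulariser. All function and variable names are hypothetical.

```python
import numpy as np


def fixed_weight_loss(pos_err, rot_err, beta):
    """Assumed naive form: position error plus a hand-tuned multiple of orientation error."""
    return pos_err + beta * rot_err


def learned_weight_loss(pos_err, rot_err, s_x, s_q):
    """Assumed learned form: each error is scaled by exp(-s) and s is added as a regulariser.

    s_x and s_q are free scalars that would be trained jointly with the network.
    """
    return pos_err * np.exp(-s_x) + s_x + rot_err * np.exp(-s_q) + s_q


def grad_s(err, s):
    """Derivative of err * exp(-s) + s with respect to s."""
    return 1.0 - err * np.exp(-s)


# Toy illustration: hold the two errors fixed and optimise only the weights
# by plain gradient descent. The values below are made up for the example.
pos_err, rot_err = 2.0, 0.05
s_x, s_q = 0.0, 0.0
for _ in range(2000):
    s_x -= 0.05 * grad_s(pos_err, s_x)
    s_q -= 0.05 * grad_s(rot_err, s_q)

print("learned s_x:", s_x, "learned s_q:", s_q)
```

In this toy setting each weight settles where `exp(s)` equals the size of its error term, so the two terms are balanced without a grid search over `beta`. In real training the errors change at every step and the weights would be updated together with the network parameters.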