Computer Vision News - October 2019

Research

Every month, Computer Vision News reviews a research paper from our field. This month we have chosen GSLAM: Initialization-robust Monocular Visual SLAM via Global Structure-from-Motion. We are indebted to the authors (Chengzhou Tang, Oliver Wang and Ping Tan) for allowing us to use their images.

by Amnon Geifman

Simultaneous Localization and Mapping (SLAM) is one of the hottest topics in computer vision today. The goal of this task is to jointly solve for both the camera motion and the 3D points of an unknown environment from a single input video. State-of-the-art methods usually divide the solution into three threads: tracking (extracting features to find correspondences), local mapping (the SfM pipeline that reconstructs the camera positions, orientations and the 3D points of the scene), and loop closing (refining the solution when the camera returns to a previously visited position). Most of the visual odometry methods used in SLAM exploit incremental SfM to initialize the 3D map and the camera positions. The paper that we are reviewing this month shows how to incorporate a global SfM into a novel pipeline to solve the local mapping problem.

The monocular SLAM system in GSLAM is composed of three components: feature tracking, a visual odometry pipeline, and pose graph optimization. Similarly to ORB-SLAM, the tracking part uses ORB features, a feature extraction method that is much faster than classic methods (e.g., SIFT, HOG, SURF). In turn, this makes it possible to solve the matching problem in real time without compromising on accuracy. To avoid processing every frame in the video, a selection of keyframes is needed. The authors suggest a new approach to keyframe selection, which aggregates a window of keyframes (see figure below).
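
As a side note on the tracking stage described above: one reason ORB matching is cheap is that ORB descriptors are binary strings (256 bits by default), so two descriptors can be compared with a Hamming distance, that is, an XOR followed by a bit count. The snippet below is a minimal sketch of that idea only. It is not the GSLAM or ORB-SLAM implementation: descriptors are simulated as random 256-bit integers rather than extracted from images, and the helper names (`hamming`, `nearest`, `match_descriptors`, `flip_bits`) and the `max_distance` threshold are assumptions introduced for illustration.

```python
import random

DESCRIPTOR_BITS = 256  # default ORB descriptor length in bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def nearest(query: int, candidates: list) -> tuple:
    """Return (index, distance) of the candidate closest to `query`."""
    best = min(range(len(candidates)), key=lambda i: hamming(query, candidates[i]))
    return best, hamming(query, candidates[best])


def match_descriptors(desc_a: list, desc_b: list, max_distance: int = 64) -> list:
    """Brute-force matching with a mutual-nearest-neighbour (cross-check) test.

    Returns a list of (index_in_a, index_in_b, distance) tuples.
    `max_distance` is an illustrative threshold, not a value from the paper.
    """
    matches = []
    if not desc_a or not desc_b:
        return matches
    for i, descriptor in enumerate(desc_a):
        j, dist = nearest(descriptor, desc_b)
        back, _ = nearest(desc_b[j], desc_a)
        # Keep the pair only if each descriptor is the other's best match.
        if back == i and dist <= max_distance:
            matches.append((i, j, dist))
    return matches


def flip_bits(desc: int, n: int, rng: random.Random) -> int:
    """Simulate descriptor noise by flipping `n` distinct random bits."""
    for bit in rng.sample(range(DESCRIPTOR_BITS), n):
        desc ^= 1 << bit
    return desc


rng = random.Random(0)
# Simulated descriptors for five features seen in frame A.
frame_a = [rng.getrandbits(DESCRIPTOR_BITS) for _ in range(5)]
# Frame B sees the same five features in reverse order, each with 10 noisy bits.
frame_b = [flip_bits(d, 10, rng) for d in reversed(frame_a)]

matches = match_descriptors(frame_a, frame_b)
for i, j, dist in matches:
    print(f"feature {i} in A <-> feature {j} in B (Hamming distance {dist})")
```

In a real system the descriptors would come from an ORB extractor and the brute-force search would typically be restricted to a neighbourhood of the predicted feature location, but the distance computation itself stays this simple.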