Abstract
Monocular visual odometry is a core component of visual Simultaneous Localization and Mapping (SLAM). Headsets with a forward-facing camera are now common across a wide range of use cases, such as extreme sports, firefighting, and military operations. Many of these headsets lack additional sensors such as a stereo camera or an IMU, so evaluating the accuracy and robustness of monocular odometry on such devices remains critical. In this paper, we develop a novel framework for procedural synthetic dataset generation together with a dedicated motion model for headset-mounted cameras. With our method, we study the performance of the leading classes of monocular visual odometry algorithms, namely feature-based, direct, and deep learning-based methods. Our experiments lead to the following conclusions: i) the performance deterioration on headset-mounted camera images is caused mostly by head rotations rather than by the translational motion induced by human gait, ii) feature-based methods are more robust to fast head rotations than direct and deep learning-based methods, and iii) it is crucial to develop uncertainty metrics for deep learning-based odometry algorithms.
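The abstract does not detail the proposed motion model. Purely as an illustrative sketch of what a headset-mounted camera motion model could look like, the Python snippet below composes a forward walking translation (with gait-induced vertical bob and lateral sway) with smooth yaw/pitch head rotations. The function name `head_camera_trajectory` and all parameter values are hypothetical assumptions, not taken from the paper.

```python
# Hypothetical sketch of a headset-mounted camera motion model:
# gait-induced translation composed with head rotations.
# All parameter values below are illustrative assumptions.
import numpy as np

def head_camera_trajectory(duration_s=10.0, rate_hz=30.0,
                           walk_speed=1.4,      # m/s, assumed walking speed
                           step_freq=2.0,       # Hz, assumed step frequency
                           bob_amp=0.02,        # m, vertical head bob
                           sway_amp=0.015,      # m, lateral head sway
                           yaw_amp=np.deg2rad(30.0),    # rad, head yaw range
                           pitch_amp=np.deg2rad(10.0)): # rad, head pitch range
    """Return (positions Nx3, yaw/pitch angles Nx2) for a walking headset wearer."""
    t = np.arange(0.0, duration_s, 1.0 / rate_hz)
    # Translation: steady forward motion plus periodic gait oscillations.
    x = walk_speed * t
    y = sway_amp * np.sin(2 * np.pi * (step_freq / 2) * t)  # sway at stride rate
    z = bob_amp * np.sin(2 * np.pi * step_freq * t)         # bob at step rate
    positions = np.stack([x, y, z], axis=1)
    # Rotation: smooth head scanning (yaw) and nodding (pitch), modeled
    # here as low-frequency sinusoids for simplicity.
    yaw = yaw_amp * np.sin(2 * np.pi * 0.2 * t)
    pitch = pitch_amp * np.sin(2 * np.pi * 0.3 * t)
    angles = np.stack([yaw, pitch], axis=1)
    return positions, angles

positions, angles = head_camera_trajectory()
print(positions.shape, angles.shape)  # (300, 3) (300, 2)
```

Decoupling the rotational component from the gait-induced translation in this way is what enables the kind of ablation the paper reports in conclusion i), where rotation and translation effects are assessed separately.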
| | |
|---|---|
| Original language | English |
| Pages (from-to) | 836-843 |
| Number of pages | 8 |
| Journal | Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications |
| Volume | 5 |
| DOIs | |
| Publication status | Published - 2022 |
| Event | 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022), Virtual/Online, 6-8 Feb 2022 |
Keywords
- Head Motion
- Monocular Visual Odometry
- Synthetic Data