CVPR Daily - Tuesday

Highlight

SPARF - NeRF from Sparse and Noisy Poses

Prune Truong is a fourth-year PhD student in the Computer Vision Lab at ETH Zurich and an intern in Federico Tombari’s team at Google Zurich, where Fabian Manhardt is a Research Scientist. Prune and Fabian speak to us about their paper proposing a new joint pose-NeRF training strategy designed to be more robust in the real world, which has been accepted as a highlight at CVPR 2023. They speak to us ahead of their highlight presentation today.

NeRF (Neural Radiance Fields) is a cutting-edge technology with remarkable potential in generating 3D reconstructions and rendering novel views. NeRF has been shown to work best under two conditions: dense coverage of the 3D space and highly accurate camera poses. These requirements limit its application in the real world, where input views are often sparse and poses are noisy.

“When you train a NeRF model on only a few images, it will instantly overfit,” Prune explains. “You’ll have very nice training renderings, RGB will be good, and the photometric loss will be low, but when you look at the depth renderings, you’ll see that the model doesn’t learn any meaningful geometry. Therefore, it will be really bad if you try to render a novel view.”

The standard process to estimate per-scene poses is to use a Structure-from-Motion approach, such as COLMAP, which works well with many input views. However, with fewer views or a wider baseline between the images, pose estimation becomes much more challenging and its results degrade.