ECCV 2020 Daily

Oral Presentation

CoReNet: Coherent 3D Scene Reconstruction from a Single RGB Image

Stefan Popov works as a software engineer at Google Research in Zürich. His manager is Vittorio Ferrari, a General co-Chair at this year’s ECCV. He speaks to us ahead of his oral today (Monday), in which he presents his paper on single-image 3D scene reconstruction.

This work is about reconstructing 3D scenes from a single RGB image without any extra input. It supports scenes with multiple objects that occlude each other. The network resolves occlusions by hallucinating the missing parts.

The model will be useful for image editing, such as removing an object from an image or adding another one. Also, precise full-scene 3D reconstruction from a single image can enable structured queries in images. For example, ‘Find all images that have three cups in a row on a table, with a chair behind the table.’ This sort of question can be difficult to answer with only 2D scene understanding.

Stefan tells us the biggest challenge they faced was finding a dataset to use, as there are not many datasets featuring multiple objects with realistic images. The one dataset they did find, SUNCG, was banned due to legal issues while they were using it. In the end, they had to create their own.

What computer vision techniques did they use?

“We have an architecture which uses a standard network ResNet-50 to analyse what is on the image,” Stefan explains. “Then we built our custom decoder out of this, which starts from the analysis of the image and then expands it with transposed 3D convolutions. This is the base of the network.”

Building on previous work that reconstructed scenes with a single
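To give a feel for how a decoder can "expand" a compact image analysis with transposed 3D convolutions, here is a minimal pure-Python sketch. It is not the CoReNet decoder: the real network is learned and multi-channel, and its layer sizes are not given in this article. The function name `transposed_conv3d`, the all-ones 2×2×2 kernel, the stride of 2, and the 1×1×1 starting grid are all assumptions chosen purely for illustration.

```python
def transposed_conv3d(volume, kernel, stride):
    """Naive single-channel transposed 3D convolution, no padding.

    Illustrative only: each input voxel is multiplied by the cubic kernel
    and scattered into the output at a stride-spaced offset.
    """
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    k = len(kernel)  # kernel is assumed cubic: k x k x k
    # Output extent per axis: (input - 1) * stride + kernel
    out_d = (d - 1) * stride + k
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(out_d)]
    for z in range(d):
        for y in range(h):
            for x in range(w):
                v = volume[z][y][x]
                for kz in range(k):
                    for ky in range(k):
                        for kx in range(k):
                            oz, oy, ox = z * stride + kz, y * stride + ky, x * stride + kx
                            out[oz][oy][ox] += v * kernel[kz][ky][kx]
    return out


def shape(volume):
    """Return the (depth, height, width) of a nested-list volume."""
    return (len(volume), len(volume[0]), len(volume[0][0]))


# Hypothetical setup: a single "analysis" value expanded three times.
ones_kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
grid = [[[1.0]]]
sizes = [shape(grid)]
for _ in range(3):
    grid = transposed_conv3d(grid, ones_kernel, stride=2)
    sizes.append(shape(grid))

print(sizes)
```

With these assumed settings, each layer doubles the grid along every axis, so a compact code grows step by step into a voxel volume. In a real decoder the kernels would be learned and each voxel would carry many feature channels.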
