
She says that if you train something on synthetic data, it usually won’t work as well on real data, because it hasn’t seen the characteristics of real-world data. “The real world is noisy and partial,” Angela says, “because you can’t get a fully complete view of an object.” They therefore have benchmarks showing that if you train on real data for these tasks, you can do much better than by training on synthetic data alone. “So it matters a lot that we have real-world data, and we could definitely still use more.” With ScanNet and its current 1,500 scans they provide a good start, and Angela told us they were able to show generalisation to some previous, smaller real-world datasets. “But of course, we would like to get more, and this is still what we are working on.”

In the future, Angela would like to see something running on a tangible system. She previously worked on 3D reconstruction, where they built a real-time 3D reconstruction system, and from that work she knows that it’s very hard to use this kind of thing in practice. One of the things that is missing, she told us, is a semantic understanding of the scene. Even when a reconstructed model looks reasonable, you still want to know where things are and what they are, so that virtual agents or robots can actually interact with them. “I want to be able to make this happen for real scans!” she adds enthusiastically.

Angela also told us about the next steps for this line of work. One of the “obvious things” is to scale up beyond thousands of scans; they aim to reach tens of thousands. This, however, requires a different kind of data acquisition, she noted: instead of only crowdsourcing the annotation task, they also want to crowdsource the reconstruction task. Besides this, there is also a lot to be done in terms of semantic segmentation. “Right now, our tasks are still basically: what are objects?” Angela explains, “and there are a lot more interesting tasks on this type of data.”

One task she is particularly interested in is connecting real-world data with synthetic CAD models. They did this a little with ScanNet, but they want to push further and associate synthetic CAD models with real-world scans, for example by aligning a synthetic chair on top of a real chair and then correlating the two. “Ideally, you can basically learn a transform to go from real to synthetic,” Angela says. This is a way to make a model usable, since synthetic models are easy to manipulate and are fully complete. It is also much easier to train something on synthetic data, but it’s not easy to transfer that information to the real world. With a correlation between the two, however, it could become possible to learn the transfer between synthetic and real data. A method like this might be usable in a VR/AR application.
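
The interview doesn’t detail how such an alignment would be computed, but a classic starting point is a least-squares similarity transform between corresponding points. The Python sketch below is purely illustrative and assumes point correspondences between the CAD model and the scan are already available (e.g. from manual clicks or a learned matcher); the function name and the toy chair keypoints are hypothetical.

import numpy as np

def umeyama_alignment(src, dst):
    # Least-squares similarity transform (Umeyama, 1991) mapping src -> dst,
    # i.e. dst_i ~= s * R @ src_i + t, for corresponding (N, 3) point sets.
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # SVD of the cross-covariance gives the optimal rotation.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Guard against reflections: force a proper rotation with det(R) = +1.
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, 1.0, d])

    R = U @ D @ Vt
    s = (S * np.diag(D)).sum() / src_c.var(axis=0).sum()
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Toy example: four hypothetical CAD keypoints and their scanned counterparts
# (here the "scan" is just a rotated, scaled, translated copy of the CAD points).
cad_pts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
R_true = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])  # 90 deg about z
scan_pts = 2.0 * cad_pts @ R_true.T + np.array([3., 1., 0.])

s, R, t = umeyama_alignment(cad_pts, scan_pts)
aligned = s * cad_pts @ R.T + t  # CAD model placed on top of the scan

In practice, such an initial alignment would need refinement (e.g. ICP against the full scan geometry) and robustness to the noisy, partial scans discussed above; this sketch only covers the clean, corresponded case.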
