CVPR Daily - Tuesday

other is the impact it can have. The impact it can have is that we will one day be able to train from tens of millions - no, billions - of training examples that cover tens of thousands of object detectors. This is in fact necessary to reach human-level ability: you need lots of samples and lots of classes. And one day we will be able to do that at a cost that is within the ability… maybe not of everyone, but at least within the ability of a millionaire [he laughs], in the million dollar range. Now this would be absolutely impossible if you want a complete annotation of every pixel in an image. You cannot do a million objects, basically. The problem will only be solvable once we have all the annotated data, and we will not annotate it by hand the way we are doing it in the fully supervised world. That’s why reducing the annotation time is not just a sport, it’s an enabler of solving computer vision.

Now if you go to the scientific reason, which I am even more passionate about, it is a very interesting information-theory type of trap. When you have a weakly supervised learning problem, let’s say, this image where we stand now: this is a couch, this is Ralph, this is Vitto, this is a plant in Hawaii. You have these labels, and there are actually combinatorially many assignments of the pixels in the image to these labels. And all of them are consistent with the labelling of the image, but some of them make more sense in terms of regularity. It’s very interesting that theoretically, there are many solutions that are valid, so that strictly and information-theoretically speaking, it is impossible to reconstruct the pixel-level labelling of an image from image-level labels. And yet, there exist some assignments that are more likely to make sense in the visual world. For instance, the pixels on your face probably all take the same label, they’re all face. For me it is very exciting that although we know that there is no perfect closed-form solution that will work, there are certain families that make more sense in the visual world and that lead to good results at test time. So somehow I like the fact that you start by saying that the problem is impossible, and yet you try to solve it.

You still sound as passionate as when you started to study…

Oh, I am more passionate now! When I started my PhD, I felt like a kid in a candy store. You jump at everything that looks cool, and you grab something, lick it a bit, then you take something else… so there is no continuity of mission. Now I am equally motivated, but because I have focused the energy of my team over multiple years on a family of problems, I also see a lot more progress. And I appreciate the fine details of these families of problems. So in fact I feel more passionate now than when I started.

Do you have tips on how to keep the passion over a long period of time?

Vittorio Ferrari

“Oh, I am more passionate now!”

“Like a kid in a candy store…”