CVPR Daily - Tuesday

“There’s a lot more problems that are conditional than unconditional, especially practical problems in computer vision and graphics”, Phillip told us. For example, semantic segmentation and edge detection are both conditional image-to-image mapping problems, as is image colourisation (taking a black-and-white photo and producing a coloured version of it). In all of these problems you want to learn a mapping from pixels to pixels, i.e., from images to images.

Phillip explained to us that over the last couple of years CNNs have turned out to be a very generic way of processing images and are now used for a lot of problems. But usually a CNN only models structure in the input space. CNNs trained with a standard regression loss treat every output pixel (of the semantic segmentation map or edge map) as conditionally independent given the input, so they don’t model semantic structure in the output space. In response, the community has already done a lot of work on structured regression, modelling structure in the output space with, for example, conditional random fields. What Phillip’s current work does instead is use adversarial discriminators as a way of learning a structured loss function that models structure in the output space. You thus have one neural network that models structure in the input space and another that models structure in the output space, and together they give a generic way of processing images. “A year ago this was all very new and unexpected. The field has developed these ideas all together and we are one of them.” In their paper, Phillip and his co-authors show that this kind of approach is suitable for many image-to-image mappings, and they demonstrate that it works well on a lot of problems without any change in the architecture or method.
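To make the point about per-pixel losses concrete, here is a toy sketch. It is not taken from the paper: the 1-D "images" and the function names per_pixel_l1 and neighbour_roughness are hypothetical, and the hand-written neighbour score is only a crude stand-in for the kind of joint structure a learned adversarial discriminator might pick up. It shows two predictions that each get one pixel wrong: a per-pixel regression loss cannot tell them apart, while a score that looks at adjacent pixels together can.

```python
def per_pixel_l1(pred, target):
    """Mean absolute error: one independent term per pixel, no interaction between pixels."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def neighbour_roughness(image):
    """Hypothetical structured score: sums differences between adjacent pixels,
    so it depends on how pixels relate to each other, not on each pixel alone."""
    return sum(abs(a - b) for a, b in zip(image, image[1:]))


# Toy 1-D "edge maps" (assumed data, for illustration only).
target = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]  # clean step edge
pred_a = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # edge shifted by one pixel
pred_b = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]  # isolated speck far from the edge

# Both predictions have exactly one wrong pixel, so the per-pixel loss is identical.
print(per_pixel_l1(pred_a, target))  # 0.125
print(per_pixel_l1(pred_b, target))  # 0.125

# The structured score separates them: pred_a is still a single clean edge,
# pred_b contains an implausible speck.
print(neighbour_roughness(target))  # 1.0
print(neighbour_roughness(pred_a))  # 1.0
print(neighbour_roughness(pred_b))  # 3.0
```

In the approach described above, the structured score is not hand-written like this but learned by a second network, the discriminator.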

Phillip also told us about some insights they got from working on this problem: “The process was that we added a bunch of bells and whistles and got something working, and then realised we could remove almost all the bells and whistles”.

Phillip Isola

“We then realised that we could remove almost all the bells and whistles”