

“There’s a lot more problems that are
conditional than unconditional,
especially practical problems in
computer vision and graphics”, Phillip
told us. For example, semantic
segmentation and edge detection are
both conditional image-to-image
mapping problems, and so is image
colourisation (taking a black-and-white
photo and producing a coloured
version of it). In all of these problems
you want to learn a mapping from
pixels to pixels, i.e., from images to images.
Phillip explained that over the last
couple of years CNNs have turned out
to be a very generic way of processing
images and are now used for a lot of
problems. But a CNN usually models
structure only in the input space. CNNs
trained with a standard regression loss
treat every output pixel (of the
semantic segmentation map or edge
map) as conditionally independent
given the input, so they don’t model
semantic structure in the output
space. In response, the community
has already done a lot of work on
structured regression that models
structure in the output space, for
example using conditional random
fields. Phillip’s current work instead
uses adversarial discriminators as a
way of learning a structured loss
function that models structure in the
output space. You thus have one
neural network that models structure
in the input space and another that
models structure in the output space,
which together give a generic way of
processing images. “A year
ago this was all very new and
unexpected. The field has developed
these ideas all together and we are
one of them.” In their paper, Phillip
and his co-authors show that this kind
of approach is suitable for many
image-to-image mappings, and they
demonstrate that this works well on a
lot of problems without any change in
the architecture or method.
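To make the point about per-pixel losses concrete, here is a minimal sketch in Python. It is not taken from the paper: the toy one-dimensional "images", the function names and the hand-written `transitions` statistic are all assumptions made for illustration. The statistic merely stands in for the kind of output structure a learned discriminator might pick up on. Two predictions that are each wrong in exactly one pixel get the same per-pixel regression loss, even though one keeps a single clean edge and the other adds an isolated speckle.

```python
def per_pixel_l2(pred, target):
    """Mean squared error: each pixel contributes independently of the others."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def transitions(img):
    """Toy structural statistic (an assumption for this sketch):
    the number of neighbouring pixel pairs whose values differ."""
    return sum(1 for a, b in zip(img, img[1:]) if a != b)


# A hypothetical 1-D binary "segmentation map" with one clean edge.
target = [0, 0, 0, 0, 1, 1, 1, 1]

# Prediction A: the edge is shifted by one pixel (still one clean edge).
shifted = [0, 0, 0, 0, 0, 1, 1, 1]

# Prediction B: the edge is in the right place, but one pixel is a speckle.
speckled = [0, 1, 0, 0, 1, 1, 1, 1]

loss_shifted = per_pixel_l2(shifted, target)
loss_speckled = per_pixel_l2(speckled, target)

# The per-pixel loss cannot tell the two predictions apart ...
print(loss_shifted, loss_speckled)

# ... while a statistic that looks at neighbouring pixels can.
print(transitions(target), transitions(shifted), transitions(speckled))
```

In an adversarial setup nobody writes such a statistic by hand. The discriminator is trained to tell real outputs from generated ones, and so learns for itself which structures to penalise.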
Phillip also told us about some insights
they got from working on this
problem: “The process was that we
added a bunch of bells and whistles
and got something working, and then
realised we could remove almost all
the bells and whistles”.
Phillip Isola
Tuesday
“We then realised that we could remove almost all the bells and whistles”