Computer Vision News - October 2018

The Decoder

If you come with a conventional notion of a decoder in mind, you may find the term confusing on first reading the article, because the decoder here consists, in fact, of a concatenation with the output of a single layer (the second entry block), as you can see in the figure below (Encoder-Decoder with Atrous Conv). Looking closely at the code makes the reason clearer. In line 466 we see the output of the 1x1 convolution (marked by a green rectangle at the end of part II in the Encoder-Decoder figure above). In line 477 we see the upsampling by a factor of 4, and in line 484 we see the decoder concatenation (marked IV in the Encoder-Decoder figure above). The implementation above also does a good job of explaining figure 1 of the article: the decoder here is basically just one layer, upsampled by 4 and concatenated with a layer from lower down in the network.

The final lines of the code we'll review are line 496 and line 497. Line 496 implements the 1x1 convolution whose number of filters equals the number of semantic classes, since we want the network output to contain one layer per semantic class (for image segmentation). Line 497 upsamples the result so that the output size equals the original image size.
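To make the sequence of steps easier to follow, here is a minimal sketch that only traces feature-map shapes through the decoder; it does not reproduce the reviewed code. The helper names (`Shape`, `conv1x1`, `upsample`, `concat`, `decoder_shapes`) are hypothetical, and the numbers are assumptions for illustration: an encoder output at 1/16 of the input resolution with 256 channels, a lower-level feature map at 1/4 resolution with 48 channels, and a final upsampling factor of 4.

```python
from typing import Dict, NamedTuple


class Shape(NamedTuple):
    """Feature-map shape as (height, width, channels)."""
    height: int
    width: int
    channels: int


def conv1x1(x: Shape, out_channels: int) -> Shape:
    # A 1x1 convolution keeps the spatial size and only changes the channel count.
    return Shape(x.height, x.width, out_channels)


def upsample(x: Shape, factor: int) -> Shape:
    # Upsampling scales height and width; the channel count is unchanged.
    return Shape(x.height * factor, x.width * factor, x.channels)


def concat(a: Shape, b: Shape) -> Shape:
    # Channel-wise concatenation requires matching spatial sizes.
    if (a.height, a.width) != (b.height, b.width):
        raise ValueError(f"spatial mismatch: {a} vs {b}")
    return Shape(a.height, a.width, a.channels + b.channels)


def decoder_shapes(
    height: int,
    width: int,
    num_classes: int,
    encoder_channels: int = 256,   # assumed value, for illustration only
    low_level_channels: int = 48,  # assumed value, for illustration only
) -> Dict[str, Shape]:
    """Trace shapes through the decoder steps described in the text."""
    if height % 16 or width % 16:
        raise ValueError("this sketch assumes height and width divisible by 16")

    # Output of the 1x1 convolution at the end of the encoder (assumed stride 16).
    encoder_out = Shape(height // 16, width // 16, encoder_channels)
    # Lower-level feature map taken from earlier in the network (assumed stride 4).
    low_level = Shape(height // 4, width // 4, low_level_channels)

    upsampled = upsample(encoder_out, 4)          # upsampling by a factor of 4
    merged = concat(upsampled, low_level)         # decoder concatenation
    logits = conv1x1(merged, num_classes)         # one layer per semantic class
    output = upsample(logits, 4)                  # back to the input image size

    return {
        "encoder_out": encoder_out,
        "upsampled_x4": upsampled,
        "concatenated": merged,
        "logits": logits,
        "output": output,
    }


if __name__ == "__main__":
    for stage, shape in decoder_shapes(512, 512, num_classes=21).items():
        print(f"{stage:>13}: {tuple(shape)}")
```

Under these assumptions, a 512x512 input with 21 classes ends with a 512x512 output that has 21 layers, one per class, which is the property lines 496 and 497 are there to guarantee.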
