Computer Vision News

corresponding layers’ feature maps in the up-sampling pathway (Figure 1):

Figure 1: U-Net architecture (source: Ronneberger et al., 2015). Left branch: encoder; right branch: decoder. Each blue box corresponds to a multi-channel feature map; white boxes represent copied feature maps.

The proposed MRN is based on the classic U-Net architecture, with additional encoders corresponding to the different resolutions (Figure 2 on the next page). The different input resolutions share central coordinates and cover the tissue area in a pyramid-like structure (Figure 3 on the next page). The inputs at all resolutions have identical shapes and are processed by identically structured encoders. To preserve the information relevant to the region of interest, the lower-resolution feature maps are center-cropped and upsampled to the original resolution. The outputs are concatenated with the high-resolution convolved feature maps and passed through a 1×1 convolution layer with an identity activation function. This process performs a weighted summation of the multi-resolution feature maps into a single feature map, which is in turn concatenated to the corresponding layer in the decoder.

This architecture allows the use of peripheral low-resolution data during pixel-wise segmentation of high-resolution areas. As in the classic U-Net, this is done with a single network, a relatively small number of parameters and a single loss function. This approach yields superior results compared with those obtained by the classic U-Net using a single resolution.
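To make the fusion step more concrete, here is a minimal NumPy sketch of one skip connection: center crop, upsample, concatenate, then a 1×1 convolution with identity activation. It is an illustration under stated assumptions, not the authors' implementation. The function names, the downsampling factors (2 and 4), the channel count, the random weights and the nearest-neighbour upsampling are all hypothetical choices; the text does not specify the interpolation method or the layer sizes.

```python
import numpy as np

def center_crop(fmap, factor):
    """Keep the central 1/factor of the spatial extent of a (C, H, W) map."""
    _, h, w = fmap.shape
    ch, cw = h // factor, w // factor
    top, left = (h - ch) // 2, (w - cw) // 2
    return fmap[:, top:top + ch, left:left + cw]

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling by an integer factor (assumed method)."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

def conv1x1(fmap, weight, bias):
    """1x1 convolution with identity activation: a per-pixel weighted sum
    over channels. weight has shape (C_out, C_in), fmap has (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", weight, fmap) + bias[:, None, None]

def fuse(high, lows, factors, weight, bias):
    """Align each low-resolution map with the high-resolution field of view,
    concatenate along channels, and mix with a 1x1 convolution."""
    aligned = [upsample_nearest(center_crop(low, f), f)
               for low, f in zip(lows, factors)]
    stacked = np.concatenate([high] + aligned, axis=0)
    return conv1x1(stacked, weight, bias)

# Hypothetical sizes: 8 channels, 16x16 maps, two extra resolutions.
rng = np.random.default_rng(0)
c, h, w = 8, 16, 16
factors = [2, 4]                      # assumed downsampling factors
high = rng.standard_normal((c, h, w))
lows = [rng.standard_normal((c, h, w)) for _ in factors]
weight = rng.standard_normal((c, 3 * c)) * 0.1   # stands in for learned weights
bias = np.zeros(c)

fused = fuse(high, lows, factors, weight, bias)
print(fused.shape)  # (8, 16, 16)
```

The fused map has the same shape as the high-resolution feature map, so it can be concatenated to the corresponding decoder layer in the same way as an ordinary U-Net skip connection. The sketch assumes that the spatial size is divisible by each factor.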

Computer Vision News - June 2019