Computer Vision News - August 2018

Every month, Computer Vision News reviews a research paper from our field. This month we have chosen to review Taskonomy: Disentangling Task Transfer Learning. We are indebted to the authors (Amir R. Zamir, Alexander Sax, William Shen, Leonidas Guibas, Jitendra Malik and Silvio Savarese) for allowing us to use their images to illustrate this review. Their article is here.

Introduction:

Transfer learning is the re-training of a network, pre-trained on one problem, for use on a different but related problem. The technique is gaining popularity in the deep learning field because it allows deep learning networks to be customized to new problems using relatively small datasets. This is highly advantageous, as for most real-world problems massive labeled training datasets are not available. The question arises: which visual task has the most affinity with your problem, so that a network trained on that task is likely to give the best results when re-trained using transfer learning?

This article tackles the question systematically. The authors propose a fully computational approach for modelling affinity between tasks within the visual task space. First, they computed a graph of affinities between 26 different kinds of visual tasks (detailed below). From this graph they produced a taxonomy mapping the quality of transfer learning between each pair of visual tasks. Then they analyzed the results in order to minimize the cases in which network training must start from scratch. This reduces researchers' dependence on large labeled datasets and on computation resources, and therefore saves time and cost.

Using their mapping, the authors demonstrated that for some tasks there are transfer-learning candidates which reduce the data required for training by 66% compared to training from scratch, while maintaining nearly equal performance. The authors published their code and important tools for public use.
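To make the idea of an affinity graph more concrete, here is a minimal sketch of how such a graph could be queried to pick a source task for transfer learning. It is an illustration only, not the authors' code: the task names (`task_A`, `task_B`, `task_C`), the scores and the helper `best_source` are all hypothetical, and the sketch assumes that affinities are directed (transfer from A to B need not be as good as from B to A) and scaled to the range 0 to 1.

```python
# Hypothetical affinity scores: affinity[source][target], assumed to lie in [0, 1].
# These are placeholder numbers for illustration, NOT results from the paper.
affinity = {
    "task_A": {"task_B": 0.8, "task_C": 0.3},
    "task_B": {"task_A": 0.6, "task_C": 0.7},
    "task_C": {"task_A": 0.2, "task_B": 0.4},
}


def best_source(affinity, target):
    """Return (source, score) for the source task with the highest affinity to `target`."""
    # Collect the score of every other task that has an edge towards the target.
    candidates = {
        source: scores[target]
        for source, scores in affinity.items()
        if source != target and target in scores
    }
    if not candidates:
        raise ValueError(f"no source task available for {target!r}")
    # Pick the source whose edge to the target carries the highest score.
    source = max(candidates, key=candidates.get)
    return source, candidates[source]


print(best_source(affinity, "task_C"))
```

With these placeholder scores, a network pre-trained on `task_B` would be the preferred starting point for `task_C`. In practice one would also compare the chosen transfer against training from scratch before relying on it.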
The study covers 26 visual tasks in common use by the research community (24 of which are illustrated below). Some of these tasks have a clear, strong affinity, such as Surface Normals and Euclidean Distance (the first can be derived from the derivatives of the second), or the Vanishing Points in a Room, which are useful for Orientation. Others are less clear. For instance, can Keypoint Identification and Image Reshading be used in conjunction to arrive at Pose Estimation? And if so, how?

Research by Assaf Spanier