Computer Vision News - March 2020

Research

Automatic Pruning for Quantized Neural Networks
by Amnon Geifman

Every month, Computer Vision News reviews a research paper from our field. This month we have chosen Automatic Pruning for Quantized Neural Networks. This is the last contribution by Amnon Geifman to our magazine, after more than 20 articles written with much talent and passion. We are grateful to Amnon for his outstanding work and wish him tons of success. We also wish the best of luck to our new Engineering Editors: Marica Muffoletto and Ioannis Valasakis. Read their first articles further on in this issue of Computer Vision News.

Traditional deep learning models use heavy over-parametrization to enable robust convergence during training. However, this complicates the use of networks in production systems, due to their high memory and computation consumption. In recent years, many researchers have tried to shrink the model size in order to enable fast execution on any device of any size. The most well-known methods for this task are quantization and pruning.

Quantization solutions suggest using low-precision weights at inference time. To this end, each weight is mapped to its nearest bin, where the bins are determined by a discrete set of values. The granularity of the quantization can range from a fine level of 32 bits per weight to the coarse level of a binarized network, where each weight is assigned a value in {+1, -1}. Pruning, on the other hand, means removing uninformative weights or filters from the network. This is done by using different kinds of metrics to discover redundant filters, channels or blocks and remove them.

These two methods, however, cannot simply be applied together. Most existing pruning strategies operate on full-precision weights and cannot be directly applied to the discrete parameter set obtained after quantization. Today's paper tackles the problem of applying both quantization and pruning in a nearly optimal way. In particular, the authors suggest a method to prune redundant low-precision filters. They also propose

"The paper proposes a method to effectively prune redundant low-precision filters"
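
To make the two ideas above concrete, here is a minimal Python sketch of nearest-bin quantization and of why a magnitude-based pruning score stops being informative once the weights are binarized. It is only an illustration under stated assumptions: the function names, the example filters and the bin values are hypothetical, and the L1-norm score is a generic full-precision criterion chosen for the example, not the method proposed in the paper.

```python
from typing import List, Sequence


def quantize_to_bins(weights: Sequence[float], bins: Sequence[float]) -> List[float]:
    """Map each weight to the closest value in a discrete set of bins."""
    return [min(bins, key=lambda b: abs(w - b)) for w in weights]


def binarize(weights: Sequence[float]) -> List[float]:
    """Coarsest case: every weight becomes +1 or -1 (zero is mapped to +1 here)."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def l1_norm(filter_weights: Sequence[float]) -> float:
    """Generic magnitude score for a filter (assumption: not the paper's criterion)."""
    return sum(abs(w) for w in filter_weights)


# Two hypothetical flattened filters: one with large weights, one close to zero.
strong_filter = [0.9, -0.7, 0.8, -0.6]
weak_filter = [0.02, -0.01, 0.03, -0.02]

# Nearest-bin quantization with five hypothetical bin values.
quantized = quantize_to_bins(strong_filter, bins=[-1.0, -0.5, 0.0, 0.5, 1.0])

# In full precision, the magnitude score clearly separates the two filters.
full_precision_scores = (l1_norm(strong_filter), l1_norm(weak_filter))

# After binarization every weight has magnitude 1, so both scores coincide
# and the score can no longer tell which filter is redundant.
binary_scores = (l1_norm(binarize(strong_filter)), l1_norm(binarize(weak_filter)))

print(quantized)
print(full_precision_scores)
print(binary_scores)
```

In this toy setting both binarized filters receive the same score, which hints at why pruning criteria designed for full-precision weights cannot be reused as they are on a quantized network.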
