Computer Vision News - May 2018

Accuracy figures are reported in the following units: Bn Ops (billions of operations), BFLOP/s (billions of floating-point operations per second), and FPS (frames per second).

Candidly calling themselves lazy, the authors appended their results to a graph from He et al.'s paper, displaying the speed/accuracy tradeoff on the mAP at 0.5 IOU metric. One can see that the results are impressive and fast.

With admirable sincerity, the authors list the things they tried that did not improve results:

1. Anchor box x, y offset predictions: predicting the x, y offset as a multiple of the box width or height using a linear activation. It didn't work very well.

2. Linear x, y predictions instead of logistic, which led to a couple-point drop in mAP.

3. Focal loss, which again resulted in a couple-point drop in mAP. They hypothesize that YOLOv3 may already be robust to the problem focal loss attempts to solve, since it has separate scores for objectness prediction and class prediction.

4. Dual IOU thresholds, as used in Faster R-CNN: an anchor with an overlap above 0.7 is marked as a positive example, overlaps between 0.3 and 0.7 are ignored, and overlaps below 0.3 are marked as negative examples. This didn't give good results either.

They sum up: YOLOv3 is a good detector. It's fast, it's accurate. It's not as great on the COCO average AP between 0.5 and 0.95 IOU metric, but it's very good on the old detection metric of 0.5 IOU.

A somewhat surreal paper, poking honest fun at the academic establishment, with a between-the-lines criticism of papers in the field of deep learning. The candidness is evident from the beginning, with the opening quote: "We mostly took good ideas from other people." Run out and read it!
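The dual IOU threshold scheme the authors tried (and rejected) is concrete enough to sketch. Below is a minimal Python illustration of that assignment rule, not the authors' code: the function names `iou` and `assign_labels` and the box format are assumptions made for clarity.

```python
def iou(box_a, box_b):
    # Intersection-over-union of two boxes in (x1, y1, x2, y2) format.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def assign_labels(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    # Faster R-CNN-style dual-threshold assignment:
    # 1 = positive (best IOU >= 0.7), 0 = negative (best IOU < 0.3),
    # -1 = ignored (in between, excluded from the loss).
    labels = []
    for anchor in anchors:
        best = max(iou(anchor, gt) for gt in gt_boxes)
        if best >= pos_thresh:
            labels.append(1)
        elif best < neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)
    return labels
```

For example, an anchor that exactly matches a ground-truth box is labeled positive, a disjoint anchor is negative, and one with roughly 0.33 IOU falls in the ignored band. Per the paper, this scheme did not help YOLOv3, which keeps its single-threshold objectness formulation instead.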
