CVPR Daily - Tuesday

Matthias Hein is a professor at the University of Tübingen in Germany. He speaks to us ahead of his oral and poster today: Why ReLU Networks Yield High-Confidence Predictions Far Away From the Training Data and How to Mitigate the Problem.

Matthias tells us that this work proves that so-called ReLU networks, a class that covers basically all existing networks built from ReLU activation functions with max pooling or average pooling and convolutional layers, produce arbitrarily high-confidence predictions far away from the training data. This is clearly a problem in safety-critical applications, because the confidence of such a classifier cannot be used to trigger human intervention.

He proposes a new training method, adversarial confidence enhanced training, inspired by adversarial training, to at least mitigate this effect. The method generates noise images, then searches a neighbourhood around each noise image for the point of maximum confidence. During training, that point is fed back into the neural network with the instruction: on this point, please produce a uniform confidence.

Matthias thinks that, to some extent, with neural networks we are in the wild wild west. He wants to bring some rigour back to the deep learning world. One property typically expected from a classifier is that it should know when it does not know. Some classifiers have that property, even provably, but neural networks evidently do not, and he believes this should be widely known.

He tells us that he has had this theory for a couple of years. He did not think it was interesting because he assumed people already knew it, but when he occasionally mentioned it to people, they were surprised, and he realised he should write it up.

Matthias explains: “I’m more from the machine learning community. I’m basically an outsider to computer vision.
This method which mitigates at least this problem is inspired by adversarial training, by this robust optimization perspective, and I think this is a very valuable perspective because it’s tackling the worst case. We also propose in this paper another technique where we take noise images and say, on this, produce uniform confidence. The problem is then we can still adversarially look in a neighbourhood for the instance which produced high confidence and we are always successful. This simple method does not work. You need this kind of robust optimization technique in order to solve the problem.”
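
Both ideas described above can be tried on a toy model. What follows is a minimal illustrative sketch in plain Python, not the authors' code. It builds a small ReLU classifier with random weights (the helper names make_relu_net, confidence and worst_case_confidence are hypothetical), then (1) measures the maximum softmax confidence while moving ever farther from the origin along one fixed direction, and (2) looks for the most confident point in a small neighbourhood of a noise input. Two assumptions are worth flagging: the neighbourhood search uses random sampling as a simple stand-in for the gradient-based search of adversarial training, and the penalty shown, the log of the worst-case confidence, is just one plausible way of asking for uniform confidence.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_relu_net(n_in, n_hidden, n_classes, rng):
    """Random one-hidden-layer ReLU classifier (hypothetical stand-in for a trained model)."""
    w1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
    w2 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_classes)]
    b2 = [rng.gauss(0, 1) for _ in range(n_classes)]
    return w1, b1, w2, b2


def confidence(net, x):
    """Maximum softmax probability of the network at input x."""
    w1, b1, w2, b2 = net
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    return max(softmax(logits))


def worst_case_confidence(net, z, eps, rng, n_samples=200):
    """Highest confidence found in an L-infinity ball of radius eps around z.

    Assumption: random search replaces the gradient-based search that
    adversarial training would normally use.
    """
    best = confidence(net, z)
    for _ in range(n_samples):
        candidate = [v + rng.uniform(-eps, eps) for v in z]
        best = max(best, confidence(net, candidate))
    return best


rng = random.Random(0)
net = make_relu_net(n_in=10, n_hidden=32, n_classes=5, rng=rng)

# Part 1: confidence along a ray that leaves the origin in one fixed direction.
direction = [rng.gauss(0, 1) for _ in range(10)]
scales = [1.0, 10.0, 100.0, 1e4]
far_conf = [confidence(net, [s * d for d in direction]) for s in scales]
for s, c in zip(scales, far_conf):
    print(f"scale {s:>8.0f}: max confidence {c:.4f}")

# Part 2: worst-case confidence near a noise input, and the resulting penalty.
noise = [rng.uniform(0, 1) for _ in range(10)]
centre_conf = confidence(net, noise)
worst_conf = worst_case_confidence(net, noise, eps=0.3, rng=rng)
penalty = math.log(worst_conf)  # lowest possible value is log(1/5): uniform over 5 classes
print(f"confidence at noise point {centre_conf:.4f}, worst case nearby {worst_conf:.4f}")
print(f"penalty {penalty:.4f} (uniform prediction would give {math.log(1 / 5):.4f})")
```

Along the ray, the confidence of such a randomly initialised network should approach 1 as the scale grows. In a training loop, a penalty of this kind would be added to the usual classification loss, pushing the worst-case prediction near each noise input toward uniform.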