CVPR Daily - 2018

[...] highlights. We have already deployed systems at the Masters Golf event, Wimbledon, and the US Open. We are using computer vision combined with other modalities – sound, speech, language, and so on – to understand this kind of content better. The value of doing that is that it can create much more personalized products for the audience and draw their attention to those exciting moments. On the one hand, we will see more applications, but on the other there is the potential for putting these pieces together to go after new problems.

This was a big week for IBM because we unveiled a new system called Debater. It is not vision focused at the moment; it's more about language. But it's really a tremendous advance, because it is a system that can study an important topic deeply and find ways to argue pro and con about that particular topic. It's called Debater for a reason. Think of humans debating; here is a computer participating in a debate.

Can you tell us about some of the things that you are demonstrating at CVPR?

As we think about vision and its requirements, you not only want systems that are accurate, that can recognize what you want them to, but speed, bandwidth, and low power matter too. Efficiency is very important. One of the things that we are showing here is the development of a neuromorphic chip. What I mean by that is a unique system architecture that uses a spiking network to communicate. We can train this using traditional deep learning methods. We can train a convolutional neural net, and we can use a tool like Caffe to do that, but then essentially this network is transformed and compiled for this particular neuromorphic chip. One of the demos we are showing here is pretty awesome. It is gesture recognition and it is real time, but everything is happening on this chip and it is extremely low power.
We are talking milliwatts of power to perform this classification of gestures. The way this chip essentially communicates is using spikes. This is a very different approach compared to a traditional CPU, which might use a 32-bit or 64-bit architecture, floating-point precision, and so on. The reason this is called neuromorphic is because it is [...]

John R. Smith - IBM

John R. Smith showcasing a real-time demo of a low-power, high-throughput, fully event-based stereo system.

This gesture recognition work is authored by Alexander Andreopoulos, Hirak J. Kashyap, Tapan K. Nayak, Arnon Amir and Myron D. Flickner.
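
To make the idea of communicating with spikes more concrete, here is a toy sketch in Python. It is not IBM's chip, toolchain, or compilation procedure, none of which are described in the interview. The function names (`rate_encode`, `integrate_and_fire`), the rate-coding scheme, and the threshold are all illustrative assumptions. The sketch only shows how a value that a conventional processor would hold as a floating-point number can instead be carried as a train of 0/1 events, and how a simple integrate-and-fire unit combines such trains.

```python
def rate_encode(value, steps):
    """Encode a value in [0, 1] as a spike train of length `steps`.

    Assumption for illustration: deterministic rate coding, where the
    fraction of time steps carrying a spike approximates `value`.
    """
    spikes = []
    accumulator = 0.0
    for _ in range(steps):
        accumulator += value
        if accumulator >= 1.0:
            spikes.append(1)      # emit a spike event
            accumulator -= 1.0
        else:
            spikes.append(0)      # stay silent this time step
    return spikes


def integrate_and_fire(spike_trains, weights, threshold=1.0):
    """Hypothetical integrate-and-fire unit.

    At each time step the unit adds the weights of the inputs that
    spiked to its potential, and emits a spike (subtracting the
    threshold) whenever the potential reaches the threshold.
    """
    steps = len(spike_trains[0])
    potential = 0.0
    output = []
    for t in range(steps):
        potential += sum(w * train[t] for w, train in zip(weights, spike_trains))
        if potential >= threshold:
            output.append(1)
            potential -= threshold
        else:
            output.append(0)
    return output


# Two input values carried as spike trains instead of floats.
a = rate_encode(0.5, 8)    # spikes on half of the time steps
b = rate_encode(0.25, 8)   # spikes on a quarter of the time steps
out = integrate_and_fire([a, b], weights=[0.5, 1.0])
print(a, b, out)
```

In this sketch the output spike rate tracks the weighted sum of the input rates, so information moves between units purely as sparse binary events rather than as 32-bit or 64-bit numbers.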

CVPR Daily - 2018 - Thursday