ICCV Daily 2019 - Thursday

and inference. We have to separate the components to do this, so we observed that by using two separate LSTMs rather than just one, we obtained a good improvement. A second major component is a pre-training technique we propose, where you encourage the model to separate these two tasks with what we call sequence completion pre-training. During this pre-training, the decoding component will ‘cheat’ and look into the future. The third major point is a module which allows you to combine multimodal predictions from RGB frames, optical flow and objects, using what we call modality attention.”

Antonino Furnari says they have been inspired by earlier works that used encoder-decoder models for sequence-to-sequence translation. Those models had two LSTMs: one for reading a sentence and another for generating a new one. The team used these components but, of course, changed many things. Their idea is that you need to observe the scene and then guess what is going to happen.

The research has been driven by their interest as a laboratory in creating applications for the industrial world. Giovanni Maria tells us: “It’s worth noting the impact that first-person vision will have for society, especially in industry and for assistive technologies. More specifically, anticipating actions and interactions with objects will help workers and increase safety in workplaces.”

A practical future use might be for a factory worker to wear a device on their head which can anticipate what they are going to do. If, for example, they were about to press a button they shouldn’t be pressing, the device could send them an alert or a suggestion to stop it happening.

Improving the general performance of the model, and more explicitly considering the relations between the objects, are two directions in which they would like to take this work in the future.
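For readers who want a feel for the modality attention idea mentioned above, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the example scores and the attention logits are all hypothetical, and in a real model the logits would presumably be produced by a small learned network rather than supplied by hand. The sketch only shows the fusion step, where each modality's action scores are weighted by a softmax over per-modality attention logits and then summed.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_modality_attention(modality_scores, attention_logits):
    """Fuse per-modality action scores with attention weights.

    modality_scores: dict mapping modality name -> list of per-action scores.
    attention_logits: dict mapping modality name -> float (hypothetical input;
        assumed here to come from some learned attention module).
    Returns (weights per modality, fused per-action scores).
    """
    names = sorted(modality_scores)
    weights = softmax([attention_logits[name] for name in names])
    num_actions = len(modality_scores[names[0]])
    fused = [0.0] * num_actions
    for name, weight in zip(names, weights):
        for i, score in enumerate(modality_scores[name]):
            fused[i] += weight * score
    return dict(zip(names, weights)), fused


# Made-up scores over three candidate actions for each modality.
scores = {
    "rgb": [0.2, 0.7, 0.1],
    "flow": [0.5, 0.3, 0.2],
    "objects": [0.1, 0.1, 0.8],
}
# Made-up attention logits: a higher value means the modality is trusted more.
logits = {"rgb": 1.0, "flow": 0.0, "objects": 2.0}

weights, fused = fuse_with_modality_attention(scores, logits)
predicted_action = max(range(len(fused)), key=fused.__getitem__)
print(weights, fused, predicted_action)
```

With these made-up numbers the objects branch gets the largest weight, so the fused prediction follows the action that branch favours. Because the weights are computed per input, a different sample could lean on RGB or optical flow instead.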
To find out more about Antonino and Giovanni’s work, visit their oral and poster at 14:24 today in Hall D2.
