Daily CVPR - Tuesday

Lisa Anne Hendricks (University of California, Berkeley) presented work carried out by a team of six: Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data. The project's main goal is to generate sentences that combine visual elements in ways not seen in the training descriptions. The team argues this is an important problem because paired description data is hard to collect, while vast amounts of unpaired image and text data already exist that can be leveraged for the description task.

Many computational models of visual attention predict eye fixation locations in the form of saliency maps. Anna Volokitin presented work with two co-authors, Predicting When Saliency Maps are Accurate and Eye Fixations Consistent, which introduces two measures that enrich the information provided by saliency models. First, the reliability of a saliency model can be estimated from the raw image alone; this serves as a confidence measure that can be used to select the best saliency algorithm for a given image. Analogously, the consistency of the eye fixations among subjects, i.e. the agreement between the fixation locations of different subjects, can also be predicted and used by a designer to assess whether subjects reach a consensus about salient image locations.
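The paper's exact consistency measure is not spelled out here, but a common way to quantify inter-subject agreement is a leave-one-out comparison of smoothed fixation maps. The sketch below illustrates only that general idea; the function names, the Gaussian blur width sigma, and the use of Pearson correlation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_map(fixations, shape, sigma=15.0):
    """Rasterize (row, col) fixation points into a smoothed density map."""
    m = np.zeros(shape, dtype=np.float64)
    for r, c in fixations:
        m[int(r), int(c)] += 1.0
    # Blur point fixations into a continuous map (sigma is an assumption)
    return gaussian_filter(m, sigma=sigma)

def inter_subject_consistency(per_subject_fixations, shape):
    """Leave-one-out agreement: correlate each subject's fixation map with
    the mean map of all remaining subjects, then average the scores."""
    maps = [fixation_map(f, shape) for f in per_subject_fixations]
    scores = []
    for i, m in enumerate(maps):
        others = np.mean([maps[j] for j in range(len(maps)) if j != i], axis=0)
        # Pearson correlation between the two flattened maps
        scores.append(np.corrcoef(m.ravel(), others.ravel())[0, 1])
    return float(np.mean(scores))

# Example: three hypothetical subjects viewing a 240x320 image
rng = np.random.default_rng(0)
subjects = [rng.integers(0, [240, 320], size=(20, 2)) for _ in range(3)]
print(inter_subject_consistency(subjects, shape=(240, 320)))
```

A score near 1 would indicate that subjects broadly agree on where to look, while a low score would flag an image on which fixations are idiosyncratic and a single saliency map is unlikely to fit everyone.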
