Computer Vision News - December 2023

Elisabetta and Ayca are giving a live demo of OpenMask3D at the Google Research Booth at ICCV 2023 in Paris, France.

These per-mask features can be used to compute the similarity between each object instance and a given text query embedded in the CLIP space. This enables our model to respond to open-vocabulary queries in an instance-based manner. For example, the query “watch a movie” obtains the highest embedding similarity with the TV object. Thanks to its zero-shot learning capability, OpenMask3D is able to segment instances of a given query object that might not be present in common segmentation datasets, such as “Pepsi”, “angel”, and “dollhouse”. Our approach also preserves information about object properties such as affordance, color, geometry, material, and state, resulting in strong open-vocabulary 3D instance segmentation capabilities.

This opens up new possibilities for understanding and interacting with 3D scenes in a more comprehensive and flexible manner. We encourage the research community to explore open-vocabulary approaches, where knowledge from different modalities can be seamlessly integrated into a unified and coherent space.

Check out our project website (https://openmask3d.github.io)! Try OpenMask3D on your own scenes and let us know the most interesting object you are able to identify with OpenMask3D!

Do you want to learn more about OpenMask3D? Visit Ayca and Elisabetta during their NeurIPS poster session on Tue 12 Dec between 10:45 am and 12:45 pm CST, Great Hall & Hall B1+B2, poster #906!

NeurIPS 2023 Accepted Paper
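To give a rough sense of the per-mask querying described above, here is a minimal sketch: assume each 3D instance mask already has a CLIP-space feature vector (in the real system these are computed by OpenMask3D's mask-feature module; here they are hand-picked placeholder vectors), and rank masks by cosine similarity to the embedded text query. The function name, feature dimensions, and toy values are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def rank_masks_by_query(mask_features, query_embedding):
    """Rank instance masks by cosine similarity to a text-query embedding.

    mask_features: (num_masks, dim) array of per-mask CLIP features.
    query_embedding: (dim,) CLIP text embedding of the query.
    Returns (indices sorted most-to-least similar, similarity scores).
    """
    # Normalize both sides so the dot product equals cosine similarity.
    masks = mask_features / np.linalg.norm(mask_features, axis=1, keepdims=True)
    query = query_embedding / np.linalg.norm(query_embedding)
    similarities = masks @ query
    return np.argsort(similarities)[::-1], similarities

# Toy example: three masks with hypothetical 4-dim features.
features = np.array([
    [1.0, 0.0, 0.0, 0.0],  # e.g. a chair-like instance
    [0.0, 1.0, 0.0, 0.0],  # e.g. a TV-like instance
    [0.5, 0.5, 0.0, 0.0],  # e.g. a sofa-like instance
])
query = np.array([0.1, 0.9, 0.0, 0.0])  # e.g. an embedding of "watch a movie"
order, sims = rank_masks_by_query(features, query)
print(order[0])  # the TV-like instance (index 1) ranks first
```

In the real pipeline the query vector would come from a CLIP text encoder and the per-mask features from multi-view image crops, but the ranking step itself reduces to exactly this normalized dot product.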
