CVPR Daily - Tuesday

I wouldn't call it a breakthrough, but I do feel that this idea of how we should think about computer vision problems, of how computer vision modules should be designed in a robotic system, is something that's kind of new, and that we helped to develop. For example, a very simple robotics task is grasping objects. It used to be divided into different components, like perception: computer vision algorithms produced object poses, and then the robot would plan the action. That's how we started to approach the robotics problem. That's how we thought about it. Later, we realized that for a lot of robotics actions, grasping for example, you don't really need to go through this modular approach. Instead, you can predict something we call action affordance, which basically just tells you which area is a good place to grasp. That kind of formulation gets rid of the intermediate pose-estimation task, and therefore the whole thing becomes more end-to-end and also more robust.

Which direction in this field of study excites you the most for the coming years?

In computer vision, it's the intersections with other areas that will produce the most exciting work. So, for example, computer vision is now also interacting with audio analysis. And with robotics, of course; that's what I'm working on. Computer vision is really expanding.

With NLP, for instance?

Yeah, NLP is a very good example. I think all these intersections with other areas are nice.

And you want to be part of it?

Yeah, of course.

In which area do you think you’ll make the biggest impact?

One thing that we are constantly thinking about is how we can combine, for example, language, vision, and robotics. I think there's an intersection of these three areas.

Like a convergence?

Yes, so how you can use natural language as a way both to give instructions to the robot and for the robot to plan its actions.
All this planning needs to be grounded in computer vision, that is, in the robot's image input, before the robot plans its actions. So I think the intersection of vision, language, and robotics is something very interesting. We haven't had a lot of work on it yet, but

Women in Computer Vision
