Computer Vision News - November 2022

Angela Dai

So Angela, some of our readers may remember you from our previous interview in 2017 in Hawaii. Can you share what happened in the last five years?

The last time I had the pleasure of speaking with you, it was about the work we had just done on ScanNet. That was really about building up a database for the community, giving people access to many examples of geometric reconstructions of indoor environments along with their semantics. So, of course, the stuff that I've done since then is trying to move towards how we can actually perceive real-world environments, typically indoor environments, from commodity kinds of data. From an image, from an RGB-D sensor, how can we get out the complete geometry of that environment? How can we understand the individual objects that are observed there, even though they're not seen perfectly? The data is limited. It's imperfect. How can we imbue machines with the same kind of 3D perception that we, as people, have?

Were all these developments expected then, or did some of these fields or subfields come as a surprise for you?

There were definitely some developments that came as a surprise. In hindsight, they made sense, but when they came out, they were a surprise. But that's a sign of good research! A lot of people were working in this area, and probably the first thing they developed was these neural coordinate field implicit representations for representing 3D shape geometry. They were very, very effective. They made a lot of sense, and they have ties back to traditional geometric and implicit representations. That was quite powerful. It's not a perfect representation. It's still a bit of a challenge to see what the proper way is to represent a large scene and not just one object. But this was cool. Probably everybody knows about NeRF and all of the amazing stuff that you can do with NeRF. And that's, of course, a huge development, particularly on the photorealistic generation side.

I think more things have changed since the last time. I was doing my PhD at Stanford and completed it at the end of 2018. I then moved to the Technical University of Munich, where they had these nice opportunities where you could apply for what they call junior research group positions. You can basically apply for funding for yourself and two students, which is actually quite nice. That presents a lot of opportunity to start building up a research group of your own early on, even prior to becoming a professor.
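For readers less familiar with the "neural coordinate field" implicit representations Angela mentions above: the core idea, in the spirit of early coordinate-based shape works such as DeepSDF, is a small MLP that maps a continuous 3D coordinate to a signed distance, so the surface is the zero level set of the learned function. The sketch below is a minimal illustrative example, not code from any paper discussed in the interview; the class name, layer sizes, and sampling are assumptions made for demonstration.

```python
# Minimal sketch (illustrative, not from the interview) of a neural
# coordinate field: an MLP mapping a 3D point to a signed distance.
import torch
import torch.nn as nn


class ImplicitSDF(nn.Module):
    """Hypothetical coordinate-based implicit shape representation."""

    def __init__(self, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # signed distance to the surface
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) query coordinates -> (N, 1) signed distances.
        # The zero level set {x : f(x) = 0} implicitly defines the shape,
        # mirroring classical implicit surface representations.
        return self.net(xyz)


# The field can be queried at arbitrary continuous coordinates:
model = ImplicitSDF()
points = torch.rand(1024, 3) * 2 - 1  # random queries in [-1, 1]^3
sdf_values = model(points)            # shape (1024, 1)
```

Because the network is queried at continuous coordinates rather than on a fixed voxel grid, the representation is resolution-independent, which is part of why these models proved so effective for shape geometry, and also why scaling them from a single object to a large scene remains the challenge Angela describes.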
