Computer Vision News - November 2019

Women in Computer Vision

Why did you choose this specific area?

When I was graduating from college and people asked me what I wanted to do, I said I wanted to do image retrieval. When I first started my PhD, I was interested in Professor Jan-Michael Frahm’s work on image geo-localization. I applied to the grad school and at that time Jan had a grant for that project so that’s how it started. I really like the topic and I really like the people in this area as well. They are so helpful and inspiring. When we meet at conferences like this, I always get great advice and new perspectives from them.

"With Torsten’s advice, I was able to better understand my own paper and improve my thesis."

What is the best advice that you have ever received?

Torsten Sattler was on my committee and he provided critical feedback on my papers and thesis in general. One of the questions he asked me was about a paper I presented at CVPR 2017. The work was about using learned context for where to focus on images for geo-localization. For example, occlusions in urban scenes, like trees or pedestrians, prevent the correct localization of the image because they act as a distractor. My work tried to use the context to choose which region in the image the neural network should focus on. The question Torsten asked was, “The neural network naturally picks up where to focus on, so why is this attention necessary? Why is it performing better?” He encouraged me to think more deeply about it.

When we compute the attention for this neural network, we use top-down knowledge that has more high-level information. For image matching, low-level details are really important because that’s what’s going to discriminate this image from all the others. There is a paper called ‘The Devil is in the Details’ about this. We use this detailed low-level information while using the high-level information