Computer Vision News - January 2022
        
Best Paper Award BMVC

that as soon as you have a task, you can ask yourself a couple of questions, and if the answers to those questions are yes, our model will significantly improve the quality of your technology. You can use it as a classifier, or as a tracker if you want to make it faster, or for geometry. You can use it where you have a lack of labeled data, or a lack of data in general, and you want to generalize to the data you do have. We want our models to be universal.”

Artem adds: “We started with a problem that no one was paying much attention to, and wasn’t described very well, and we showed it’s very important. If you solve this problem, then you have better performance in a wide variety of tasks.”

What features would they add to the model if they could? Being a big music fan, Ivan has a fun idea.

“Signals are discretized, they’re not continuous, and they’re discretized in exactly the same way images are discretized,” he tells us. “But images are in two dimensions and sounds are in one dimension, and when we analyze sound, some problems arise which are not common for images. In image analysis we use small windows of three or five pixels, but for audio analysis, the windows are much wider. I would try to show if it is applicable to sound, and if so, how to use it for sound analysis and generation. When I say one word very fast, and when I say the same word slow, it’s still the same word, so we want the result to be the same. For sound, the scale is the duration. We could demonstrate that you can use the same type of kernels in 1D and the same kernels in 2D and generalize it to 3D. It gives us some estimate of how to increase the number of dimensions when
        