WACV 2024 Daily - Friday

Obtaining the training data was a challenge in itself. Instead of finetuning a pre-trained model, such as Stable Diffusion, Radim opted for an approach better suited to this specific deblurring and temporal super-resolution application. He did not want to lose detail by projecting images into a lower-dimensional feature space, as often happens when using a variational autoencoder. “We trained the denoising model in the full input resolution,” he reveals. “For that, we needed a lot of data. We generated our training datasets with a computer rendering program called Blender. In Blender, we loaded a lot of already existing 3D models and rendered them with different textures. Then, we rendered them as if they were captured by a high-speed camera. We got 24 images of these fast-moving objects, and the object was more or less sharp in every consecutive image. We then simulated the fast-moving object blur by averaging the 24 images.”

This process generated tens of thousands of images, and with this vast amount of data, Radim trained the denoising diffusion probabilistic model and was pleased to discover that it generalized effectively to real-world data. Looking ahead, he is already discussing potential future directions for the work and remains optimistic about refining the approach further. One avenue involves improving single-image temporal super-resolution, aiming to match the performance of multi-frame methods that require more than one input image.
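To make the data-generation recipe concrete, here is a minimal sketch of the blur-synthesis step Radim describes: 24 consecutive sharp renders of a fast-moving object are averaged to produce the blurred training input, while the sharp frames remain the targets for deblurring and temporal super-resolution. The directory layout, frame naming, and use of NumPy and Pillow are illustrative assumptions, not details taken from the paper.

```python
# Sketch of simulating fast-moving-object blur by averaging rendered sub-frames.
# Assumes each sequence directory holds 24 Blender renders named frame_*.png
# (hypothetical naming, for illustration only).
from pathlib import Path

import numpy as np
from PIL import Image

NUM_FRAMES = 24  # sub-frames per simulated high-speed exposure


def synthesize_blur(frame_dir: Path) -> tuple[np.ndarray, np.ndarray]:
    """Return (blurred_input, sharp_frames) for one rendered sequence.

    blurred_input : (H, W, 3) float32 average of the sub-frames
    sharp_frames  : (NUM_FRAMES, H, W, 3) float32 targets for the
                    deblurring / temporal super-resolution model
    """
    paths = sorted(frame_dir.glob("frame_*.png"))[:NUM_FRAMES]
    frames = np.stack(
        [np.asarray(Image.open(p).convert("RGB"), dtype=np.float32) / 255.0 for p in paths]
    )
    # Averaging the consecutive sharp frames approximates the motion blur
    # a long exposure would accumulate over the same time window.
    blurred = frames.mean(axis=0)
    return blurred, frames


if __name__ == "__main__":
    blurred, sharp = synthesize_blur(Path("renders/sequence_0001"))  # hypothetical path
    print(blurred.shape, sharp.shape)
```

In this setup, one averaged image and its 24 sharp sub-frames form a single training pair, which is consistent with training the denoising model at the full input resolution rather than in a compressed latent space.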
