Computer Vision News - February 2024

BEST OF WACV 2024

Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model

Shoma Iwai is a second-year PhD student at Tohoku University in Miyagi Prefecture, Japan. His paper proposes a single solution to two common challenges in image compression with deep learning. He spoke to us ahead of his oral presentation at WACV 2024.

In recent years, image compression has seen significant advancements, with neural network-driven techniques gaining more attention. Several works have employed deep generative methods like GANs and diffusion models to improve perceptual quality or realism. However, optimizing models for different bit rates remains a key challenge.

“In image compression with deep learning, most models are optimized for a single target bit rate,” Shoma begins. “In other words, we need to train multiple models to compress images into different bit rates. Enhancing the perceptual quality of compressed images is another problem, especially when we compress images to a very small data size. In that case, a lot of information is lost.”

Although there are existing methods to tackle these issues individually, very few studies address both, which was the motivation behind this work. Its proposed variable-rate GAN-based approach places a key emphasis on the discriminator’s role in training. Shoma explains that he experimented with various discriminator designs to identify the one most suitable for the task and, additionally, introduced a novel adversarial loss function.

“We show that these two methods improve performance and bridge the gap between the state-of-the-art and this high-controllability
