WACV 2025 Daily - Saturday

Oral Presentation

GeoGuide: Geometric guidance of diffusion models

Przemysław Spurek and Jacek Tabor are professors at the Jagiellonian University in Krakow. Alongside first author Mateusz Poleski (who prepared this publication as part of his master's degree thesis), they are co-authors of a fascinating paper accepted as a poster and oral this year and are here to speak to us about their work.

Diffusion models are a highly effective tool for image generation, often producing photorealistic results when conditioned on specific characteristics. However, guiding a pre-trained diffusion model to generate realistic images from a class not included in the original conditioning process remains a challenging problem. Although successful, previous solutions, such as the ADM-G guiding approach, provide minimal guidance during the final stage of the denoising process, leading to lower-quality outputs. To address this, the team proposes GeoGuide, a novel tool designed to guide diffusion models effectively.

“It’s not obvious how to generate high-quality images since directly sampling elements can produce results that are not exactly in the manifold of the data,” Przemysław explains.

Jacek elaborates: “When you look at the details, for example, you can have a dog with five legs or two heads! Diffusion models sometimes switch off from realistic situations to something strange. We wanted to understand why that was happening and how to force the model to produce results in the true data manifold, which satisfies the prompt!”

The team proposes a guidance strategy using a neural network to
