13 DAILY CVPR Friday

independent yet complementary control of each modality. Using a novel training strategy based on auxiliary search-driven triplets, PARASOL introduces precise style manipulation while preserving content integrity.

Expanding to conditioning on exemplars, the next model, Thinking Outside the BBox (ECCV 2024), addresses the novel challenge of 'unconstrained generative object compositing'. This task involves seamlessly integrating objects into background images without requiring explicit positional guidance. By training a diffusion-based model on paired synthetic data, the approach autonomously handles tasks such as object placement, scaling, lighting harmonization, and generating realistic effects like shadows and reflections. Notably, the model explores diverse, natural placements when no positional input is provided, enabling flexibility and accelerating workflows. This solution surpasses existing methods in realism and user satisfaction, setting a new standard for generative compositing.

Finally, Gemma’s thesis culminates in Multitwine (CVPR 2025), a model for simultaneous multi-object compositing, combining text, layout, and exemplar-based inputs. For more information about this model, see pages 8-11 or go ask Gemma directly at her poster session today!

Together, these different approaches form a cohesive framework for controllable image generation, addressing challenges in structural, stylistic, and compositional control. By leveraging diverse input modalities, the generation space is narrowed, producing outputs more closely aligned with the inputs and unlocking greater precision and new creative possibilities.

Congrats, Doctor Gemma!

Gemma Canet Tarrés