Gemma Canet Tarrés just finished her PhD at the University of Surrey, where she worked on improving controllability in image generation models under the supervision of John Collomosse and Andrew Gilbert. She defended her thesis a couple of weeks ago. You will find a short review of her thesis just after this article. During her PhD, she completed two internships at Adobe and is currently interning at Amazon. She is now on the market for exciting new opportunities where she can continue learning and contributing to the field of generative AI. She's a catch, so take her before it's too late!

Multitwine: Multi-Object Compositing with Text and Layout Control

CVPR Daily, Friday: Highlight Presentation

As far as Gemma is aware, this paper is the first to composite multiple objects at the same time. It also adds layout and text control, giving more controllability over the final image.

But why do we need this? So far, there has been a lot of work on multi-object subject-driven generation, but for object compositing, which is a very important task in many editing pipelines, there is nothing that handles multiple objects at the same time. If we add objects sequentially, some things are very hard or almost impossible to do. Adding two people hugging requires reposing both people at the same time. Adding a person walking a dog involves the person, the dog, and also the leash. Interactions like these are very hard to achieve by compositing one object at a time, so simply providing a way to add multiple objects simultaneously is an important contribution.

The task is challenging because it carries all the challenges of single-object compositing at the same time. You have to reharmonize the object, blend the