WACV 2026 Daily - Monday

Changlin Song

Campbell. “He asked me, what can we do with an object that is out of frame?” Changlin recalls. “There are so many works that are trying to improve the accuracy of things within the frame, but there’s not much we can do about things outside.”

One solution would be to use generative models to expand the image and then perform detection. “Generative models are definitely a good way,” Changlin says, “but it’s very slow.” For applications where detection must occur instantaneously, particularly in safety-related contexts, that latency is a serious limitation.

That led him to a different line of thinking. “Actually, we don’t need the pixel information,” he explains. “Generative models will produce that pixel information, so can we try a lightweight method that only focuses on the things that we care about?”

The technical challenge quickly became clear. Expanding the frame in a model’s latent space substantially increases the region’s size. “It’s like exponentially expanding the region we are going to detect,” Changlin tells us. “You are predicting a lot from a little.”

At the same time, faces, like many objects, are sparsely distributed. Searching the entire expanded region at full resolution would introduce heavy computational overhead. The team therefore designed

