Namdar Homayounfar is a PhD student at the Department of Statistical Sciences at the University of Toronto and is also currently interning at Uber.

The goal of their work, Namdar told us, is to localise a sports field in an image. Doing this is the first step in data generation in sports, because once the location of the field is known, the players can be localised and more statistics about them can be extracted, such as how much they run, which positions they occupy, or whether they are offside.
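To make the step from field localisation to player statistics concrete, here is a minimal sketch. It assumes that the localisation is available as a 3x3 homography matrix mapping image pixels to field coordinates in metres, and that a player's pixel positions over time are already known. The function names `image_to_field` and `distance_run` are hypothetical and chosen for illustration; this is not the authors' code.

```python
import numpy as np

def image_to_field(H_img_to_field, pixels):
    """Map (x, y) pixel positions to field coordinates with a 3x3 homography."""
    pixels = np.asarray(pixels, dtype=float)
    # Append a 1 to each point to work in homogeneous coordinates.
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])
    mapped = homog @ np.asarray(H_img_to_field, dtype=float).T
    # Divide by the third coordinate to return to ordinary 2D points.
    return mapped[:, :2] / mapped[:, 2:3]

def distance_run(field_positions):
    """Total path length (in field units) along a sequence of positions."""
    steps = np.diff(np.asarray(field_positions, dtype=float), axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())

# Toy example: an assumed homography that simply scales pixels to metres.
H = np.diag([0.1, 0.1, 1.0])
track_pixels = [(0, 0), (30, 40), (30, 100)]
track_field = image_to_field(H, track_pixels)
total = distance_run(track_field)
```

With a real camera view the homography would include perspective terms, but the same two steps apply: project the positions onto the field, then measure on the field rather than in the image.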

The novelty of this work is the way they formulated and solved the problem: they are the first to use single-image (monocular) inputs, and their approach is fully automatic, very fast and exact.

Originally, Namdar was trying to solve a different problem: to automatically generate captions and statistics about the players. But he soon realised that in order to do so, they first needed to localise the field. He tried using existing methods, but after a few months had to conclude that they did not perform well enough. “This problem could be solved if we had four points from the image and four points from the model, and we tried to estimate the homography matrix directly”, he explains. But figuring out which four points in the image correspond to which four points in the model turns out to be a very difficult problem.
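For readers who want to see the four-point idea from the quote in practice, the sketch below estimates a homography from known correspondences using the standard direct linear transform (DLT). It covers only the easy half of the problem, since it assumes the correspondences are already given. The field size (105 x 68 metres) and the pixel coordinates are made-up values for illustration, and the function names are hypothetical.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 matrix H with dst ~ H @ src from 4 or more point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Fix the arbitrary scale (assumes H[2, 2] is non-zero, true for typical views).
    return H / H[2, 2]

def apply_homography(H, pts):
    """Apply H to an array of (x, y) points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Assumed example: field corners in metres and where they appear in the image.
model_corners = [(0, 0), (105, 0), (105, 68), (0, 68)]
image_corners = [(120, 400), (1150, 380), (900, 150), (300, 160)]
H = homography_from_points(model_corners, image_corners)
reprojected = apply_homography(H, model_corners)
```

Once the matrix is known, any point on the field model can be projected into the image, and the inverse matrix maps image points back onto the field.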

Besides just localising the sports field, they also wanted to have additional information about it, like where the grass is, where the lines of the field are, or where the outside of the field is. It is very hard to do this using handmade heuristics, due to the variations between different fields. Therefore, they came up with a new machine learning method to solve all of these problems of field localisation.

They use a neural network model that tells them, for each pixel, exactly what it contains, and thus solves the problem of both field localisation and answering more specific questions. These predictions are done per image, and Namdar tells us that in the future, they want to do this in a temporal manner, for videos instead of single images, incorporating temporal priors so that there is a smooth transition of homographies between the frames.
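As a rough illustration of what a per-pixel prediction looks like, the sketch below turns an array of per-class scores into a label map and a mask for one class. The class list and the score values are assumptions made for the example; the authors' network and its outputs are not described in this article.

```python
import numpy as np

# Hypothetical label set, based on the categories mentioned in the text.
CLASSES = ("grass", "line", "outside")

def label_pixels(scores):
    """Turn scores of shape (num_classes, H, W) into an (H, W) map of class indices."""
    scores = np.asarray(scores, dtype=float)
    # Each pixel receives the index of its highest-scoring class.
    return scores.argmax(axis=0)

def class_mask(labels, name):
    """Boolean (H, W) mask of the pixels assigned to the named class."""
    return labels == CLASSES.index(name)

# Toy 2x2 image with made-up scores for the three classes.
scores = np.array([
    [[0.90, 0.10], [0.20, 0.30]],  # grass
    [[0.05, 0.80], [0.10, 0.30]],  # line
    [[0.05, 0.10], [0.70, 0.40]],  # outside
])
labels = label_pixels(scores)
outside = class_mask(labels, "outside")
```

A mask like `outside` is the kind of additional information mentioned above: it answers where the outside of the field is, pixel by pixel.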

Namdar’s supervisors and co-authors, Raquel Urtasun and Sanja Fidler (see pages 10-14 of this magazine), attentively followed our discussion; when asked to name the main appeal of this work, Raquel told us that “the key of this work is to come up with a parameterisation of the problem that allows you to do efficient inference by taking into account the structure of the problem and the advantages of convolutional neural networks and deep learning”.

Monday

Sports Field Localization via Deep Structured Models

Besides just localising the sports field, they wanted to also have additional information

BEST OF CVPR