Computer Vision News

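Method

The core of the motion module is the estimation of the backward optical flow $T_{S \leftarrow D}$ from a driving frame D to the source frame S. This flow is approximated by a first-order Taylor expansion in the neighbourhood of the keypoint locations. To do that, an abstract reference frame R is assumed, in order to calculate the transformations $T_{S \leftarrow R}$ and $T_{D \leftarrow R}$, considering each transformation $T_{X \leftarrow R}$ (for a frame X) as locally bijective in the neighbourhood of each keypoint. $T_{S \leftarrow R}$ and $T_{D \leftarrow R}$, together with the coefficients of the matrices $\frac{d}{dp} T_{S \leftarrow R}(p)\big|_{p=p_k}$ and $\frac{d}{dp} T_{D \leftarrow R}(p)\big|_{p=p_k}$, are predicted by the keypoint predictor, a standard U-Net architecture that estimates heatmaps which can be interpreted as keypoint detection confidence maps.

To estimate $T_{S \leftarrow D}$, the following is used:

$$T_{S \leftarrow D} = T_{S \leftarrow R} \circ T_{R \leftarrow D} = T_{S \leftarrow R} \circ T_{D \leftarrow R}^{-1} \qquad \text{(Eq. 1)}$$

for which the first-order Taylor expansion is calculated around each keypoint $p_k$.

After having 1) estimated $\hat{T}_{S \leftarrow D}$ and aligned it with the source frame S, by warping the latter according to the local transformations derived from the Taylor expansion of Eq. 1; 2) computed heatmaps $H_k$ indicating where each transformation happens; and 3) concatenated the previous two and processed them with a U-Net, the final dense motion field prediction $\hat{T}_{S \leftarrow D}(z)$ is given by:

$$\hat{T}_{S \leftarrow D}(z) = M_0 z + \sum_{k=1}^{K} M_k \left( T_{S \leftarrow R}(p_k) + J_k \left( z - T_{D \leftarrow R}(p_k) \right) \right) \qquad \text{(Eq. 2)}$$

where $J_k = \left( \frac{d}{dp} T_{S \leftarrow R}(p)\big|_{p=p_k} \right) \left( \frac{d}{dp} T_{D \leftarrow R}(p)\big|_{p=p_k} \right)^{-1}$, the $M_k$ are masks that select where each local transformation applies, and the term $M_0 z$ accounts for the parts of the image that do not move, such as the background.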
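The way the per-keypoint first-order approximations are blended into one dense motion field can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `local_motion` and `dense_motion`, the array shapes, and the `masks` argument (one background weight plus one weight per keypoint, assumed to sum to one at every pixel) are hypothetical choices made for the example. In the actual model the keypoints, the derivative matrices and the weights are all predicted by networks; here they are simply passed in as arrays.

```python
import numpy as np


def local_motion(z, kp_src, kp_drv, jac_src, jac_drv):
    """First-order approximation of the flow S<-D around each keypoint.

    z        : (H, W, 2) pixel coordinates in the driving frame
    kp_src   : (K, 2)    keypoint positions in the source frame
    kp_drv   : (K, 2)    keypoint positions in the driving frame
    jac_src  : (K, 2, 2) derivative of T_{S<-R} at each keypoint
    jac_drv  : (K, 2, 2) derivative of T_{D<-R} at each keypoint
    returns  : (K, H, W, 2) one candidate motion field per keypoint
    """
    # J_k = (d/dp T_{S<-R}) (d/dp T_{D<-R})^{-1}, evaluated per keypoint
    J = jac_src @ np.linalg.inv(jac_drv)
    # Offset of every pixel from each driving keypoint: (K, H, W, 2)
    offset = z[None] - kp_drv[:, None, None, :]
    # Apply J_k to every offset, then shift by the source keypoint
    moved = np.einsum("kij,khwj->khwi", J, offset)
    return kp_src[:, None, None, :] + moved


def dense_motion(z, masks, local):
    """Blend the identity field and the K local fields with per-pixel weights.

    masks : (K + 1, H, W); masks[0] weights the unmoved background term
    local : (K, H, W, 2) output of local_motion
    """
    background = masks[0][..., None] * z
    foreground = (masks[1:][..., None] * local).sum(axis=0)
    return background + foreground
```

If both transformations are affine, the first-order expansion is exact, which gives a convenient sanity check for a sketch like this one.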

Computer Vision News - June 2020