Архитектура Аудит Военная наука Иностранные языки Медицина Металлургия Метрология
Образование Политология Производство Психология Стандартизация Технологии


First-Order Least Squares Solution



=
Instead of a single point, we can use a small neighborhood to deter- mine the optical fl ow. We assume that the optical fl ow is constant in this region and discuss in this section under which conditions an unam- biguous determination of the optical fl ow is possible. We still have the two unknowns f [f1, f2]T, but we also have the continuity constraint Eq. (14.13) for the optical fl ow at many points. Thus we generally end up with an overdetermined equation system. Such a system cannot be solved exactly but only by minimizing an error functional. We seek a so- lution that minimizes Eq. (14.13) within a local neighborhood in a least squares sense. Thus, the convolution integral

2
2
e ⊗ = ∫ w( x x ', t − t').f g ( x ') + f g ( x ') + g ( x ')Σ d2x'dt'


2                                                                       1  x                       2  y                       t

− ∞


 

(14.15)


⊗ ⊗
=
should be minimized. Note that f [f1, f2]T is constant within the local neighborhood. It depends, of course, as e , on x. For the sake of more compact equations, we omit the explicit dependency of gx, gy, and gt on the variable x ' in the following equations. The partial derivative

∂ g/∂ p is abbreviated by gp.

In this integral, the square of the residual deviation from the conti- nuity constraint is summed up over a region determined by the size of the window function w. In order to simplify the equations further, we use the following abbreviation for this weighted averaging procedure:

2
2
e ⊗ 2 =.f1gx + f2gy + gtΣ   .                                      (14.16)

The window function w determines the size of the neighborhood. This makes the least-squares approach very fl exible. The averaging in Eq. (14.15) can be but must not be extended in the temporal direction. If we choose a rectangular neighborhood with constant weighting for all points, we end up with a simple block matching technique. This corre- sponds to an averaging with a box fi lter. However, because of the bad averaging properties of box fi lters (Section 11.3), an averaging with a weighting function that decreases with the distance of the point [ x, t]T from [ x, t]T appears to be a more suitable approach. In continuous space, averaging with a Gaussian fi lter is a good choice. For discrete images, averaging with a binomial fi lter is most suitable (Section 11.4).

Equation (14.16) can be solved by setting the partial derivatives

2                                                                                 


∂ ⊗ e 2

 

∂ f1

2
∂ ⊗ e ⊗ 2

 

∂ f2


=  2gx.f1gx + f2gy + gtΣ  =!   0,

= 2gy.f1gx + f2gy + gtΣ  =!   0


 

(14.17)


14.3 First-Order Diff erential Methods†                                                            393

to zero. From this condition we obtain the linear equation system


 gxgx gxgy

 


 Σ f1


Σ = − 


gxgt

 


 ,             (14.18)


 gxgy gygy          f2


 gygt 


or more compact in matrix notation

Gf = g.                                                  (14.19)

The terms gpgq represent regularized estimates that are composed of convolution and nonlinear point operations. In the operator notation, we can replace it by

B(Dp · Dq),                                                 (14.20)

B
D
where p is a suitable discrete fi rst-order derivative operator in the di- rection p (Chapter 12) and an averaging operator (Chapter 11). Thus, the operator expression in Eq. (14.20) includes the following sequence of image processing operators:

1.

D     D
Apply the convolution operators          p and   q to the image to obtain images with the fi rst-order derivatives in directions p and q.

2. Multiply the two derivative images pointwise.

3. Convolve the resulting image with the averaging mask B.

Note that the point operation is a nonlinear operation. Therefore, it must not be interchanged with the averaging.

The linear equation system Eq. (14.18) can be solved if the matrix can be inverted. This is the case when the determinant of the matrix is not zero:

det G = gxgx gygy − gxgy 2 ≠ 0.                                      (14.21)

From this equation, we can deduce two conditions that must be met:

1. Not all partial derivatives gx and gy must be zero. In other words, the neighborhood must not consist of an area with constant gray values.

2. The gradients in the neighborhood must not point in the same direc- tion. If this were the case, we could express gy by gx except for a constant factor and the determinant of G in Eq. (14.21) would vanish.

The solution for the optical fl ow f can be written down explicitly because it is easy to invert the 2 × 2 matix G:


G − 1 =  1  Σ     gygy − gxgy

 

                


Σ if det G ≠ 0.            (14.22)


det G   − gxgy              gxgx

Σ
With f = G − 1 g we then obtain


Σ
Σ =−
f1               1

f2                   det G


gxgt gygy − gygt gxgy gygt gxgx − gxgt gxgy


Σ .            (14.23)


394                                                                                                                  14 Motion

 

The solution looks still quite complex. It can be simplifi ed considerably by observing that G is a symmetric matrix. Any symmetric matrix can be brought into diagonal form by a rotation of the coordinate system into the so-called principle-axes coordinate system. Then the matrix G ' reduces to


G' =
  gx' gx'                 0

0      gy' gy'

 


 ,                            (14.24)


the determinant det G ' = gx' gx' gy' gy', and the optical fl ow is


 

Σ
f1'

f2'


Σ = − 

 


gx ' gt

gx' gx'

gy ' gt

gy' gy'


 .                           (14.25)


This equation refl ects in a quantitative way the qualitative discussion about the aperture problem discussed in Section 14.2.2. The principal axes are oriented along the directions of the maximum and minimum mean square spatial gray value changes, which are perpendicular to each other. Because the matrix G is diagonal, both changes are uncorrelated. Now, we can distinguish three cases:

1. gx' gx' > 0, gy' gy' > 0: spatial gray value changes in all directions. Then both components of the optical fl ow can be determined.

2.

=
gx' gx' > 0, gy' gy' 0: spatial gray value changes only in x' direction (perpendicularly to an edge). Then only the component of the opti- cal fl ow in x' direction can be determined (aperture problem). The component of the optical fl ow parallel to the edge remains unknown.

3.

=        =
gx' gx' gy' gy' 0: no spatial gray value changes in both directions. In the case of a constant region, none of the components of the optical fl ow can be determined at all.

It is important to note that only the matrix G determines the type of solution of the least-squares approach. In this matrix only spatial and no temporal derivatives occur. This means that the spatial derivatives and thus the spatial structure of the image entirely determines whether and how accurately the optical fl ow can be estimated.

 































Error Analysis

Noise may introduce a systematic error into the estimate of the optical fl ow. Here we show how we can analyze the infl uence of noise on the determination of optical fl ow in a very general way. We assume that the image signal is composed of a structure moving with a constant velocity u superimposed by zero-mean isotropic noise:

g'( x, t) = g( x u t) + n( x, t).                                      (14.26)


14.3 First-Order Diff erential Methods†                                                            395

 

−         −
This is a very general approach because we do not rely on any specifi c form of the gray value structure. The expression g( x u t) just says that an arbitrary spatial structure is moving with a constant velocity u. In this way, a function with three parameters g(x1, x2, t) is reduced to a function with only two parameters g(x1 u1t, x2 u2t). We further assume that the partial derivatives of the noise function are not corre- lated with themselves or the partial derivatives of the image patterns.

Therefore we use the conditions

n
n = 0,   npnq = σ 2δ p− q,           gpnq = 0,                      (14.27)

 

and the partial derivatives are

g' = g + n gt' = − u g + ∂ tnt.                                 (14.28)

These conditions result in the optical fl ow estimate

         
   

f = u ( g gT + n nT )− 1 g gT .                                (14.29)

∇ ∇
∇ ∇
The key to understanding this matrix equation is to observe that the noise matrix n nT is diagonal in any coordinate system, because of the conditions set by Eq. (14.27). Therefore, we can transform the equa- tion into the principal-axes coordinate system in which g gT is diag- onal. Then we obtain


n
=
f  u Σ gx' 2 + σ 2


0         −     gx' 2           0

1
Σ
Σ
Σ .
2         2                                                 2

                                                                                                              


0        gy'


+ σ n


0  gy'


 

When the variance of the noise is not zero, the inverse of the fi rst matrix always exists and we obtain

gx' 2                             0       


n
f = u


gx' 2 + σ 2                             

g
' 2 + σ 2
0          gy' 2


 .                  (14.30)

 


      

 

This equation shows that the estimate of the optical fl ow is biased to- wards lower values. If the variance of the noise is about the squared magnitude of the gradient, the estimated values are only about half of the true values. Thus the diff erential method is an example of a non- robust technique because it deteriorates in noisy image sequences.

If the noise is negligible, however, the estimate of the optical fl ow is correct. This result is in contradiction to the widespread claim that dif- ferential methods do not deliver accurate results if the spatial gray value structure cannot be adequately approximated by a fi rst-order Taylor se- ries (see, for example, [172]). Kearney et al. [94], for instance, provided


396                                                                                                                  14 Motion

 

an error analysis of the gradient approach and concluded that it gives erroneous results as soon as second-order spatial derivatives become signifi cant.

These contradictory fi ndings resolve if we analyze the additional er- rors in the estimation of optical fl ow that are introduced by an inad- equate discretization of the partial derivative operators (see the dis- cussion on optimal derivative fi lters in Section 12.3). The error in the optical fl ow estimate is directly related to the error in the direction of discrete gradient operators (compare also the discussion on orientation estimates in Section 13.3.6). Therefore accurate optical fl ow estimates require carefully optimized derivative operators such as the optimized regularized gradient operators discussed in Section 12.5.5.

 

14.3.4 Spatiotemporal Energy Models‡

Models using Gabor-like quadrature fi lters (Section 13.4.5) are common in bi- ological vision. They are the basis for so-called spatiotemporal energy models [1, 2, 67]. This term can easily be misunderstood. It is not the kinetic energy of the moving objects that is referred to but the energy (squared amplitude) of a signal at the sensor in a certain k ω interval. Here we want to compare this type of model with the diff erential method discussed previously.

R L      S
One of the simplest models for 1-D motion vision uses just three quadrature fi lters. This set of directional fi lters detects objects moving to the right or to the left, and those which are not moving. We denote the squared magnitude of these quadrature operators by, , and. Then we can obtain an estimate of the 1-Doptical fl ow by using the operator [1, 2]:

S
U= R− L.                           (14.31)

An interesting interconnection of this approach with the diff erential method (Section 14.3.2) can be found, so that the diff erential method can also be under- stood as an energy extraction method.

We perform this comparison here for the analysis of 1-D motion, i. e., in a 2-D space-time image. In this case, the solution of the diff erential method can be written in operator notation according to Eq. (14.25) as

U =−
'          Bxt (Dt · Dx) .                                         (14.32)

Bxt (Dx · Dx)

B
We rewrite this equation with a slight modifi cation to smooth the images with the binomial mask xt before we apply the derivative operators, i. e., we use a regularized derivative operator (Section 12.5):

U =−
'          Bxt [(DtBxt) · (DxBxt)].                                  (14.33)

Bxt [(DxBxt) · (DxBxt)]

B
Smoothing with xt means a regularization of the derivative operator. The indices xt indicate that the smoothing is performed along both the temporal and spatial axes. Using the operator identity

4
1

AB =    Σ (A+ B)2 − (A− B)2Σ                                    (14.34)


14.3 First-Order Diff erential Methods†                                                           397

A                                                                    b


1

w

0.5


 

1

w

0.5


 


-1

-0.5

0


 

 

0.5

k

c


1 -1


0

 

 

-0.5


-1

-0.5

0


 

 

0.5

k


 

1 -1


0

 

 

-0.5


 

 

1

w

0.5

 


-1

-0.5

0


 

 

0.5

k


 

1 -1


0

 

 

-0.5


Figure 14.13: Transfer functions for the convolution operators Eq. (14.35) to detect objects moving right or left, or at rest: a R', b L', and c S'.

 

and the abbreviations

R' = (Dx + Dt)Bxt,            L' = (Dx − Dt)Bxt,            S' = 2DxBxt,                (14.35)

we can rewrite Eq. (14.33) and obtain a very similar expression as Eq. (14.31):

U =
' Bxt(R' · R' − L' · L').                                       (14.36)

Bxt(S' · S')

B
R  L         S
The fi lters ', ', and ' are regularized derivative fi lters. The transfer func- tions show that objects moving to the right, to the left, and at rest are selected (Fig. 14.13). These fi lters are not quadrature fi lters. The squaring of the fi lter responses and further smoothing with xt, however, approximately results in a phase-independent detection of the squared amplitude as with a quadrature fi lter under certain conditions. Let us assume a fi ne-scale periodic structure. The derivative fi lters will preserve these structures but remove the mean gray value. The subsequent squaring of the zero-mean fi lter results yields a mean


398                                                                                                                  14 Motion

 

gray value with half of the squared amplitude of the gray value variations and a rapid spatial oscillation of the gray values with the double wave number (half the wavelength). If the subsequent smoothing removes these fast oscillations, a phase-independent response to the fi lter is obtained just as with a quadra- ture fi lter. In contrast to quadrature fi lters, this result can only be achieved in regions where the scales of the structures are so fi ne that the doubled wave number can be removed with the smoothing fi lter.

 



















Tensor Methods

The tensor method for the analysis of local orientation has already been discussed in detail in Section 13.3. Since motion constitutes locally ori- ented structure in space-time images, all we have to do is to extend the tensor method to three dimensions. First, we will revisit the optimiza- tion criterion used for the tensor approach in Section 14.4.1 in order to distinguish this technique from the diff erential method (Section 14.3).

 

Optimization Strategy

In Section 13.3.1 we stated that the optimum orientation is defi ned as the orientation that shows the least deviations from the direction of the gradient vectors. We introduced the squared scalar product of the gra- dient vector and the unit vector representing the local orientation k ¯   as an adequate measure:

( g k ¯ )2 = | g|2 cos2.∠ ( g, k ¯ )Σ  .                               (14.37)

T
This measure can be used in vector spaces of any dimension. In order to determine orientation in space-time images, we take the spatiotemporal gradient


T
g = Σ ∂ g, ∂ g, ∂ g Σ


= Σ g


, g, g Σ              (14.38)


 

and write


xt                ∂ x ∂ y ∂ t


x   y    t


( xtg k ¯ )2 = | xtg|2 cos2.∠ ( xtg, k ¯ )Σ  .                              (14.39)

For the 2-Dorientation analysis we maximized the expression

∫  w( x x '). g( x ') k ¯ Σ 2 dW x' =. g( x ') k ¯ Σ 2                                           (14.40)

in order to fi nd the optimum orientation. For analysis of motion in space- time images, we are not interested in the direction of maximal gray value changes but in the direction of minimal gray value changes. We denote this orientation by the unit vector e ¯ 3 = [e31, e32, e33]T.  This 3-D vector


14.4 Tensor Methods                                                                    399

Σ
is, according to the considerations in Section 14.2.4 Eq. (14.2), related to the 2-Dvelocity vector by


=
  1

e33


e31 e32


Σ .                                    (14.41)


By analogy to Eq. (14.40) we therefore minimize

∫  w( x x ', t − t'). xtg( x ', t') e ¯ 3Σ 2 d2x'dt' = ( xt e ¯ )2.                         (14.42)

 

The window function w now also extends into the time coordinate and determines the size and shape of the neighborhood around a point [ x, t]T in which the orientation is averaged. Equation (14.42) has to be com- pared with the corresponding expression that is minimized with the dif- ferential method (Eq. (14.16)):

2

.          Σ f ∇ g + g                                     .                 (14.43)
t

Note the subtle diff erence between the two optimization strategies Eqs. (14.42) and (14.43). Both are least squares problems for determining the velocity in such a way that the deviation from the continuity of the optical fl ow becomes minimal. The two methods diff er, however, in the parameters to be estimated. The estimation of a 3-D unit vector turns out to be a so-called total least squares problem [76]. This method is more suitable for the problem because all components of the space-time gradient are though to have statistical errors and not only the temporal derivative as in Eq. (14.43).

By analogy to the discussion in Section 13.3.1 we can conclude that the determination of the optical fl ow in a space-time image is equiva- lent to fi nding the eigenvector   e ¯ 3 of the smallest  eigenvalue λ 3 of the structure tensor J:

         
   

 gxgx gxgy gxgt


J = gxgy gygy  gygt

 gxgt gygt  gtgt

where gpgq with p, q ∈ {x, y, t} is given by


,                        (14.44)


gpgq( x, t) = ∫ w( x x ', t − t')gp( x ', t')gq( x ', t')d2x'dt'.                                   (14.45)

At this point, we can compare the tensor method again with the diff er- ential technique. While the tensor method essentially performs an eigen- value analysis of a symmetric tensor with six regularized products of spatial and temporal derivatives, the diff erential method uses the same products but only fi ve of them. Thus the diff erential technique misses gtgt. We will see in the next section that this additional term enables the tensor method to detect whether a local neighbor shows a constant velocity or not. This is not possible with the diff erential method.


400                                                                                                                  14 Motion

 



















Eigenvalue Analysis

×
×
×
Unfortunately, the eigenvalue analysis of a symmetric 3 3 tensor is not as simple as for a symmetric 2 2 tensor. In two dimensions we could solve the eigenvalue problem directly. In Section 13.3.3 we transformed the three independent components of the symmetric 2 2 tensor into three parameters: the orientation and the rotation-invariant certainty and coherency measures (Section 13.3.4).

×
The symmetric 3 3 tensor now contains six independent compo- nents and we need to fi nd a corresponding number of parameters that describe the local structure of the space-time image adequately. Again it is useful to decompose these six parameters into rotation-variant and rotation-invariant parameters.

As already mentioned, the solution of the eigenvalue problem cannot be written down readily. It requires a suitable numerical algorithm. We will not detail this problem since it is a nontrivial but standard problem of numerical mathematics for which a number of effi cient solutions are available [Press et al., 1992 and Golub and van Loan, 1989]. Thus we assume in the following that we have solved the eigenvalue problem and have obtained a set of three orthonormal eigenvectors and three eigen- values. With the solution of the eigenvalue problem, we have essentially obtained the principal-axes coordinate system in which the structure tensor is diagonal and contains the eigenvalues as diagonal elements:

 λ 1  0     0  


J ' =  


0 λ 2 0

0  0 λ 3


  .                               (14.46)


 

Without restricting generality, the eigenvalues are sorted in descending order:

λ 1 ≥ λ 2 ≥ λ 3 ≥ 0.                                               (14.47)

The principal-axes coordinate system is spanned by the three eigen- vectors. The rotation into this coordinate system requires three indepen- dent parameters as we have discussed in Section 7.2.2. Thus, three of the six parameters are used up to describe its orientation in the space- time domain. This information is contained in the three orthonormal eigenvectors.

The remaining parameters are the three rotation-invariant eigenval- ues. We will now show how the diff erent classes of local structures in space-time images can be diff erentiated by the values of the three eigen- values. This approach will also help us fi nd an effi cient implementation of the tensor-based motion analysis. Four diff erent classes of neighbor- hoods can be distinguished in a space-time image, corresponding to a rank from 0 to 3 for the symmetric tensor:


14.4 Tensor Methods                                                                    401

Constant gray value. All elements and eigenvalues of the structure tensor are zero.

λ 1 = λ 2 = λ 3 = 0.                                              (14.48)

The rank of the tensor is also zero. Therefore no velocity at all can be obtained. This condition is easy to recognize. The sum of the eigenval- ues must be below a critical level, determined by the noise level in the image sequence. As the sum of the eigenvalues is equal to the trace of the tensor, no eigenvalue analysis is required to sort out this condition:

 

.
3

trace( J ) =    gpgp < γ,                                     (14.49)

p=1

where γ is a suitable measure for the noise level in the image sequence. For all points where the condition Eq. (14.49) is met, the eigenvalue analy- sis can be skipped completely.

 

Spatial orientation and constant motion. In this case two eigenvalues are zero since the gray values only change in one direction:

 

λ 1 > 0 and λ 2 = λ 3 = 0.                                        (14.50)

The rank of the tensor is one. The spatial gray value structure shows linear symmetry. This condition can easily be detected again without performing an eigenvalue analysis, because the determinant of the lead- ing 2 × 2 subtensor should be below a threshold γ 2:

gxgx gygy − gxgy 2 < γ 2.                                            (14.51)

The eigenvector e ¯ 1 belonging to the only nonzero (i. e., largest) eigen- value points in the direction of maximum change of gray values. Thus it gives both the spatial orientation and the velocity in this direction. Note that only the normal velocity, i. e., the velocity in the direction of the spatial gradient, can be obtained because of the aperture problem (Section 14.2.2). The spatial orientation is given by the two spatial co- ordinates of the eigenvector e ¯ 1 of the largest eigenvalue. As the normal optical fl ow is in this direction, it is given by


e1t

f =−

  


Σ e1x


Σ                     (14.52)


+ e
2           2

1x        1y


e1y


and the magnitude of the normal optical fl ow reduces to


e
+ e
1t
f             2

 

 

 

2

e
 1t  .

 

 

               


| |=  e2


2 =  1 − e2


(14.53)


402                                                                                                                  14 Motion

 

Distributed spatial structure and constant motion. In this case only one eigenvalue is zero:

λ 1, λ 2 > 0 and λ 3 = 0.                                         (14.54)

Σ
As the motion is constant, the principal-axes coordinate system is mov- ing with the scene.  The eigenvector e ¯ 3 with the zero eigenvalue points in the direction of constant gray values in the space-time domain. Thus the optical fl ow is given by


=
  1

e3t


e3x e3y


Σ                            (14.55)


3x
and its magnitude by                                           


   e2


2             1 − e2

 

                  


| f |= 


 

+ e
3t
3y
= 
2                                 2

3t                               3t


.                        (14.56)


Distributed spatial structure and non-constant motion. In this case all three eigenvalues are larger than zero and the rank of the tensor is three:

λ 1, λ 2, λ 3 > 0.                                                 (14.57)

No useful optical fl ow estimate can be obtained in this case.

After this detailed classifi cation, we turn to the question of which three rotation-invariant parameters can be extracted from the structure tensor in order to obtain a useful description of the local structure in- dependent of the velocity and the spatial orientation of the gray scale parameters.

 

×
Certainty measure. The fi rst parameter is again a certainty measure that gives a measure for the gray value changes. We have two choices. Either we could take the mean square spatial gradient (trace of the up- per 2 2 subtensor) or the mean square spatiotemporal gradient. From a practical point of view the mean square spatial gradient is to be pre- ferred because the spatial gradient does not change in a sequence if the velocity is increasing. The mean square spatiotemporal gradient, how- ever, increases with increasing velocity because higher temporal gradi- ents are added. Thus, surprisingly, the mean square spatial gradient is the better certainty measure:

cc = gxgx + gygy.                                              (14.58)

Spatial coherency measure. As a second measure we take the already known coherency measure from the analysis of 2-Dlocal neighborhoods (Section 13.3.4) and denote it here as the spatial coherency measure:

 

(gxgx − gygy)2 + 4gxgy 2

cs =      (gxgx + gygy)2                         .                     (14.59)


14.5 Second-Order Diff erential Methods‡                                                     403

Its value is between 0 and 1 and decides whether only the normal optical fl ow or both components of the optical fl ow can be determined.

Total coherency measure. Finally, we need an additional measure that tells us whether we encounter a local neighborhood with a constant ve- locity or not. This measure should be independent of the spatial co- herency. The following measure using the largest and smallest eigenval- ues meets this condition:


λ         Σ − λ =t
2

c         1        3

λ 1 + λ 3


.                                     (14.60)


=
The total coherency measure is one as soon as the eigenvalue λ 3 is zero. The other two eigenvalues may then take any other values. The total coherency approaches zero if all three eigenvalues are equal. In contrast to the other two measures cc and cs, the total coherency requires an eigenvalue analysis since the smallest and largest eigenvalues are needed to compute it. There is one caveat with this measure. It is also one with a spatially oriented pattern and a non-constant motion. This special case can be recognized, however, from the condition that both the spatial and total coherency are one but that only one eigenvalue is zero. Another simple criterion is that the eigenvector to the zero eigenvalue lies in the xy plane. This implies that e33 0, so according to Eq. (14.55) we would obtain an infi nite value for the optical fl ow vector.

 

14.5 Second-Order Diff erential Methods‡

14.5.1 Basics‡

The fi rst-order diff erential method has the basic fl aw that the continuity of the optical fl ow gives only one constraint for the two unknown components of the velocity (Section 14.3.1). So far we could only make up for this defi cit by modeling the velocity and thus using a whole neighborhood to determine one estimate of the optical fl ow vector (Section 14.3.2).

An alternative approach is to use multiple feature or multichannel images. Then we can have two or more independent constraints at the same location and thus may be able to determine both components of the optical fl ow at a single point. The essential point, however, is that the new feature must bring really new information into the image. It does not help at all if a new feature is closely related to those already used.

In this way we come to an important generalization of the diff erential method. We can apply any preprocessing to image sequences, or can extract arbitrary feature images and apply all the methods discussed so far. If the continuity of the optical fl ow is preserved for the original image, it is also for any fea- ture image derived from the original image. We can both apply nonlinear point operators as well as any neighborhood operator.


404                                                                                                                  14 Motion

 

14.5.2 Direct Second-Order Solution‡

We fi rst discuss the technique of Girosi et al. [52]. He applied the continuity of the optical fl ow to two feature images, namely the horizontal and vertical spatial derivative:


f gx + gxt = 0

f gy + gyt = 0.


 

(14.61)


The usage of horizontal and vertical derivative images thus results into a second- order diff erential method with the solution

f = − H − 1 gt  if       det H ≠ 0,                               (14.62) where H is the Hesse matrix as defi ned in Eq. (12.4).

14.5.3 Multi-Feature Least Squares Solution‡

If we also include the standard optical fl ow equation, we end up with an overde- termined linear equation system with three equations and two unknowns:


 gx    gy


  f1 


 gt 


 
      
gxx  gxy

gxy  gyy


      f2


=− gxt gyt


  .                     (14.63)


In this respect, fusion of images gained from diff erent sensors may be a promis- ing method. Markandey and Flinchbaugh [118], for example, used multispectral imagery, one visible and one IR image. Image sequence processing of scenes illuminated with light sources from diff erent directions has been studied by Woodham [201]. This approach is especially interesting since it has the poten- tial to detect specular refl exes and thus to exclude an important source of errors in motion estimation.

 

14.5.4 Diff erential Geometric Modeling‡

The discussion in the last sections clearly showed that the spatial structure of the gray values governs the determination of motion. The fi rst-order diff erential method does not adequately account for this basic fact as we have just relied on fi rst-order spatial derivatives. Second-order diff erential methods provide a direct solution, provided the Hesse matrix can be inverted (Eq. (14.61)). In this section, we approach diff erential methods from a diff erent point of view using diff erential geometry. We assume that the gray value structure in the two consecutive images diff ers only by a constant displacement s:

g ( x − 1/2 s, t1) = g ( x + 1/2 s, t2).                                       (14.64)

This approach is another formulation of the continuity equation assuming only a translation of the image and neglecting any rotation or deformation of surface elements. We simply assume that the velocity fi eld does not change in a small neighborhood. For the sake of symmetry, we divide the displacement evenly among the two images. With the assumption that the displacement vector s


14.5 Second-Order Diff erential Methods‡                                                     405

 

=
and the size of the surface element are small, we can expand the gray value in both images at the point x 0 in a Taylor expansion. First we consider a fi rst- order expansion, i. e., we approximate the gray value distribution by a plane:

 

g ( x ± 1/2 s ) = g0 + ∇ g · ( x ± 1/2 s ).                                  (14.65)

The planes in both images must be identical except for the displacement s. We sort the term in Eq. (14.65) in increasing powers of x in order to be able to perform a coeffi cient comparison

     
 

g ( x ± 1/2 s ) = g0 ±1/2 ∇ g s + ∇ g x.                                         (14.66)


   off ˛ s¸ et


r  sl˛ o¸ pre


=
The fi rst and second term contain the off set and slope of the plane, respectively. We can now estimate the displacement s (p, q)T from the condition that both planes must be identical. Consequently, the two coeffi cients must be identical and we obtain two equations:

g0(t1) − g0(t2) = 1/2.∇ g(t1) + ∇ g(t2 s,


∇ g(t1)          =                ∇ g(t2).


(14.67)


 

The second equation states that the gradient must be equal in both images. Otherwise, a plane fi t of the spatial gray value does not seem to be a useful representation. The fi rst equation corresponds to the continuity of the optical fl ow Eq. (14.9). In Eq. (14.67) only the temporal derivative is already expressed in a discrete manner as the diff erence of the mean gray values in both images. Another refi nement is also due to the digitization in time. The gradient is re- placed by the mean gradient of both images. Moreover, we use the displacement vector fi eld (DVF) s instead of the optical fl ow f . As expected, a plane fi t of the gray value distribution does not yield anything new. We are still only able to estimate the velocity component in the direction of the gray value gradient.

Therefore a Taylor series expansion of Eq. (14.64) to the second order gives

 

g ( x ± 1/2 s )  =                                                 g0

+            gx · (x ± 1/2s1) + gy ·.y ± 1/2s2Σ

 

+  1/2gxx · (x ± 1/2s1)2 + 1/2gyy ·.y ± 1/2s2Σ

+                gxy · (x ± 1/2s1).y ± 1/2s2Σ  .

Nagel [128] performed a very similar modeling of the gray value geometry, ex- panding it in a Taylor series up to second order. However, he ended up with quite complex nonlinear equations, which could be solved easily only for spe- cial conditions. He termed them gray value corner and gray value extreme. The reason for the diff erent results lies in the approach towards the solution. Nagel compares the Taylor expansion in the two images in a least square sense, while here a direct coeffi cient comparison is performed.

A comparison of the coeffi cients of the second-order expansion yields six equa- tions in total. The quadratic terms yield three equations which state that all


406                                                                                                                  14 Motion

 

second-order spatial derivatives must coincide in both images:

 

gxx(t1)  =  gxx(t2),

gyy(t1) =  gyy(t2),

gxy(t1) =  gxy(t2).

If this is not the case, either the second-order expansion to the gray value dis- tribution does not adequately fi t the gray value distribution or the presumption of a constant displacement in the neighborhood is not valid. The coeffi cient comparison of the zero- and fi rst-order terms results in the following three equations:


− (g0(t2) − g0(t1))          =


2.gx(t1) + gx(t2)Σ s1

1
2
+ 1 .gy(t1) + gy(t2)Σ  s2,


 

(14.69)


− (gx(t2) − gx(t1))           =          gxxs1 + gxys2,

− (gy(t2) − gy(t1))  =                       gyys2 + gxys1.

Surprisingly, the coeffi cient comparison for the zero-order term (off set) yields the same result as the plane fi t Eq. (14.67). This means that the DVF is computed correctly by a simple plane fi t, even if the gray value distribution is no longer adequately fi tted by a plane but by a second-order polynomial.

The two other equations constitute a simple linear equation system with two unknowns:


 gxx gxy


  s1


 =−  


gx(t2) − gx(t1)


 .         (14.70)


 gxy gyy


  s2 


 gy(t2) − gy(t1) 


×
We can easily invert the left 2 2 matrix like we inverted the matrix, provided that gxxgyy (gxy)2 does not vanish. Therefore it is possible to estimate the displacement between two images from a local neighborhood if we take into account the curvatures of the gray value distribution. We have not yet discussed the conditions the gray value distribution must meet, for Eq. (14.69) to be invertible. This is the case if either a gray value extreme or a gray value corner is encountered. As already mentioned, these terms were coined by Nagel [128]. At a gray value extreme (as well as at a saddle point) both principal curvatures are non-zero. Thus Eq. (14.70) can be solved. At a gray value corner, only one principal curvature is zero, but the gradient in this direction is not. Thus the fi rst and second equation from Eq. (14.69) can be used to determine both components of the optical fl ow vector.

With the diff erential geometric method no smoothing is required since second- order derivatives at only one point are used. However, for a more robust esti- mate of derivatives, often regularized derivative operators are used as discussed in Section 12.5. Since convolution operations are commutative, this smoothing could also be applied after computing the derivatives.

The diff erence in the fi rst-order spatial derivatives between the two images at times t2 and t1 in the right-hand vector in Eq. (14.70) is a discrete approximation


14.6 Correlation Methods                                                              407

of a temporal derivative which can be replaced by a temporal derivative opera- tor. Then, the displacement vector has to be replaced by the optical fl ow vector. Thus a continuous formulation of the diff erential geometric method results in


 gxx gxy

 


  f1  =−  


gxt   .                        (14.71)

 


 gxy gyy


  f2 


 gyt 












































































Correlation Methods

Principle

=
=
As with the diff erential method, the correlation technique is an approach which originates from analyzing the displacement between two consec- utive images. To fi nd a characteristic feature from the fi rst image within the second, we take the fi rst image g(t1) g1 and compare it with the second image g(t2) g2 within a certain search range. Within this range we search for the position of optimum similarity between the two images. When do we regard two features as being similar? The similarity measure should be robust against changes in the illumination. Thus we regard two spatial feature patterns as equal if they diff er only by a constant factor α which refl ects the diff erence in illumination. In the language of inner product vector spaces, this means that the two feature vectors g1 and g2 are parallel. This can be the case if and only if an equality occurs in the Cauchy–Schwarz inequality


. ∫ g ( x )g ( x


2

s )d2x.   ∫   g2( x )d2  ∫


 

2 x  s )d2x.      (14.72)


1           2

.− ∞


≤    1

. − ∞


x g2( −

− ∞


In other words, we need to maximize the cross-correlation coeffi cient

∫ g1( x )g2( x s )d2x


− ∞
− ∞
− ∞
r ( s ) =   ∞                       ∞

 


1/2.           (14.73)

 


 ∫ g2( x )d2x ∫ g2( x s )d2x

 

The cross-correlation coeffi cient is a useful similarity measure. It is zero for totally dissimilar (orthogonal) patterns, and reaches a maximum of one for similar features.

In a similar way as for the diff erential method (Section 14.3), the cor- relation method can be performed by a combination of convolution and point operations. The fi rst step is to introduce a window function w into the defi nition of the cross-correlation coeffi cient. This window is


408                                                                                                                  14 Motion

 

moved around the image to compute the local cross-correlation coeffi - cient. Then Eq. (14.73) becomes

 

∫ w( x x ')g1( x ')g2( x ' − s )d2x'


− ∞
r( x, s ) =   ∞                                                                       ∞

 


1/2. (14.74)

 


 ∫ w( x x ')g2( x ')d2x' ∫ w( x x ')g2( x ' − s )d2x'

                        

                        

The resulting cross-correlation coeffi cient is a four-dimensional func- tion, depending on the position in the image x and the shift s.

 


Поделиться:



Последнее изменение этой страницы: 2019-05-04; Просмотров: 195; Нарушение авторского права страницы


lektsia.com 2007 - 2024 год. Все материалы представленные на сайте исключительно с целью ознакомления читателями и не преследуют коммерческих целей или нарушение авторских прав! (0.505 с.)
Главная | Случайная страница | Обратная связь