

Motion as Orientation in Space-Time Images



The discussion in Sections 14.2.1–14.2.3 revealed that the analysis of motion from only two consecutive images is plagued by serious problems. The question arises whether these problems, or at least some of them, can be overcome if we extend the analysis to more than two consecutive images. With two images, we get just a “snapshot” of the motion field. We do not know how the motion continues in time. We cannot measure accelerations and cannot observe how parts of objects appear or disappear as another object moves in front of them.

In this section, we consider the basics of image sequence analysis in a multidimensional space spanned by one time and one to three space coordinates. Consequently, we speak of a space-time image, a spatiotemporal image, or simply the xt space.

We can think of a three-dimensional space-time image as a stack of consecutive images which may be represented as an image cube as shown in Fig. 14.9. At each visible face of the cube we map a cross section in the corresponding direction. Thus an xt slice is shown on the top face and a yt slice on the right face of the cube. The slices were taken at depths marked by the white lines on the front face, which shows the last image of the sequence.

In a space-time image a pixel extends to a voxel, i.e., it represents a gray value in a small volume element with the extensions ∆x, ∆y, and ∆t. Here we confront the limits of our visual imagination when we try to grasp truly 3-D data (compare the discussion in Section 8.1.1).


382                                                                                                                  14 Motion

 



Figure 14.8: Space-time images: a two-dimensional space-time image with one space and one time coordinate; b three-dimensional space-time image.

 

Therefore, we need appropriate representations of such data to make essential features of interest visible.

To analyze motion in space-time images, we first consider a simple example with one space and one time coordinate (Fig. 14.8a). A non-moving 1-D object shows vertically oriented gray value structures. If an object is moving, it is shifted from image to image and thus shows up as an inclined gray value structure. The velocity is directly linked to the orientation in space-time images. In the simple case of a 2-D space-time image, it is given by

u = − tan ϕ,                                                  (14.1)

where ϕ is the angle between the t axis and the direction in which the gray values are constant. The minus sign in Eq. (14.1) is because angles are positive counterclockwise. The extension to two spatial dimensions is straightforward and illustrated in Fig. 14.8b:


\[ \boldsymbol{u} = -\begin{bmatrix} \tan\varphi_x \\ \tan\varphi_y \end{bmatrix}. \tag{14.2} \]


 

The angles ϕx and ϕy are defined analogously: they are the angles between the t axis and the x and y components, respectively, of a vector in the direction of constant gray values.
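The link between orientation and velocity can be tried out numerically. The following is a minimal sketch, not the estimator developed later in the chapter: it assumes a single velocity for the whole 2-D space-time image, stores the image as `xt_image[t, x]`, and reads the direction of constant gray values off the smallest eigenvector of an averaged 2x2 structure tensor. The function name `velocity_from_orientation` and the synthetic sine pattern are hypothetical and chosen for illustration only.

```python
import numpy as np

def velocity_from_orientation(xt_image):
    """Estimate one velocity (pixels per frame) from a 2-D space-time image.

    xt_image[t, x] holds the gray values: axis 0 is time, axis 1 is space.
    """
    g_t, g_x = np.gradient(np.asarray(xt_image, dtype=float))
    # Drop the border, where np.gradient falls back to one-sided differences.
    g_t, g_x = g_t[1:-1, 1:-1], g_x[1:-1, 1:-1]
    # 2x2 structure tensor with components ordered (x, t), summed over the image.
    J = np.array([[np.sum(g_x * g_x), np.sum(g_x * g_t)],
                  [np.sum(g_x * g_t), np.sum(g_t * g_t)]])
    # eigh sorts eigenvalues in ascending order, so the first eigenvector
    # points along the direction in which the gray values are constant.
    _, vecs = np.linalg.eigh(J)
    e_x, e_t = vecs[:, 0]
    # Tangent of the inclination of that direction against the t axis.
    return e_x / e_t

# Synthetic pattern g(x, t) = sin(k (x - u t)) moving with u = 0.5 pixels per frame.
u_true = 0.5
t, x = np.meshgrid(np.arange(64), np.arange(128), indexing="ij")
xt = np.sin(2 * np.pi / 32 * (x - u_true * t))
u_est = velocity_from_orientation(xt)
```

The ratio `e_x / e_t` is the tangent of the inclination against the t axis; whether it carries the minus sign of Eq. (14.1) depends on how the x and t axes are drawn and in which sense the angle is counted.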

A practical example for this type of analysis is shown in Fig. 14.9. The motion is roughly in the vertical direction, so that the yt cross section can be regarded as a 2-D space-time image. The motion is immediately apparent. When the cars stop at the traffic light, the lines are horizontally


14.2 Basics                                                                                  383

 

Figure 14.9: A 3-D image sequence demonstrated with a traffic scene in the Hanauer Landstraße, Frankfurt/Main, represented as an image cuboid. The time axis runs into the depth, pointing towards the viewer. On the right side of the cube a yt slice marked by the vertical white line in the xy image is shown, while the top face shows an xt slice marked by the horizontal line (from Jähne [80]).

 

oriented, and phases with accelerated and constant speed can easily be recognized.

In summary, we come to the important conclusion that motion appears as orientation in space-time images. This fundamental fact forms the basis for motion analysis in xt space. The basic conceptual difference to approaches using two consecutive images is that the velocity is estimated directly as orientation in continuous space-time images and not as a discrete displacement.

These two concepts differ more than it appears at first glance. Algorithms for motion estimation can now be formulated in continuous xt space and studied analytically before a suitable discretization is applied. In this way, we can clearly distinguish the principal flaws of an approach from errors induced by the discretization.

Using more than two images, a more robust and accurate determination of motion can be expected. This is a crucial issue for scientific applications, as pointed out in Chapter 1.

This approach to motion analysis has much in common with the problem of reconstruction of 3-D images from projections (Section 8.6). Actually, we can envisage a geometrical determination of the velocity by observing the transparent three-dimensional space-time image from different points of view. At the right observation angle, we look along the



 

edges of the moving object and obtain the velocity from the angle between the observation direction and the time axis.

If we observe only the edge of an object, we cannot find such an observation angle unambiguously. We can change the component of the angle along the edge arbitrarily and still look along the edge. In this way, the aperture problem discussed in Section 14.2.2 shows up from a different point of view.

 







Motion in Fourier Domain

Introducing the space-time domain, we gain the significant advantage that we can also analyze motion in the corresponding Fourier domain, the kω space. As an introduction, we consider the example of an image sequence in which all the objects are moving with constant velocity. Such a sequence g(x, t) can be described by

\[ g(\boldsymbol{x}, t) = g(\boldsymbol{x} - \boldsymbol{u}t). \tag{14.3} \]

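The Fourier transform of this sequence is

\[ \hat{g}(\boldsymbol{k}, \omega) = \frac{1}{(2\pi)^3} \int_t \int_{\boldsymbol{x}} g(\boldsymbol{x} - \boldsymbol{u}t) \exp[-2\pi\mathrm{i}(\boldsymbol{k}\boldsymbol{x} - \omega t)] \, \mathrm{d}^2x \, \mathrm{d}t. \tag{14.4} \]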


 

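Substituting

\[ \boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{u}t, \]

we obtain

\[ \hat{g}(\boldsymbol{k}, \omega) = \frac{1}{(2\pi)^3} \int_t \left[ \int_{\boldsymbol{x}'} g(\boldsymbol{x}') \exp(-2\pi\mathrm{i}\boldsymbol{k}\boldsymbol{x}') \, \mathrm{d}^2x' \right] \exp(-2\pi\mathrm{i}\boldsymbol{k}\boldsymbol{u}t) \exp(2\pi\mathrm{i}\omega t) \, \mathrm{d}t. \]
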
The inner integral covers the spatial coordinates and results in the spatial Fourier transform ĝ(k) of the image g(x'). The outer integral over the time coordinate reduces to a δ function:

\[ \hat{g}(\boldsymbol{k}, \omega) = \hat{g}(\boldsymbol{k}) \, \delta(\boldsymbol{k}\boldsymbol{u} - \omega). \tag{14.5} \]

This equation states that an object moving with the velocity u occupies only a two-dimensional subspace in the three-dimensional kω space. In two and three dimensions, this subspace is a line and a plane, respectively. The equation for the plane is given directly by the argument of the δ function in Eq. (14.5):

\[ \omega = \boldsymbol{k}\boldsymbol{u}. \tag{14.6} \]

This plane intersects the k1k2 plane normally to the direction of the velocity because in this direction the inner product ku vanishes. The slope of the plane, a two-component vector, yields the velocity

\[ \nabla_{\boldsymbol{k}} \, \omega = \nabla_{\boldsymbol{k}} (\boldsymbol{k}\boldsymbol{u}) = \boldsymbol{u}. \]



 

The index k in the gradient operator denotes that the partial derivatives are computed with respect to the components of k.

From these considerations, it is obvious, at least in principle, how we can determine the velocity in an image sequence showing a constant velocity. We compute the Fourier transform of the sequence and then determine the slope of the plane on which the spectrum of the sequence is located. We can do this best if the scene contains small-scale structures, i.e., high wave numbers which are distributed in many directions. We cannot determine the slope of the plane unambiguously if the spectrum lies on a line instead of a plane. This is the case when the gray value structure is spatially oriented. From the line in Fourier space we only obtain the component of the plane slope in the direction of the spatial local orientation. In this way, we encounter the aperture problem (Section 14.2.2) in the kω space.
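As an illustration of this procedure, here is a minimal sketch for the simplest case of one space and one time coordinate, where the plane of Eq. (14.6) reduces to the line ω = ku. It is a sketch under assumptions: the sequence is stored as `xt_image[t, x]`, the whole sequence moves with one constant velocity, and the slope is fitted by a power-weighted least-squares fit, which is only one of several possible choices. The function name `velocity_from_spectrum` and the test pattern are hypothetical.

```python
import numpy as np

def velocity_from_spectrum(xt_image):
    """Fit the slope of the line omega = k u to the power spectrum of xt_image[t, x]."""
    nt, nx = xt_image.shape
    power = np.abs(np.fft.fft2(xt_image)) ** 2
    f_t = np.fft.fftfreq(nt)[:, None]   # temporal frequency, cycles per frame
    k = np.fft.fftfreq(nx)[None, :]     # wave number, cycles per pixel
    # NumPy's kernel is exp(-2 pi i (f_t t + k x)), whereas the text uses
    # exp(-2 pi i (k x - omega t)); hence omega = -f_t.
    omega = -f_t
    # Least-squares slope of omega against k, weighted with the spectral power.
    return np.sum(power * omega * k) / np.sum(power * k ** 2)

# Three superimposed cosine waves, all moving with u = 0.5 pixels per frame.
u_true = 0.5
t, x = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
xt = sum(np.cos(2 * np.pi * m / 64 * (x - u_true * t)) for m in (4, 8, 12))
u_est = velocity_from_spectrum(xt)
```

The wave numbers are chosen so that every component is periodic in the 64 x 64 window. For real sequences a window function would be needed to limit leakage, and several velocities in one scene would spread the power over several lines.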

 







Optical Flow

The examples discussed in Section 14.2.1 showed that motion and gray value changes are not equivalent. In this section, we want to quantify this relation. In this respect, two terms are of importance: the motion field and the optical flow. The motion field in an image is the real motion of the object in the 3-D scene projected onto the image plane. It is the quantity we would like to extract from the image sequence. The optical flow is defined as the “flow” of gray values at the image plane. This is what we observe. Optical flow and motion field are only equal if the objects do not change the irradiance on the image plane while moving in a scene. Although this assumption sounds reasonable at first glance, a more thorough analysis shows that it is strictly true only in very restricted cases. Thus the basic question is how significant the deviations are, so that in practice we can still stick with the equivalence of optical flow and motion field.

Two classical examples where the projected motion field and the optical flow are not equal were given by Horn [73]. The first is a spinning sphere with a uniform surface of any kind. Such a sphere may rotate around any axis through its center of gravity without causing an optical flow field. The counterexample is the same sphere at rest illuminated by a moving light source. Now the motion field is zero, but the changes in the gray values due to the moving light source cause a non-zero optical flow field.

At this point it is helpful to clarify the different notations for motion with respect to image sequences, as there is a lot of confusion in the literature and many different terms are used. Optical flow or image flow means the apparent motion at the image plane based on visual perception and has the dimension of a velocity. We denote the optical flow with f = [f1, f2]^T. If the optical flow is determined from two consecutive images, it appears as a displacement vector (DV) from the features in the



 

first to those in the second image. A dense representation of displacement vectors is known as a displacement vector field (DVF) s = [s1, s2]^T. An approximation of the optical flow can be obtained by dividing the DVF by the time interval between the two images. It is important to note that optical flow is a concept inherent to continuous space, while the displacement vector field is its discrete counterpart. The motion field u = [u1, u2]^T = [u, v]^T at the image plane is the projection of the 3-D physical motion field by the optics onto the image plane.

The concept of optical flow originates from fluid dynamics. In the case of images, motion causes gray values, i.e., an optical signal, to “flow” over the image plane, just as volume elements flow in liquids and gases. In fluid dynamics the continuity equation plays an important role. It expresses the fact that mass is conserved in a flow. Can we formulate a similar continuity equation for gray values and under which conditions are they conserved?

In fluid dynamics, the continuity equation for the density W of the fluid is given by

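\[ \frac{\partial W}{\partial t} + \nabla(\boldsymbol{u}W) = \frac{\partial W}{\partial t} + \boldsymbol{u}\nabla W + W\nabla\boldsymbol{u} = 0. \tag{14.7} \]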

This equation is valid for two- and three-dimensional flows. It states the conservation of mass in a fluid in a differential form. The temporal change in the density is balanced by the divergence of the flux density uW. By integrating the continuity equation over an arbitrary volume element, we can write the equation in an integral form:

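\[ \int_V \left( \frac{\partial W}{\partial t} + \nabla(\boldsymbol{u}W) \right) \mathrm{d}V = \frac{\partial}{\partial t} \int_V W \, \mathrm{d}V + \oint_A W \boldsymbol{u} \, \mathrm{d}\boldsymbol{a} = 0. \tag{14.8} \]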

The volume integral has been converted into a surface integral around the volume using the Gauss integral theorem. da is a vector normal to a surface element dA. The integral form of the continuity equation clearly states that the temporal change of the mass is caused by the net flux into the volume integrated over the whole surface of the volume.
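The integral form can be checked numerically in one dimension, where the volume becomes an interval [a, b] and the surface integral reduces to the difference of the flux density uW at the two end points. The sketch below assumes a Gaussian density profile advected with a constant velocity; all names and parameter values are hypothetical and serve only to illustrate Eq. (14.8).

```python
import numpy as np

u, sigma = 2.0, 3.0          # constant velocity and width of the density profile
a, b, t0 = 0.0, 10.0, 2.0    # interval [a, b] and time at which the balance is checked

def density(x, t):
    """Gaussian density profile W(x, t) advected with constant velocity u."""
    return np.exp(-(x - u * t) ** 2 / (2 * sigma ** 2))

x = np.linspace(a, b, 2001)
dx = x[1] - x[0]

def mass(t):
    """Trapezoidal approximation of the integral of W over [a, b]."""
    w = density(x, t)
    return dx * (w.sum() - 0.5 * (w[0] + w[-1]))

# Temporal change of the mass in the interval (central difference in time).
dt = 1e-3
dmass_dt = (mass(t0 + dt) - mass(t0 - dt)) / (2 * dt)

# Net outward flux through the two end points of the interval.
net_outflux = u * density(b, t0) - u * density(a, t0)

# Eq. (14.8) in one dimension: both contributions must cancel.
residual = dmass_dt + net_outflux
```

With these values the mass inside the interval grows, because more density enters at a than leaves at b, and the residual vanishes up to the discretization error.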

How can we devise a similar continuity equation for the optical flux f in computer vision, known as the brightness change constraint equation (BCCE) or optical flow constraint (OFC)? The quantity analogous to the density W is the irradiance E or the gray value g. However, we should be careful and examine the terms in Eq. (14.7) more closely. The first term, f∇g, describes the temporal brightness change due to a moving gray value gradient. The second term, g∇f, which contains the divergence of the velocity field, seems questionable. It would cause a temporal change even in a region with a constant irradiance if the divergence of the flow field is unequal to zero. Such a case occurs, for instance, when an object moves away from the camera. The irradiance at the image plane



 

Figure 14.10: Illustration of the continuity of optical flow in the one-dimensional case (gray value g plotted against x).

 

remains constant, provided the object irradiance does not change. The collected radiance decreases with the squared distance of the object. However, this decrease is exactly compensated, because the projected area of the object decreases by the same factor. Thus we omit the last term in the continuity equation for the optical flux and obtain

\[ \frac{\partial g}{\partial t} + \boldsymbol{f}\nabla g = 0. \tag{14.9} \]

In the one-dimensional case, the continuity of the optical flow takes the simple form

\[ \frac{\partial g}{\partial t} + f \frac{\partial g}{\partial x} = 0, \tag{14.10} \]

from which we directly get the one-dimensional velocity

\[ f = -\frac{\partial g}{\partial t} \bigg/ \frac{\partial g}{\partial x}, \tag{14.11} \]

 

provided that the spatial derivative does not vanish. The velocity is thus given as the ratio of the temporal and spatial derivatives.
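Equation (14.11) translates almost directly into code. The following minimal sketch assumes two consecutive 1-D gray value profiles (two rows of an xt image), simple finite differences for both derivatives, and a hypothetical threshold `min_slope` below which the spatial derivative is treated as vanishing. The function name `velocity_1d` and the synthetic edge are illustrative assumptions.

```python
import numpy as np

def velocity_1d(g_prev, g_next, dt=1.0, dx=1.0, min_slope=1e-3):
    """Pointwise 1-D flow f = -(dg/dt) / (dg/dx) from two consecutive profiles."""
    g_t = (g_next - g_prev) / dt            # temporal derivative at the mid-time
    g_mid = 0.5 * (g_prev + g_next)         # profile at the mid-time
    g_x = np.gradient(g_mid, dx)            # spatial derivative
    f = np.full(g_mid.shape, np.nan)        # NaN where no estimate is possible
    valid = np.abs(g_x) > min_slope
    f[valid] = -g_t[valid] / g_x[valid]
    return f

# Smooth gray value edge moving with 0.3 pixels per frame.
u_true = 0.3
x = np.arange(100, dtype=float)
edge = lambda t: np.tanh((x - 50.0 - u_true * t) / 5.0)
f = velocity_1d(edge(0.0), edge(1.0))
```

Near the edge the estimate is close to the true velocity. In the flat regions on either side the spatial derivative falls below the threshold and the result is left undefined (NaN), which mirrors the proviso that the spatial derivative must not vanish.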

This basic relation can also be derived geometrically, as illustrated in Fig. 14.10. In the time interval ∆t a gray value is shifted by the distance ∆x = u∆t, causing the gray value to change by g(x, t + ∆t) − g(x, t). The gray value change can also be expressed as the slope of the gray value edge,

\[ g(x, t + \Delta t) - g(x, t) = -\frac{\partial g(x, t)}{\partial x} \Delta x = -\frac{\partial g(x, t)}{\partial x} u \Delta t, \tag{14.12} \]

from which, in the limit of ∆t → 0, the continuity equation for optical flow Eq. (14.10) is obtained.

The continuity or BCCE equation for optical flow at the image plane Eq. (14.9) can in general only be a crude approximation. We have already



 

touched this subject in the introductory section about motion and gray value changes (Section 14.2.1). This is because of the complex nature of the reflection from opaque surfaces, which depends on the viewing direction, surface normal, and directions of the incident light. Each object receives radiation not only directly from light sources but also from all other objects in the scene that lie in the direct line of sight of the object. Thus the radiant emittance from the surface of one object depends on the position of all the other objects in a scene.

In computer graphics, problems of this type are studied in detail, in search of photorealistic computer-generated images. A big step towards this goal was a method called radiosity, which explicitly solved the interrelation of object emittance described above [46]. A general expression for the object emittance, the now famous rendering equation, was derived by Kajiya [89].

In image sequence processing, it is in principle required to invert this equation to infer the surface reflectivity from the measured object emittance. The surface reflectivity is a feature invariant to surface orientation and the position of other objects and thus would be ideal for motion estimation. Such an approach is unrealistic, however, because it requires a reconstruction of the 3-D scene before the inversion of the rendering equation can be tackled at all.

As there is no generally valid continuity equation for optical flow, it is important to compare possible additional terms with the terms in the standard BCCE. All other terms basically depend on the rate of changes of a number of quantities but not on the brightness gradients. If the gray value gradient is large, the influence of the additional terms becomes small. Thus we can conclude that the determination of the velocity is most reliable for steep gray value edges while it may be significantly distorted in regions with only small gray value gradients. This conclusion is in agreement with the findings of Verri and Poggio [190, 191], who point out the difference between optical flow and the motion field.

Another observation is important. It is certainly true that the historical approach of determining the displacement vectors from only two consecutive images is not robust. In general we cannot distinguish whether a gray value change comes from a displacement or any other source. However, the optical flow becomes more robust in space-time images. We will demonstrate this with two examples.

First, it is possible to separate gray value changes caused by global illumination changes from those caused by motion. Figure 14.11 shows an image sequence of a static scene taken at a rate of 5 frames per minute. The two spatiotemporal time slices (Fig. 14.11a, c), indicated by the two white horizontal lines in Fig. 14.11b, cover a period of about 3.4 h. The upper line covers the high-rise building and the sky. From the sky it can be seen that it was partly cloudy, but sometimes there was direct solar



 

Figure 14.11: Static scene with illumination changes: a xt cross section at the upper marked row (sky area) in b; b first image of the sequence; c xt cross section at the lower marked row (roof area) in b; the time axis spans 3.4 h, running downwards (from Jähne [80]).

 

illumination. The lower line crosses several roof windows, walls, and house roofs.

In both slices the illumination changes appear as horizontal stripes which seem to transparently overlay the vertical stripes, indicating a static scene. As a horizontal pattern indicates an object moving with



 

Figure 14.12: Traffic scene at the border of Hanau, Germany; a last image of the sequence; b xt cross section at the marked line in a; the time axis spans 20.5 s, running downwards (from Jähne [80]).

 

infinite velocity, these patterns can be eliminated, e.g., by directional filtering, without disturbing the motion analysis.
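The text does not specify the directional filter. As one very simple stand-in, which is an assumption and not the method used for Fig. 14.11, the sketch below removes everything that is constant along x at a given time by subtracting the spatial mean of each row of an xt slice stored as `xt_image[t, x]`. In the Fourier domain this corresponds to deleting the components with zero wave number. The function name `remove_horizontal_stripes` and the synthetic data are hypothetical.

```python
import numpy as np

def remove_horizontal_stripes(xt_image):
    """Suppress patterns that are constant along x in an xt slice xt_image[t, x]."""
    xt = np.asarray(xt_image, dtype=float)
    # Subtract the spatial mean of every time row (axis 1 is the x direction).
    return xt - xt.mean(axis=1, keepdims=True)

# Static scene (vertical stripes) overlaid with a global illumination change.
rng = np.random.default_rng(0)
scene = rng.random(100)                               # gray values along x
illumination = 0.5 * np.sin(np.linspace(0.0, 6.0, 50))  # offset per time step
xt = scene[None, :] + illumination[:, None]
filtered = remove_horizontal_stripes(xt)
```

After filtering, every row of the slice is identical again, so only the vertical stripes of the static scene remain. A purely additive and spatially uniform illumination change is assumed here; multiplicative or local changes would need a more elaborate filter.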

The second example demonstrates that motion determination is still possible in space-time images if occlusions occur and the local illumination of an object is changing because it is turning. Figure 14.12 shows a traffic scene at the city limits of Hanau, Germany. From the last image of the sequence (Fig. 14.12a) we see that a street crossing with a traffic light is observed through the branches of a tree located on the right in



 

the foreground. One road is running horizontally from left to right, with the traffic light on the left.

The spatiotemporal slice (Fig. 14.12b) has been cut through the image sequence at the horizontal line indicated in Fig. 14.12a. It reveals various occlusions: the car traces disappear under the static vertical patterns of the tree branches and traffic signs. We can also see that the temporal trace of the van shows significant gray value changes because it turned at the street crossing and the illumination conditions are changing while it is moving along in the scene. Nevertheless, the temporal trace is continuous and promises a reliable velocity estimate.

We can conclude that the best approach is to stick to the standard BCCE for motion estimates and use it to develop the motion estimators in this chapter. Given the wide variety of possible additional terms, this approach still seems to be the most reasonable and most widely applicable, because it contains the fundamental constraint.

 

14.3 First-Order Differential Methods†











Basics

Differential methods are the classical approach to determine motion from two consecutive images. This chapter discusses the question of how these techniques can be applied to space-time images. The continuity equation for the optical flow (Section 14.2.6), in short the BCCE or OFC, is the starting point for differential methods:

\[ \frac{\partial g}{\partial t} + \boldsymbol{f}\nabla g = 0. \tag{14.13} \]

This single scalar equation contains W unknown vector components in the W-dimensional space. Thus we cannot determine the optical flow f = [f1, f2]^T unambiguously. The scalar product f∇g is equal to the magnitude of the gray value gradient multiplied by the component of f in the direction of the gradient, i.e., normal to the local gray value edge:

 

\[ \boldsymbol{f}\nabla g = f_\perp |\nabla g|. \]

Thus we can only determine the optical flow component normal to the edge. This is the well-known aperture problem, which we discussed qualitatively in Section 14.2.2. From Eq. (14.13), we obtain

\[ f_\perp = -\frac{\partial g}{\partial t} \bigg/ |\nabla g|. \tag{14.14} \]

Consequently, it is not possible to determine the complete vector with first-order derivatives at a single point in the space-time image.
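The aperture problem can be made visible with a few lines of code. The following minimal sketch evaluates Eq. (14.14) pointwise for two consecutive 2-D images and returns the result as a vector along the gray value gradient. It assumes simple finite differences and a hypothetical threshold `eps` for vanishing gradients. The function name `normal_flow` and the test pattern are illustrative assumptions.

```python
import numpy as np

def normal_flow(frame0, frame1, dt=1.0, eps=1e-3):
    """Pointwise normal flow vector -(dg/dt) * grad(g) / |grad(g)|^2."""
    g_t = (frame1 - frame0) / dt
    g_y, g_x = np.gradient(0.5 * (frame0 + frame1))  # axis 0 is y, axis 1 is x
    mag2 = g_x ** 2 + g_y ** 2
    scale = np.full(mag2.shape, np.nan)   # NaN where the gradient vanishes
    valid = mag2 > eps ** 2
    scale[valid] = -g_t[valid] / mag2[valid]
    return scale * g_x, scale * g_y

# Stripe pattern that varies only along x; the true flow is (0.4, 0.3),
# but the y component leaves the stripes unchanged.
y, x = np.mgrid[0:64, 0:64].astype(float)
k = 2 * np.pi / 32
frame0 = np.sin(k * x)
frame1 = np.sin(k * (x - 0.4))
fn_x, fn_y = normal_flow(frame0, frame1)
```

The estimate recovers the x component of about 0.4 pixels per frame, which is the component normal to the stripes, while the y component of 0.3 along the stripes is lost entirely.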



 

