Motion as Orientation in Space-Time Images


The discussion in Sections 14.2.1–14.2.3 revealed that the analysis of motion from only two consecutive images is plagued by serious problems. The question arises whether these problems, or at least some of them, can be overcome if we extend the analysis to more than two consecutive images. With two images, we get just a “snapshot” of the motion field. We do not know how the motion continues in time. We cannot measure accelerations and cannot observe how parts of objects appear or disappear as another object moves in front of them.

In this section, we consider the basics of image sequence analysis in a multidimensional space spanned by one time and one to three space coordinates. Consequently, we speak of a space-time image, a spatiotemporal image, or simply the xt space.

We can think of a three-dimensional space-time image as a stack of consecutive images which may be represented as an image cube as shown in Fig. 14.9. At each visible face of the cube we map a cross section in the corresponding direction. Thus an xt slice is shown on the top face and a yt slice on the right face of the cube. The slices were taken at depths marked by the white lines on the front face, which shows the last image of the sequence.

In a space-time image a pixel extends to a voxel, i.e., it represents a gray value in a small volume element with the extensions ∆x, ∆y, and ∆t. Here we confront the limits of our visual imagination when we try to grasp truly 3-D data (compare the discussion in Section 8.1.1).


Figure 14.8: Space-time images: a two-dimensional space-time image with one space and one time coordinate; b three-dimensional space-time image.

Therefore, we need appropriate representations of such data to make essential features of interest visible.

To analyze motion in space-time images, we first consider a simple example with one space and one time coordinate (Fig. 14.8a). A nonmoving 1-D object shows vertically oriented gray value structures. If an object is moving, it is shifted from image to image and thus shows up as an inclined gray value structure. The velocity is directly linked to the orientation in space-time images. In the simple case of a 2-D space-time image, it is given by

u = − tan ϕ, (14.1)

where ϕ is the angle between the t axis and the direction in which the gray values are constant. The minus sign in Eq. (14.1) is because angles are positive counterclockwise. The extension to two spatial dimensions is straightforward and illustrated in Fig. 14.8b:

$$\mathbf{u} = -\begin{bmatrix} \tan\varphi_x \\ \tan\varphi_y \end{bmatrix}. \tag{14.2}$$

The angles ϕ_x and ϕ_y are defined analogously: a vector in the direction of the constant gray values is projected onto the xt plane and the yt plane, and ϕ_x and ϕ_y are the angles between these projections and the t axis.
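The link between velocity and orientation in Eq. (14.1) can be tried out numerically. The following is only a minimal sketch, not a method taken from this section: it assumes a synthetic xt image stored as an array `g[t, x]` and estimates the direction of constant gray values from a 2×2 tensor of gradient products summed over the whole image. The function name `velocity_from_orientation` and all test values are hypothetical.

```python
import numpy as np

def velocity_from_orientation(g_xt):
    """Estimate one velocity from a 2-D space-time image g_xt[t, x]."""
    # Central differences; axis 0 is time, axis 1 is space.
    g_t, g_x = np.gradient(g_xt)
    # Drop the border, where np.gradient falls back to one-sided differences.
    g_t, g_x = g_t[1:-1, 1:-1], g_x[1:-1, 1:-1]
    # 2x2 tensor of gradient products in (x, t) order, summed over the image.
    J = np.array([[np.sum(g_x * g_x), np.sum(g_x * g_t)],
                  [np.sum(g_x * g_t), np.sum(g_t * g_t)]])
    # eigh sorts eigenvalues in ascending order; the eigenvector of the
    # smallest one points along the direction of constant gray values.
    _, vecs = np.linalg.eigh(J)
    e_x, e_t = vecs[:, 0]
    u = e_x / e_t              # velocity in pixels per frame
    phi = np.arctan(-u)        # angle to the t axis, sign as in Eq. (14.1)
    return u, phi

# Synthetic test pattern (assumed values): a sine wave moving at 0.5 pixels/frame.
x = np.arange(64)
t = np.arange(32)[:, None]
u_true = 0.5
g = np.sin(2 * np.pi * (x - u_true * t) / 32.0)

u_est, phi_est = velocity_from_orientation(g)
print(u_est, np.degrees(phi_est))
```

Because the derivatives are approximated by discrete differences, the estimate is close to but not exactly equal to the true velocity; the deviation shrinks for smoother patterns.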

A practical example for this type of analysis is shown in Fig. 14.9. The motion is roughly in the vertical direction, so that the yt cross section can be regarded as a 2-D space-time image. The motion is immediately apparent. When the cars stop at the traffic light, the lines are horizontally oriented, and phases with accelerated and constant speed can easily be recognized.

Figure 14.9: A 3-D image sequence demonstrated with a traffic scene in the Hanauer Landstraße, Frankfurt/Main represented as an image cuboid. The time axis runs into the depth, pointing towards the viewer. On the right side of the cube a yt slice marked by the vertical white line in the xy image is shown, while the top face shows an xt slice marked by the horizontal line (from Jähne [80]).

In summary, we come to the important conclusion that motion appears as orientation in space-time images. This fundamental fact forms the basis for motion analysis in xt space. The basic conceptual difference to approaches using two consecutive images is that the velocity is estimated directly as orientation in continuous space-time images and not as a discrete displacement.

These two concepts differ more than it appears at first glance. Algorithms for motion estimation can now be formulated in continuous xt space and studied analytically before a suitable discretization is applied. In this way, we can clearly distinguish the principal flaws of an approach from errors induced by the discretization.

Using more than two images, a more robust and accurate determination of motion can be expected. This is a crucial issue for scientific applications, as pointed out in Chapter 1.

This approach to motion analysis has much in common with the problem of reconstruction of 3-D images from projections (Section 8.6). Actually, we can envisage a geometrical determination of the velocity by observing the transparent three-dimensional space-time image from different points of view. At the right observation angle, we look along the edges of the moving object and obtain the velocity from the angle between the observation direction and the time axis.

If we observe only the edge of an object, we cannot find such an observation angle unambiguously. We can change the component of the angle along the edge arbitrarily and still look along the edge. In this way, the aperture problem discussed in Section 14.2.2 shows up from a different point of view.

Motion in Fourier Domain

Introducing the space-time domain, we gain the significant advantage that we can also analyze motion in the corresponding Fourier domain, the kω space. As an introduction, we consider the example of an image sequence in which all the objects are moving with constant velocity. Such a sequence g(x, t) can be described by

g( x, t) = g( x − u t). (14.3)

The Fourier transform of this sequence is

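$$\hat{g}(\mathbf{k}, \omega) = \frac{1}{(2\pi)^3} \int_t \int_{\mathbf{x}} g(\mathbf{x} - \mathbf{u}t) \exp[-2\pi\mathrm{i}(\mathbf{k}\mathbf{x} - \omega t)]\, \mathrm{d}^2x\, \mathrm{d}t. \tag{14.4}$$

Substituting x' = x − ut, we obtain

$$\hat{g}(\mathbf{k}, \omega) = \frac{1}{(2\pi)^3} \int_t \left[ \int_{\mathbf{x}'} g(\mathbf{x}') \exp(-2\pi\mathrm{i}\,\mathbf{k}\mathbf{x}')\, \mathrm{d}^2x' \right] \exp(-2\pi\mathrm{i}\,\mathbf{k}\mathbf{u}t) \exp(2\pi\mathrm{i}\,\omega t)\, \mathrm{d}t.$$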

The inner integral covers the spatial coordinates and results in the spatial Fourier transform gˆ(k) of the image g(x'). The outer integral over the time coordinate reduces to a δ function:

gˆ ( k, ω ) = gˆ ( k )δ ( ku − ω ). (14.5)

This equation states that an object moving with the velocity u occupies only a two-dimensional subspace in the three-dimensional kω space. Thus it is a line and a plane, in two and three dimensions, respectively. The equation for the plane is given directly by the argument of the δ function in Eq. (14.5):

ω = ku. (14.6)

This plane intersects the k₁k₂ plane normally to the direction of the velocity because in this direction the inner product ku vanishes. The slope of the plane, a two-component vector, yields the velocity

∇ _kω = ∇ _k( ku ) = u.


The index k in the gradient operator denotes that the partial derivatives are computed with respect to the components of k.

From these considerations, it is obvious, at least in principle, how we can determine the velocity in an image sequence showing a constant velocity. We compute the Fourier transform of the sequence and then determine the slope of the plane on which the spectrum of the sequence is located. We can do this best if the scene contains small-scale structures, i.e., high wave numbers which are distributed in many directions. We cannot determine the slope of the plane unambiguously if the spectrum lies on a line instead of a plane. This is the case when the gray value structure is spatially oriented. From the line in Fourier space we only obtain the component of the plane slope in the direction of the spatial local orientation. In this way, we encounter the aperture problem (Section 14.2.2) in the kω space.
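The procedure just described can be sketched for the simplest case of one spatial coordinate, where the spectrum lies on the line ω = ku of Eq. (14.6). The example below is a hedged illustration under our own assumptions: the test pattern, its wave numbers, and the power-weighted least-squares fit are our choices, not part of the text. Note also that NumPy's FFT uses the kernel exp[−2πi(kx + ft)], so its temporal frequency f corresponds to −ω in the notation used here.

```python
import numpy as np

nx, nt = 64, 64
u_true = 2.0                       # pixels per frame (assumed test value)
x = np.arange(nx)
t = np.arange(nt)[:, None]

# Periodic test pattern g(x - u t) built from a few wave numbers,
# chosen so that the sequence is periodic in both x and t (no leakage).
g = np.zeros((nt, nx))
for k_idx, amp in [(1, 1.0), (2, 0.7), (3, 0.5), (5, 0.3), (8, 0.2)]:
    g += amp * np.cos(2 * np.pi * k_idx * (x - u_true * t) / nx)

# Power spectrum; axis 0 is temporal frequency, axis 1 is wave number.
P = np.abs(np.fft.fft2(g)) ** 2
k = np.fft.fftfreq(nx)             # cycles per pixel
f = np.fft.fftfreq(nt)             # cycles per frame, NumPy sign convention
K, F = np.meshgrid(k, f)

# With NumPy's sign convention the spectral line is f = -k u.
# Fit its slope by least squares, weighting every bin with its power.
u_est = -np.sum(P * K * F) / np.sum(P * K * K)
print(u_est)
```

The fit works here because the pattern contains several wave numbers and the temporal frequencies stay below the Nyquist limit; for faster motion the spectral line would wrap around and the simple fit would fail.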

Optical Flow

The examples discussed in Section 14.2.1 showed that motion and gray value changes are not equivalent. In this section, we want to quantify this relation. In this respect, two terms are of importance: the motion field and the optical flow. The motion field in an image is the real motion of the object in the 3-D scene projected onto the image plane. It is the quantity we would like to extract from the image sequence. The optical flow is defined as the “flow” of gray values at the image plane. This is what we observe. Optical flow and motion field are only equal if the objects do not change the irradiance on the image plane while moving in a scene. Although it sounds reasonable at first glance, a more thorough analysis shows that it is strictly true only in very restricted cases. Thus the basic question is how significant the deviations are, so that in practice we can still stick with the equivalence of optical flow and motion field.

Two classical examples where the projected motion field and the optical flow are not equal were given by Horn [73]. The first is a spinning sphere with a uniform surface of any kind. Such a sphere may rotate around any axis through its center of gravity without causing an optical flow field. The counterexample is the same sphere at rest illuminated by a moving light source. Now the motion field is zero, but the changes in the gray values due to the moving light source cause a non-zero optical flow field.

At this point it is helpful to clarify the different notations for motion with respect to image sequences, as there is a lot of confusion in the literature and many different terms are used. Optical flow or image flow means the apparent motion at the image plane based on visual perception and has the dimension of a velocity. We denote the optical flow with f = [f₁, f₂]^T. If the optical flow is determined from two consecutive images, it appears as a displacement vector (DV) from the features in the first to those in the second image. A dense representation of displacement vectors is known as a displacement vector field (DVF) s = [s₁, s₂]^T. An approximation of the optical flow can be obtained by dividing the DVF by the time interval between the two images. It is important to note that optical flow is a concept inherent to continuous space, while the displacement vector field is its discrete counterpart. The motion field u = [u₁, u₂]^T = [u, v]^T at the image plane is the projection of the 3-D physical motion field by the optics onto the image plane.

The concept of optical flow originates from fluid dynamics. In case of images, motion causes gray values, i.e., an optical signal, to “flow” over the image plane, just as volume elements flow in liquids and gases. In fluid dynamics the continuity equation plays an important role. It expresses the fact that mass is conserved in a flow. Can we formulate a similar continuity equation for gray values and under which conditions are they conserved?

In fluid dynamics, the continuity equation for the density W of the fluid is given by

$$\frac{\partial W}{\partial t} + \nabla(\mathbf{u}W) = \frac{\partial W}{\partial t} + \mathbf{u}\,\nabla W + W\,\nabla\mathbf{u} = 0. \tag{14.7}$$

This equation is valid for two and three-dimensional flows. It states the conservation of mass in a fluid in a differential form. The temporal change in the density is balanced by the divergence of the flux density uW. By integrating the continuity equation over an arbitrary volume element, we can write the equation in an integral form:

$$\int_V \left( \frac{\partial W}{\partial t} + \nabla(\mathbf{u}W) \right) \mathrm{d}V = \frac{\partial}{\partial t} \int_V W\, \mathrm{d}V + \oint_A W\mathbf{u}\, \mathrm{d}\mathbf{a} = 0. \tag{14.8}$$

The volume integral has been converted into a surface integral around the volume using the Gauss integral theorem. da is a vector normal to a surface element dA. The integral form of the continuity equation clearly states that the temporal change of the mass is caused by the net flux into the volume integrated over the whole surface of the volume.

How can we devise a similar continuity equation for the optical flux f, known as the brightness change constraint equation (BCCE) or optical flow constraint (OFC), in computer vision? The quantity analogous to the density W is the irradiance E or the gray value g. However, we should be careful and examine the terms in Eq. (14.7) more closely. The first divergence term, f∇g, describes the temporal brightness change due to a moving gray value gradient. The second term, g∇f, which contains the divergence of the velocity field, seems questionable. It would cause a temporal change even in a region with a constant irradiance if the divergence of the flow field is unequal to zero. Such a case occurs, for instance, when an object moves away from the camera. The irradiance at the image plane remains constant, provided the object irradiance does not change. The collected radiance decreases with the squared distance of the object. However, it is exactly compensated, as also the projected area of the object is decreased by the same factor. Thus we omit the last term in the continuity equation for the optical flux and obtain

$$\frac{\partial g}{\partial t} + \mathbf{f}\,\nabla g = 0. \tag{14.9}$$

Figure 14.10: Illustration of the continuity of optical flow in the one-dimensional case.

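In the one-dimensional case, the continuity of the optical flow takes the simple form

$$\frac{\partial g}{\partial t} + f\,\frac{\partial g}{\partial x} = 0, \tag{14.10}$$

from which we directly get the one-dimensional velocity

$$f = -\frac{\partial g}{\partial t} \bigg/ \frac{\partial g}{\partial x}, \tag{14.11}$$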

provided that the spatial derivative does not vanish. The velocity is thus given as the ratio of the temporal and spatial derivatives.
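The ratio in Eq. (14.11) is easy to try out on synthetic data. The sketch below is an illustration under our own assumptions: the derivatives are approximated by central differences (`np.gradient`), points with a nearly vanishing spatial derivative are excluded by a hypothetical threshold `min_slope`, and the test pattern is a sine wave moving at 0.5 pixels per frame.

```python
import numpy as np

def flow_1d(g_xt, min_slope=1e-3):
    """Pointwise 1-D optical flow f = -(dg/dt) / (dg/dx) for g_xt[t, x]."""
    g_t, g_x = np.gradient(g_xt)                  # axis 0: time, axis 1: space
    g_t, g_x = g_t[1:-1, 1:-1], g_x[1:-1, 1:-1]   # keep central differences only
    f = np.full(g_x.shape, np.nan)
    valid = np.abs(g_x) > min_slope               # spatial derivative must not vanish
    f[valid] = -g_t[valid] / g_x[valid]
    return f

# Synthetic sequence (assumed values): sine wave shifted by 0.5 pixels per frame.
x = np.arange(64)
t = np.arange(16)[:, None]
g = np.sin(2 * np.pi * (x - 0.5 * t) / 64.0)

f = flow_1d(g)
f_median = np.nanmedian(f)
print(f_median)
```

The points rejected by the threshold are exactly those where the estimate would be unreliable, which anticipates the later observation that the velocity is determined best at steep gray value edges.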

This basic relation can also be derived geometrically, as illustrated in Fig. 14.10. In the time interval ∆t a gray value is shifted by the distance ∆x = u∆t, causing the gray value to change by g(x, t + ∆t) − g(x, t). The gray value change can also be expressed as the slope of the gray value edge,

$$g(x, t + \Delta t) - g(x, t) = -\frac{\partial g(x, t)}{\partial x}\,\Delta x = -\frac{\partial g(x, t)}{\partial x}\,u\,\Delta t, \tag{14.12}$$

from which, in the limit of ∆t → 0, the continuity equation for optical flow Eq. (14.10) is obtained.

The continuity equation for optical flow at the image plane, i.e., the BCCE in Eq. (14.9), can in general only be a crude approximation. We have already touched on this subject in the introductory section about motion and gray value changes (Section 14.2.1). This is because of the complex nature of the reflection from opaque surfaces, which depends on the viewing direction, surface normal, and directions of the incident light. Each object receives radiation not only directly from light sources but also from all other objects in the scene that lie in the direct line of sight of the object. Thus the radiant emittance from the surface of one object depends on the position of all the other objects in a scene.

In computer graphics, problems of this type are studied in detail, in search of photorealistic computer-generated images. A big step towards this goal was a method called radiosity which explicitly solved the interrelation of object emittance described above [46]. A general expression for the object emittance, the now famous rendering equation, was derived by Kajiya [89].

In image sequence processing, it is in principle required to invert this equation to infer the surface reflectivity from the measured object emittance. The surface reflectivity is a feature invariant to surface orientation and the position of other objects and thus would be ideal for motion estimation. Such an approach is unrealistic, however, because it requires a reconstruction of the 3-D scene before the inversion of the rendering equation can be tackled at all.

As there is no generally valid continuity equation for optical flow, it is important to compare possible additional terms with the terms in the standard BCCE. All other terms basically depend on the rate of change of a number of quantities but not on the brightness gradients. If the gray value gradient is large, the influence of the additional terms becomes small. Thus we can conclude that the determination of the velocity is most reliable for steep gray value edges while it may be significantly distorted in regions with only small gray value gradients. This conclusion is in agreement with the findings of Verri and Poggio [190, 191], who point out the difference between optical flow and the motion field.

Another observation is important. It is certainly true that the historical approach of determining the displacement vectors from only two consecutive images is not robust. In general we cannot distinguish whether a gray value change comes from a displacement or any other source. However, the optical flow becomes more robust in space-time images. We will demonstrate this with two examples.

First, it is possible to separate gray value changes caused by global illumination changes from those caused by motion. Figure 14.11 shows an image sequence of a static scene taken at a rate of 5 frames per minute. The two spatiotemporal time slices (Fig. 14.11a, c), indicated by the two white horizontal lines in Fig. 14.11b, cover a period of about 3.4 h. The upper line covers the high-rise building and the sky. From the sky it can be seen that it was partly cloudy, but sometimes there was direct solar illumination. The lower line crosses several roof windows, walls, and house roofs.

Figure 14.11: Static scene with illumination changes: a xt cross section at the upper marked row (sky area) in b; b first image of the sequence; c xt cross section at the lower marked row (roof area) in b; the time axis spans 3.4 h, running downwards (from Jähne [80]).

In both slices the illumination changes appear as horizontal stripes which seem to transparently overlay the vertical stripes, indicating a static scene. As a horizontal pattern indicates an object moving with infinite velocity, these patterns can be eliminated, e.g., by directional filtering, without disturbing the motion analysis.

Figure 14.12: Traffic scene at the border of Hanau, Germany; a last image of the sequence; b xt cross section at the marked line in a; the time axis spans 20.5 s, running downwards (from Jähne [80]).
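A very crude form of such a directional filter can be sketched as follows. The example rests on simplifying assumptions of our own: the illumination change is modeled as purely additive and identical for all pixels of a row, and the filter simply zeroes the Fourier components with wave number k = 0 and non-zero temporal frequency. Real illumination changes are often closer to multiplicative, and a practical directional filter would be smoother. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
nt, nx = 100, 64
static_row = rng.random(nx)                       # static scene: one image row
illum = 0.3 * np.sin(np.linspace(0.0, 6.0, nt))   # global illumination over time

# xt slice g[t, x]: vertical stripes (static scene) plus horizontal stripes.
g = static_row[None, :] + illum[:, None]

def remove_global_illumination(g_xt):
    """Zero all spectral components with k = 0 and temporal frequency != 0."""
    G = np.fft.fft2(g_xt)          # axis 0: temporal frequency, axis 1: wave number
    G[1:, 0] = 0.0                 # keep only the overall mean at k = 0
    return np.real(np.fft.ifft2(G))

filtered = remove_global_illumination(g)

# After filtering, every row equals the static row plus a constant offset.
print(np.ptp(filtered, axis=0).max())
```

Structures with finite velocity are inclined in the xt slice and have their energy at k ≠ 0, so in this idealized setting they would pass the filter unchanged.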

The second example demonstrates that motion determination is still possible in space-time images if occlusions occur and the local illumination of an object is changing because it is turning. Figure 14.12 shows a traffic scene at the city limits of Hanau, Germany. From the last image of the sequence (Fig. 14.12a) we see that a street crossing with a traffic light is observed through the branches of a tree located on the right in the foreground. One road is running horizontally from left to right, with the traffic light on the left.

The spatiotemporal slice (Fig. 14.12b) has been cut through the image sequence at the horizontal line indicated in Fig. 14.12a. It reveals various occlusions: the car traces disappear under the static vertical patterns of the tree branches and traffic signs. We can also see that the temporal trace of the van shows significant gray value changes because it turned at the street crossing and the illumination conditions are changing while it is moving along in the scene. Nevertheless, the temporal trace is continuous and promises a reliable velocity estimate.

We can conclude that the best approach is to stick to the standard BCCE for motion estimates and use it to develop the motion estimators in this chapter. Because of the wide variety of additional terms this approach still seems to be the most reasonable and most widely applicable, because it contains the fundamental constraint.

14.3 First-Order Differential Methods†

Basics

Differential methods are the classical approach to determine motion from two consecutive images. This chapter discusses the question of how these techniques can be applied to space-time images. The continuity equation for the optical flow (Section 14.2.6), in short the BCCE or OFC, is the starting point for differential methods:

$$\frac{\partial g}{\partial t} + \mathbf{f}\,\nabla g = 0. \tag{14.13}$$

This single scalar equation contains W unknown vector components in the W-dimensional space. Thus we cannot determine the optical flow f = [f₁, f₂]^T unambiguously. The scalar product f∇g is equal to the magnitude of the gray value gradient multiplied by the component of f in the direction of the gradient, i.e., normal to the local gray value edge

f ∇ g = f_⊥|∇ g|.

Thus we can only determine the optical flow component normal to the edge. This is the well-known aperture problem, which we discussed qualitatively in Section 14.2.2. From Eq. (14.13), we obtain

$$f_\perp = -\frac{\partial g}{\partial t} \bigg/ |\nabla g|. \tag{14.14}$$

Consequently, it is not possible to determine the complete vector with first-order derivatives at a single point in the space-time image.
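The normal flow of Eq. (14.14) can be illustrated with a small numerical experiment. This is a sketch under stated assumptions, not an estimator proposed in the text: derivatives are approximated by central differences over three frames, the threshold `min_grad` and the function name `normal_flow` are hypothetical, and the test pattern is a plane wave whose gray value edges are oriented normal to n = [0.6, 0.8]^T and which moves with u = [1.0, 0.5]^T pixels per frame.

```python
import numpy as np

def normal_flow(g_prev, g_curr, g_next, min_grad=1e-3):
    """Normal flow vector -(dg/dt) * grad(g) / |grad(g)|^2 at every pixel."""
    g_t = 0.5 * (g_next - g_prev)            # central difference in time
    g_y, g_x = np.gradient(g_curr)           # axis 0 is y, axis 1 is x
    grad_mag = np.hypot(g_x, g_y)
    valid = grad_mag > min_grad              # gradient must not vanish
    f_perp = np.full(g_curr.shape, np.nan)
    f_perp[valid] = -g_t[valid] / grad_mag[valid]      # Eq. (14.14)
    # Multiply by the unit gradient direction to get a vector.
    fx = np.full(g_curr.shape, np.nan)
    fy = np.full(g_curr.shape, np.nan)
    fx[valid] = f_perp[valid] * g_x[valid] / grad_mag[valid]
    fy[valid] = f_perp[valid] * g_y[valid] / grad_mag[valid]
    return fx[1:-1, 1:-1], fy[1:-1, 1:-1]    # drop one-sided border values

def frame(t, n=(0.6, 0.8), u=(1.0, 0.5), kappa=0.2):
    """Plane wave with edge normal n, translated with velocity u (assumed values)."""
    y, x = np.mgrid[0:32, 0:32]
    return np.sin(kappa * (n[0] * (x - u[0] * t) + n[1] * (y - u[1] * t)))

fx, fy = normal_flow(frame(-1), frame(0), frame(1))
fx_med, fy_med = np.nanmedian(fx), np.nanmedian(fy)
print(fx_med, fy_med)    # close to (u . n) n = [0.6, 0.8], not to u = [1.0, 0.5]
```

The result recovers only the projection of u onto the edge normal; the component of the motion along the edge is lost, which is the aperture problem in numerical form.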
