Shape Presentation and Analysis

Introduction

All operations discussed in Chapters 11–15 extracted features from im- ages that are represented as images again. Even the morphological op- erations discussed in Chapter 18 that analyze and modify the shape of segmented objects in binary images work in this way. It is obvious, how- ever, that the shape of objects can be represented in a much more com- pact form. All information on the shape of an object is, for example, contained in its boundary pixels.

In Section 19.2, we therefore address the question of how to represent a segmented object. We will study the representation of binary objects with the run-length code (Section 19.2.1), the quadtree (Section 19.2.2), and the chain code (Section 19.2.3). Two further object representations, moments and Fourier descriptors, are of such signiﬁ cance that we devote entire sections to them (Sections 19.3 and 19.4).

A compact representation for the shape of objects is not of much use if it takes a lot of eﬀ ort to compute it and if it is cumbersome to com- pute shape parameters directly from it. Therefore we address also the question of shape parameter extraction from the diﬀ erent shape repre- sentations in Section 19.5.

Shape parameters are extracted from objects in order to describe their shape, to compare it to the shape of template objects, or to parti- tion objects into classes of diﬀ erent shapes. In this respect the impor- tant questions arises how shape parameters can be made invariant on certain transformations. Objects can be viewed from diﬀ erent distances and from diﬀ erent points of view. Thus it is of interest to ﬁ nd shape pa- rameters that are scale and rotation invariant or that are even invariant under aﬃ ne or perspective projection.

Representation of Shape

Run-Length Code

A compact, simple, and widely used representation of an image is run- length code. The run-length code is produced by the following procedure. An image is scanned line by line. If a line contains a sequence of p equal

495

ISBN 3–540–67754–2 All rights of reproduction in any form reserved.

496 19 Shape Presentation and Analysis

a) Gray value image

Original line (hex): 12 12 12 20 20 20 20 25 27 25 20 20 20 20 20 20

Code (hex): 82 12 83 20 2 25 27 25 85 20

b) Binary image

Original line (hex): 1 1 1 1 1 1 0 0 0 1 1 1 0 0 1 0 0 0 0 0 1 1 1 1 1 1 1 1

Code (hex) 0 6 3 3 2 1 5 8

Figure 19.1: Demonstration of the run-length code for a gray value image,

b binary image.

pixels, we do not store p times the same value, but store the pixel value once and indicate that it occurs p times (Fig. 19.1). In this way, large uniform line segments can be stored very eﬃ ciently.

For binary images, the code can be especially eﬃ cient as only the two pixel values zero and one occur. As a sequence of zeroes is always fol- lowed by a sequence of ones, there is no need to store the pixel value. We only need to store the number of times a pixel value occurs (Fig. 19.1b). We must be careful at the beginning of a line, however, as it may begin with a one or a zero. This problem can be resolved if we assume a line to begin with zero. If a line should start with a sequence of ones, we start the run-length code with a zero to indicate that the line begins with a sequence of zero zeroes (Fig. 19.1b).

Run-length code is suitable for compact storage of images. It has become an integral part of several standard image formats, for example, the TGA or the TIFF ﬁ le formats. Run-length code is less useful for direct processing of images, however, because it is not object oriented. As a result, run-length encoding is more useful for compact image storage. Not all types of images can be successfully compressed with this scheme. Digitized gray-scale images, for example, always contain some noise so that the probability for suﬃ ciently long sequences of pixels with the same gray value is very low. However, high data reduction factors can be achieved with binary images and many types of computer-generated gray-scale and color images.

Quadtrees

The run-length codes discussed in Section 19.2.1 are a line-oriented rep- resentation of binary images. Thus they encode one-dimensional rather than two-dimensional data. The two-dimensional structure is actually not considered at all. In contrast, a quadtree is based on the principle of recursive decomposition of space, as illustrated in Fig. 19.2 for a binary image.

19.2 Representation of Shape 497

A b

NW NE SW SE

Figure 19.2: Representation of a binary image by a region quadtree: a succes- sive subdivision of the binary image into quadrants; b the corresponding region quadtree.

First, the whole image is decomposed into four equal-sized quad- rants. If one of the quadrants does not contain a uniform region, i. e., the quadrant is not included entirely in the object or background, it is again subdivided into four subquadrants. The decomposition stops if only uniform quadrants are encountered or if the quadrants ﬁ nally con- tain only one pixel.

The recursive decomposition can be represented in a data structure known in computer science as tree (Fig. 19.2b). At the top level of the tree, the root, the decomposition starts. The root corresponds to the entire binary image. It is connected via four edges to four child nodes which represent from left to right the NW, NE, SW, and SE quadrants. If a quadrant needs no further subdivision, it is represented by a terminal node or a leaf node in the tree. It is called black when the quadrant belongs to an object and white otherwise, indicated by a ﬁ lled and open square, respectively. Nonleaf nodes require further subdivision and are said to be gray and shown as open circles (Fig. 19.2b).

Quadtrees can be encoded, for example, by a depth-ﬁ rst traversal of the tree starting at the root. It is only required to store the type of the node with the symbols b (black), w (white), and g (gray). We start the code with the value of the root node. Then we list the values of the child nodes from left to right. Each time we encounter a gray node, we continue encoding at one level lower in the tree. This rule is applied recursively. This means that we return to a higher level in the tree only after the visited branch is completely encoded down to the lowest level. This is why this encoding is named depth-ﬁ rst.

The example quadtree shown in Fig. 19.2b results in the code

ggwwgwwwbbwggwbwbbwgwbwwgbwgbbwww.

The code becomes more readable if we include a left parenthesis each time we descend one level in the tree and a right parenthesis when we

498 19 Shape Presentation and Analysis

A b

2 1

3 1

4 0 2 0

5 7

6 3

Figure 19.3: Direction coding in a an 8-neighborhood and b a 4-neighborhood.

ascend again

g(g(wwg(wwwb)b)wg(g(wbwb)bwg(wbww))g(bwg(bbww)w)).

However, the code is unique without the parentheses. A quadtree is a compact representation of a binary image if it contains many leaf nodes at high levels. However, in the worst case, for example a regular checker- board pattern, all leaf nodes are at the lowest level. The quadtree then contains as many leaf nodes as pixels and thus requires many more bytes of storage space than the direct representation of the binary image as a matrix.

The region quadtree discussed here is only one of the many possi- bilities for recursive spatial decomposition. Three-dimensional binary images can be recursively decomposed in a similar way. The 3-D im- age is subdivided into eight equally sized octants. The resulting data structure is called a region octree. Quadtrees and octrees have gained signiﬁ cant importance in geographic information systems and computer graphics.

Quadtrees are a more suitable encoding technique for images than the line-oriented run-length code. But they are less suitable for image analy- sis. It is rather diﬃ cult to perform shape analysis directly on quadtrees. Without going into further details, this can be seen from the simple fact that an object shifted by one pixel in any direction results in a com- pletely diﬀ erent quadtree. Region quadtrees share their most important disadvantage with run-length code: the technique is a global image de- composition and not one that represents objects extracted from images in a compact way.

Chain Code

In contrast to run-length code and quadtrees, chain code is an object- related data structure for representing the boundary of a binary object eﬀ ectively on a discrete grid. Instead of storing the positions of all the boundary pixels, we select a starting pixel and store only its coordinate. If we use an algorithm that scans the image line by line, this will be the uppermost left pixel of the object. Then we follow the boundary

19.2 Representation of Shape 499

A b

Figure 19.4: Boundary representation with the chain code: a 8-neighborhood;

b 4-neighborhood.

in a clockwise direction. In a 4-neighborhood there are 4 and in an 8- neighborhood 8 possible directions to go, which we can encode with a 3-bit or 2-bit code as indicated in Fig. 19.3. The boundary and its code extracted in this way is shown in Fig. 19.4 for a 4-neighborhood and an 8-neighborhood.

The chain code shows a number of obvious advantages over the ma- trix representation of a binary object:

First, the chain code is a compact representation of a binary object. Let us assume a disk-like object with a diameter of R pixels. In a direct matrix representation we need to store the bounding box of the object (see Section 19.5.4), i. e., about R2 pixels which are stored in R2 bits. The bounding rectangle is the smallest rectangle enclosing the object. If we use an 8-connected boundary, the disk has about π R boundary points. The chain code of the π R points can be stored in about 3π R bits. For objects with a diameter larger than 10, the chain code is a more compact representation.

Second, the chain code is a translation invariant representation of a binary object. This property makes it easier to compare objects. How- ever, the chain code is neither rotation nor scale invariant. This is a signiﬁ cant disadvantage for object recognition, although the chain code can still be used to extract rotation invariant parameters, such as the area of the object.

Third, the chain code is a complete representation of an object or curve. Therefore, we can — at least in principle — compute any shape feature from the chain code.

As shown in Section 19.5, we can compute a number of shape pa- rameters — including the perimeter and area — more eﬃ ciently using the chain-code representation than in the matrix representation of the binary image. The limitation here is, of course, that the chain code is a digital curve on a discrete grid and as such describes the boundary of the object only within the precision of the discrete grid.

500 19 Shape Presentation and Analysis

If the object is not connected or if it has holes, we need more than one chain code to represent it. We must also include information on whether the boundary surrounds an object or a hole. Reconstruction of the binary image from a chain code is an easy procedure: we can draw the outline of the object and then use a ﬁ ll operation to paint it.

⇐ Предыдущая 76 77 78 79 808182 83 84 85 Следующая ⇒

Последнее изменение этой страницы: 2019-05-04; Просмотров: 195; Нарушение авторского права страницы