

7.2 World and Camera Coordinates



7.2.1 Definition

Basically, the position of objects in 3-D space can be described in two different ways (Fig. 7.1). First, we can use a coordinate system which is related to the scene observed. These coordinates are called world coordinates and denoted as X' = [X1', X2', X3']T. The X1' and X2' coordinates describe the horizontal and the vertical positions, respectively.

 


B. Jähne, Digital Image Processing. Copyright © 2002 by Springer-Verlag.

ISBN 3-540-67754-2. All rights of reproduction in any form reserved.



 


 

Figure 7.1: Illustration of world and camera coordinates.

 

 

Sometimes, an alternative convention with non-indexed coordinates X' = [X', Y', Z']T is more convenient. Both notations are used in this book.

A second system, the camera coordinates X = [X1, X2, X3]T, can be fixed to the camera observing the scene. The X3 axis is aligned with the optical axis of the camera system (Fig. 7.1). Physicists are familiar with such considerations. It is common to discuss physical phenomena in different coordinate systems. In elementary mechanics, for example, motion is studied with respect to two observers, one at rest, the other moving with the object.

Transition from world to camera coordinates generally requires a translation and a rotation. First, we shift the origin of the world coordinate system to the origin of the camera coordinate system by the translation vector T (Fig. 7.1). Then we change the orientation of the shifted system by rotations about suitable axes so that it coincides with the camera coordinate system. Mathematically, translation can be described by vector subtraction and rotation by multiplication of the coordinate vector with a matrix:


 

7.2.2 Rotation


X = R ( X ' − T ).                                                 (7.1)


Rotation of a coordinate system has two important features. It does not change the length or norm of a vector and it keeps the coordinate system orthogonal. A transformation with these features is known in linear algebra as an orthonormal transform.

The coefficients in a transformation matrix have an intuitive meaning. This can be seen when we apply the transformation to unit vectors Ēp in the direction of the coordinate axes. With Ē1, for instance, we obtain


              | a11  a12  a13 | | 1 |   | a11 |
Ē'1 = A Ē1 =  | a21  a22  a23 | | 0 | = | a21 | .                    (7.2)
              | a31  a32  a33 | | 0 |   | a31 |


Thus, the columns of the transformation matrix give the coordinates of the base vectors in the new coordinate system. Knowing this property, it is easy to formulate the orthonormality condition that has to be met by the rotation matrix R:


R^T R = I    or    Σ (m = 1..3)  r_km r_lm = δ_kl ,                  (7.3)


where I denotes the identity matrix, whose elements are one and zero on diagonal and non-diagonal positions, respectively. Using Eq. (7.2), this equation simply states that the transformed base vectors remain orthogonal:

Ē'k^T Ē'l = δ_kl .                                                   (7.4)

Eq. (7.3) leaves three matrix elements independent out of nine. Unfortunately, the relationship between the matrix elements and three parameters to describe rotation turns out to be quite complex and nonlinear. A common procedure involves the three Eulerian rotation angles (φ, θ, ψ). A lot of confusion exists in the literature about the definition of the Eulerian angles. We follow the standard mathematical approach. We use right-hand coordinate systems and count rotation angles positive in the counterclockwise direction. The rotation from the shifted world coordinate system to the camera coordinate system is decomposed into three steps (see Fig. 7.2, [53]).

1. Rotation about the X3' axis by the angle φ, X'' = R_φ X':

         |  cos φ   sin φ   0 |
   R_φ = | −sin φ   cos φ   0 |                                      (7.5)
         |    0       0     1 |

2. Rotation about the X1'' axis by θ, X''' = R_θ X'':

         | 1     0        0    |
   R_θ = | 0   cos θ    sin θ  |                                     (7.6)
         | 0  −sin θ    cos θ  |

3. Rotation about the X3''' axis by ψ, X = R_ψ X''':

         |  cos ψ   sin ψ   0 |
   R_ψ = | −sin ψ   cos ψ   0 |                                      (7.7)
         |    0       0     1 |



 



Figure 7.2: Rotation of world coordinates X' to camera coordinates X using the three Eulerian angles (φ, θ, ψ) with successive rotations about (a) the X3', (b) the X1'', and (c) the X3''' axes.

 

Cascading the three rotations, R = R_ψ R_θ R_φ, yields the matrix

|  cos ψ cos φ − cos θ sin φ sin ψ     cos ψ sin φ + cos θ cos φ sin ψ    sin θ sin ψ |
| −sin ψ cos φ − cos θ sin φ cos ψ    −sin ψ sin φ + cos θ cos φ cos ψ    sin θ cos ψ | .
|          sin θ sin φ                         −sin θ cos φ                  cos θ    |
 

The inverse transformation from camera coordinates to world coordinates is given by the transpose of the above matrix. Since matrix multiplication is not commutative, rotation is also not commutative. Therefore, it is important not to interchange the order in which rotations are performed.
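The rotation cascade and the orthonormality condition Eq. (7.3) are easy to verify numerically. The following sketch (the angle values are arbitrary illustration choices, not taken from the text) builds the elementary rotations of Eqs. (7.5)-(7.7), checks R^T R = I, compares against the closed-form matrix above, and confirms that swapping the order of rotations changes the result:

```python
# Numerical check of the Euler-angle cascade R = R_psi R_theta R_phi
# and its orthonormality (Eq. 7.3). Angles are illustrative only.
import numpy as np

def rot_z(a):
    """Rotation about the X3 axis, counterclockwise positive (Eqs. 7.5/7.7)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(a):
    """Rotation about the X1 axis (Eq. 7.6)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

phi, theta, psi = 0.3, 0.5, 0.7          # arbitrary Euler angles
R = rot_z(psi) @ rot_x(theta) @ rot_z(phi)

# Orthonormality: R^T R = I, so the inverse is just the transpose.
assert np.allclose(R.T @ R, np.eye(3))

# Compare with the closed-form cascaded matrix given in the text.
c, s = np.cos, np.sin
R_closed = np.array([
    [c(psi)*c(phi) - c(theta)*s(phi)*s(psi),
     c(psi)*s(phi) + c(theta)*c(phi)*s(psi),  s(theta)*s(psi)],
    [-s(psi)*c(phi) - c(theta)*s(phi)*c(psi),
     -s(psi)*s(phi) + c(theta)*c(phi)*c(psi), s(theta)*c(psi)],
    [s(theta)*s(phi), -s(theta)*c(phi), c(theta)]])
assert np.allclose(R, R_closed)

# Rotation does not commute: reversing the order gives a different matrix.
R_swapped = rot_z(phi) @ rot_x(theta) @ rot_z(psi)
assert not np.allclose(R, R_swapped)
```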

Rotation is only commutative in the limit of an infinitesimal rotation. Then the cosine and sine terms reduce to 1 and ε, respectively. This limit has some practical applications since minor rotational misalignments are common.

Rotation about the X3 axis, for instance, can then be described by

             |  1   ε   0 |                X1 = X1' + ε X2'
X = R_ε X' = | −ε   1   0 | X'     or      X2 = X2' − ε X1'
             |  0   0   1 |                X3 = X3' .


 

As an example we discuss the rotation of the point [X1', 0, 0]T. It is rotated into the point [X1', ε X1', 0]T, while the correct position would be [X1' cos ε, X1' sin ε, 0]T. Expanding the trigonometric functions in a Taylor series to third order yields a position error of [1/2 ε² X1', 1/6 ε³ X1', 0]T. For a 512 × 512 image (X1' ≤ 256 for centered rotation) and an error limit of less than 1/20 pixel, ε must be smaller than ±0.02 or ±1.15°. This is still a significant rotation, vertically displacing rows by up to ε X1' ≈ 5 pixels.
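The numbers in this example can be checked directly. A short sketch, assuming ε = 0.02 and X1' = 256 as in the text:

```python
# Check of the small-angle numbers: for a centered 512x512 image
# (X1' <= 256) and epsilon = 0.02 rad, the linearized rotation
# (cos -> 1, sin -> epsilon) stays within ~1/20 pixel of the exact one.
import math

eps, X1 = 0.02, 256.0

# Leading Taylor terms of the position error [X1(1 - cos e), X1(e - sin e)]:
err_x1 = 0.5 * eps**2 * X1      # ~ 1/2 e^2 X1'
err_x2 = eps**3 * X1 / 6.0      # ~ 1/6 e^3 X1'
assert abs(err_x1 - X1 * (1 - math.cos(eps))) < 1e-5
assert abs(err_x2 - (eps * X1 - X1 * math.sin(eps))) < 1e-7

# The dominant error is about 1/20 pixel at the image border:
assert err_x1 < 0.052

# The row displacement eps * X1' is nevertheless about 5 pixels:
assert round(eps * X1) == 5
```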



 

 

Figure 7.3: Image formation with a pinhole camera.

 

7.3 Ideal Imaging: Perspective Projection†












































7.3.1 The Pinhole Camera

The basic geometric aspects of image formation by an optical system are well modeled by a pinhole camera. The imaging element of this camera is an infinitesimally small hole (Fig. 7.3). The single light ray coming from a point of the object at [X1, X2, X3]T which passes through this hole meets the image plane at [x1, x2, −d']T. Through this condition an image of the object is formed on the image plane. The relationship between the 3-D world and the 2-D image coordinates [x1, x2]T is given by


x1 = − d' X1 / X3 ,      x2 = − d' X2 / X3 .                         (7.8)
The two world coordinates parallel to the image plane are scaled by the factor d'/X3. Therefore, the image coordinates [x1, x2]T contain only ratios of world coordinates, from which neither the distance nor the true size of an object can be inferred.
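This scaling property can be illustrated with a minimal sketch of Eq. (7.8); the function name and the numeric values are illustrative assumptions:

```python
# Sketch of the pinhole projection Eq. (7.8): a world point [X1, X2, X3]
# maps to the image point x_i = -d' X_i / X3. Values are illustrative.
def project_pinhole(X, d_img):
    """Perspective projection of a 3-D point onto the image plane."""
    X1, X2, X3 = X
    return (-d_img * X1 / X3, -d_img * X2 / X3)

# Doubling both the size and the distance of an object leaves its image
# unchanged: only ratios of world coordinates survive the projection.
near = project_pinhole((1.0, 2.0, 10.0), d_img=0.05)
far = project_pinhole((2.0, 4.0, 20.0), d_img=0.05)
assert abs(near[0] - far[0]) < 1e-12 and abs(near[1] - far[1]) < 1e-12
```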

A straight line in the world space is projected onto a straight line at the image plane. This important feature can be proved by a simple geometric consideration. All light rays emitted from a straight line pass through the pinhole. Consequently they all lie on a plane which is spanned by the straight line and the pinhole. This plane intersects with the image plane in a straight line.

All object points on a ray through the pinhole are projected onto a single point in the image plane. In a scene with several transparent objects, the objects are projected onto each other. Then we cannot infer the three-dimensional structure of the scene at all. We may not even be able to recognize the shape of individual objects. This example demonstrates how much information is lost by projection of a 3-D scene onto a 2-D image plane.



 

Figure 7.4: Occlusion of more distant objects and surfaces by perspective projection.

 

Most natural scenes, however, contain opaque objects. Here the observed 3-D space is essentially reduced to 2-D surfaces. These surfaces can be described by two two-dimensional functions g(x1, x2) and X3(x1, x2) instead of the general description of a 3-D scalar gray value image g(X1, X2, X3). A surface in space is completely projected onto the image plane provided that not more than one point of the surface lies on the same ray through the pinhole. If this condition is not met, parts of the surface remain invisible. This effect is called occlusion. The occluded 3-D space can be made visible if we put a point light source at the position of the pinhole (Fig. 7.4). Then the invisible parts of the scene lie in the shadow of those objects which are closer to the camera.

As long as we can exclude occlusion, we only need the depth map X3(x1, x2) to reconstruct the 3-D shape of a scene completely. One way to produce it — which is also used by our visual system — is by stereo imaging, i.e., observation of the scene with two sensors from different points of view (Section 8.2.1).

 





7.3.2 Projective Imaging

Imaging with a pinhole camera is essentially a perspective projection, because all rays must pass through one central point, the pinhole. Thus the pinhole camera model is very similar to imaging with penetrating rays, such as x-rays, emitted from a point source (Fig. 7.5). In this case, the object lies between the central point and the image plane.

The projection equation corresponds to Eq. (7.8) except for the sign:

[ X1, X2, X3 ]T  →  [ x1, x2 ]T = [ d' X1 / X3 ,  d' X2 / X3 ]T .    (7.9)



 

 

Figure 7.5: Perspective projection with x-rays.

 

 

The image coordinates divided by the image distance d' are called generalized image coordinates:

x̃1 = x1 / d' ,      x̃2 = x2 / d' .                                  (7.10)


 

Generalized image coordinates are dimensionless and denoted by a tilde. They are equal to the tangent of the angle with respect to the optical axis of the system with which the object is observed. These coordinates explicitly take the limitations of the projection onto the image plane into account. From these coordinates, we cannot infer absolute positions but know only the angle at which the object is projected onto the image plane. The same coordinates are used in astronomy. The general projection equation of perspective projection Eq. (7.9) then reduces to


X = [ X1, X2, X3 ]T  →  x̃ = [ X1 / X3 ,  X2 / X3 ]T .               (7.11)

 

We will use this simplified projection equation in all further considerations. For optical imaging, we just have to include a minus sign or, if speaking geometrically, reflect the image at the origin of the coordinate system.
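The tangent interpretation of the generalized coordinates can be sketched in a few lines (the numeric point is an arbitrary illustration):

```python
# Generalized image coordinates (Eqs. 7.10/7.11): x~_i = X_i / X3 equals
# the tangent of the angle between the viewing ray and the optical axis.
import math

X1, X2, X3 = 3.0, 0.0, 4.0          # arbitrary world point
x1_t = X1 / X3                      # Eq. (7.11), first component
angle = math.atan2(X1, X3)          # viewing angle in the X1-X3 plane
assert math.isclose(x1_t, math.tan(angle))

# Dimensionless: unchanged under any overall scaling of the scene.
assert math.isclose(x1_t, (10 * X1) / (10 * X3))
```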

 

7.3.3 Homogeneous Coordinates‡

In computer graphics, the elegant formalism of homogeneous coordinates [37, 46, 122] is used to describe all the transformations we have discussed so far, i.e., translation, rotation, and perspective projection, in a unified framework. This formalism is significant, because the whole image formation process can be expressed by a single 4 × 4 matrix.

Homogeneous coordinates are represented by a four-component column vector

X' = [ tX1', tX2', tX3', t ]T ,                                      (7.12)



 

from which ordinary three-dimensional coordinates are obtained by dividing the first three components of the homogeneous coordinates by the fourth. Any arbitrary transformation can be obtained by premultiplying the homogeneous coordinates with a 4 × 4 matrix M. In particular, we can obtain the image coordinates

x = [sx1, sx2, sx3, s]T                                (7.13)

by

x = MX.                                                      (7.14)

Since matrix multiplication is associative, we can view the matrix M as composed of many transformation matrices, performing such elementary transformations as translation, rotation around a coordinate axis, perspective projection, and scaling. The transformation matrices for the elementary transforms are readily derived:


    | 1  0  0  T1 |
T = | 0  1  0  T2 |              Translation by [T1, T2, T3]T
    | 0  0  1  T3 |
    | 0  0  0  1  |

       | 1     0        0     0 |
R_x1 = | 0   cos θ   −sin θ   0 |       Rotation about the X1 axis by θ
       | 0   sin θ    cos θ   0 |
       | 0     0        0     1 |

       |  cos φ   0   sin φ   0 |
R_x2 = |    0     1     0     0 |       Rotation about the X2 axis by φ
       | −sin φ   0   cos φ   0 |
       |    0     0     0     1 |

       | cos ψ   −sin ψ   0   0 |
R_x3 = | sin ψ    cos ψ   0   0 |       Rotation about the X3 axis by ψ
       |   0        0     1   0 |
       |   0        0     0   1 |

    | s1  0   0   0 |
S = | 0   s2  0   0 |              Scaling
    | 0   0   s3  0 |
    | 0   0   0   1 |

    | 1   0     0     0 |
P = | 0   1     0     0 |              Perspective projection
    | 0   0     1     0 |
    | 0   0  −1/d'    1 |

                                                                     (7.15)


 

Perspective projection is formulated slightly differently from the definition in Eq. (7.11). Premultiplication of the homogeneous vector

X = [tX1, tX2, tX3, t]T


with P yields

[ tX1 ,  tX2 ,  tX3 ,  t (d' − X3) / d' ]T ,                         (7.16)




Figure 7.6: Black box model of an optical system.

from which we obtain the image coordinates by division through the fourth coordinate:

[ x1, x2 ]T = [ d' X1 / (d' − X3) ,  d' X2 / (d' − X3) ]T .          (7.17)

From this equation we can see that the image plane is positioned at the origin, since if X3 = 0, both image and world coordinates are identical. The center of projection has been shifted to [0, 0, −d']T.

Complete transformations from world coordinates to image coordinates can be composed of these elementary matrices. Strat [179], for example, proposed the following decomposition:

M = CSPR z R y R x T .                                               (7.18)

The scaling S and cropping (translation) C are transformations taking place in the two-dimensional image plane. Strat [179] shows how the complete transformation parameters from camera to world coordinates can be determined in a noniterative way from a set of calibration points whose positions in the space are exactly known. In this way, an absolute calibration of the outer camera parameters position and orientation and the inner parameters piercing point of the optical axis, focal length, and pixel size can be obtained.
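A reduced version of the pipeline in Eq. (7.18) can be sketched with the elementary matrices of Eq. (7.15). The sketch below omits scaling S and cropping C, and all numeric values (d', the test point) are illustrative assumptions:

```python
# Sketch of a homogeneous-coordinate pipeline using the elementary 4x4
# matrices of Eq. (7.15). Numeric values are illustrative assumptions.
import numpy as np

def translation(T1, T2, T3):
    M = np.eye(4)
    M[:3, 3] = [T1, T2, T3]
    return M

def rot_x3(psi):
    """Rotation about the X3 axis, Eq. (7.15)."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0],
                     [0, 0, 1, 0], [0, 0, 0, 1.0]])

def perspective(d_img):
    """Projection matrix P of Eq. (7.15)."""
    M = np.eye(4)
    M[3, 2] = -1.0 / d_img
    return M

def from_homogeneous(x):
    """Divide the first three components by the fourth."""
    return x[:3] / x[3]

# Compose M = P R T (Eq. 7.18 without scaling and cropping):
d_img = 2.0
M = perspective(d_img) @ rot_x3(0.0) @ translation(0.0, 0.0, 0.0)
X = np.array([1.0, 2.0, -3.0, 1.0])      # homogeneous world point
x = from_homogeneous(M @ X)

# Eq. (7.17): x_i = d' X_i / (d' - X3); here d' = 2, X3 = -3 -> factor 2/5.
assert np.allclose(x[:2], [2 * 1 / 5, 2 * 2 / 5])
```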

 


















































7.4 Real Imaging

