

A Simple Example: Linear Regression



The fit of a straight line to a set of experimental data points (xq, yq) is a simple example of a discrete inverse problem. As illustrated in Fig. 17.6, the quantity y is measured as a function of a parameter x. In this case, our model is a straight line with two parameters, the offset a0 and the slope a1: y = a0 + a1 x. With a set of Q data points [xq, yq]^T we end up with the linear equation system


$$
\begin{bmatrix}
1 & x_1 \\
1 & x_2 \\
\vdots & \vdots \\
1 & x_Q
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}
=
\begin{bmatrix}
y_1 \\ y_2 \\ \vdots \\ y_Q
\end{bmatrix}
\qquad (17.50)
$$

which can be abbreviated by


$$\mathbf{M}\,\mathbf{p} = \mathbf{d}. \qquad (17.51)$$

The Q × 2 matrix M is denoted as the model matrix or design matrix. This matrix reflects both the type of the model (here a linear regression) and the chosen independent measuring points xq. The model or parameter vector p contains the parameters of the model to be estimated, and the data vector d contains the measured data yq.
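As a concrete illustration of Eq. (17.50), the following sketch builds the design matrix for a straight-line fit in Python/NumPy; the measuring points and data values are made up for illustration only:

```python
import numpy as np

# Hypothetical measuring points x_q and measured values y_q (illustrative only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Q x 2 design matrix M as in Eq. (17.50): a column of ones for the offset a0
# and the measuring points x_q for the slope a1.
M = np.column_stack([np.ones_like(x), x])
print(M.shape)  # (5, 2): Q = 5 equations for 2 unknowns
```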

If we have only two data points which do not coincide, x1 ≠ x2, we get an exact solution of the linear equation system. If more than two data points are available, we have more equations than unknowns. We say that the equation system is an overdetermined inverse problem. In this case, it is generally no longer possible to obtain an exact solution. We can only compute an estimate of the model parameters p_est in the sense that the deviation of the data d from the data predicted with the model, d_pre = M p_est, is minimal. This deviation can be expressed by an error vector e:


 

 









$$\mathbf{e} = \mathbf{d} - \mathbf{d}_{\text{pre}} = \mathbf{d} - \mathbf{M}\,\mathbf{p}_{\text{est}}. \qquad (17.52)$$


Error Norms


In order to minimize the error vector we need a suitable measure. We may use norms, which we discussed when using inner product vector spaces in Section 2.3.1. Generally, the Ln norm of the Q-dimensional vector e is defined as

$$\|\mathbf{e}\|_n = \left( \sum_{q=1}^{Q} |e_q|^n \right)^{1/n}. \qquad (17.53)$$


A special case is the L∞ norm

$$\|\mathbf{e}\|_\infty = \max_q |e_q|. \qquad (17.54)$$

The L2 norm is more commonly used; it is the root of the sum of the squared deviations of the error vector elements

$$\|\mathbf{e}\|_2 = \left( \sum_{q=1}^{Q} \left( d_q - d_{\text{pre},q} \right)^2 \right)^{1/2}. \qquad (17.55)$$


Higher norms weight larger deviations more heavily. The statistics of the data points determines which norm is to be taken. If the measured data points yq have a normal density (Section 3.4.2), the L2 norm must be used [124].
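A small sketch of these norms, using a hypothetical error vector whose values are chosen only for illustration:

```python
import numpy as np

e = np.array([0.3, -0.1, 0.4, -0.2])  # hypothetical error vector

def ln_norm(e, n):
    """General Ln norm of Eq. (17.53)."""
    return np.sum(np.abs(e) ** n) ** (1.0 / n)

l1   = ln_norm(e, 1)        # sum of absolute deviations
l2   = ln_norm(e, 2)        # root of the sum of squares, Eq. (17.55)
linf = np.max(np.abs(e))    # maximum norm, Eq. (17.54)
print(l1, l2, linf)
```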

 










Least Squares Solution

The overdetermined linear inverse problem is solved with a minimum L2 norm of the error vector by

$$\mathbf{p}_{\text{est}} = \left( \mathbf{M}^T \mathbf{M} \right)^{-1} \mathbf{M}^T \mathbf{d}. \qquad (17.56)$$

 

This solution can be made plausible by the following sequence of operations:

 


$$
\begin{aligned}
\mathbf{M}\,\mathbf{p}_{\text{est}} &= \mathbf{d} \\
\mathbf{M}^T \mathbf{M}\,\mathbf{p}_{\text{est}} &= \mathbf{M}^T \mathbf{d} \\
\mathbf{p}_{\text{est}} &= \left( \mathbf{M}^T \mathbf{M} \right)^{-1} \mathbf{M}^T \mathbf{d},
\end{aligned}
\qquad (17.57)
$$

provided that the inverse of M^T M exists.
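A minimal sketch of Eqs. (17.52) and (17.56) with NumPy, reusing the hypothetical straight-line data from above:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # hypothetical measuring points
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])   # hypothetical measured data
M = np.column_stack([np.ones_like(x), x])

# Least squares estimate via the normal equations, Eq. (17.56).
p_est = np.linalg.solve(M.T @ M, M.T @ y)
print(p_est)                 # estimated [a0, a1]

# Error vector and its L2 norm, Eqs. (17.52) and (17.55).
e = y - M @ p_est
print(np.linalg.norm(e))
```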

 

Geometric Illustration

Figure 17.7: Geometric illustration of the solution of a linear equation system with three unknowns using the Hough transform: a exactly solvable equation system; b overdetermined equation system with a non-unique solution.

Before we study methods for solving huge linear equation systems, it is helpful to illustrate linear equation systems geometrically. The P model parameters p span a P-dimensional vector space. This space can be regarded as the space of all possible solutions of an inverse problem with P model parameters. Now, we ask ourselves what it means to have one data point dq. According to Eq. (17.51), one data point results in one linear equation involving all model parameters:

$$\sum_{p'=1}^{P} m_{qp'}\, p_{p'} = d_q \quad\text{or}\quad \bar{\mathbf{m}}_q\, \mathbf{p} = d_q. \qquad (17.58)$$

This equation can be regarded as the scalar product of a row q of the model matrix, mq, with the model vector p. In the model space, this equation constitutes a (P − 1)-dimensional hyperplane of all vectors p which has a normal vector mq and a distance dq from the origin of the model space. Thus, the linear equation establishes a one-to-one correspondence between a data point in the data space and a (P − 1)-dimensional hyperplane in the model space. This mapping of data points into the model space is called the Hough transform, which we introduced in Section 16.5.2. Each data point reduces the space of possible solutions to a (P − 1)-dimensional hyperplane in the model space.
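For the straight-line model with two parameters, this correspondence can be made explicit: each data point (xq, yq) restricts the model space to the line a0 + a1 xq = yq in the (a0, a1) plane. A small sketch with made-up data points:

```python
import numpy as np

# Hypothetical data points (x_q, y_q); values are illustrative only.
points = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]

# Each data point maps to the line a0 = y_q - a1 * x_q in the (a0, a1) model space.
a1 = np.linspace(-1.0, 3.0, 5)      # a few sample slopes
for xq, yq in points:
    a0 = yq - a1 * xq
    print(f"point ({xq}, {yq}): a0 = {yq} - a1*{xq} ->", a0)
# The solution of the equation system is where all these lines intersect.
```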

Figure 17.7a illustrates the solution of a linear equation system with three unknowns. With three equations, three planes meet at a single point, provided that the corresponding 3 × 3 model matrix is invertible. Even in an overdetermined case, the solution need not necessarily be unique. Figure 17.7b shows a case of five planes intersecting at a line. Then, the solution is not unique, but only restricted to a line. If this line is oriented along one of the axes, the corresponding model parameter may take any value; the two other model parameters, however, are fixed. In the case of an arbitrarily oriented line, things are more complex. Then, the parameter combinations normal to the line are fixed, but the parameter combination represented by a vector in the direction of the line is not. Using the singular value decomposition [54, 143], we can solve singular linear equation systems and separate the solvable from the unsolvable parameter combinations.
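A sketch of this separation using the singular value decomposition; the rank-deficient model matrix below is a made-up example in which two parameters only appear as a sum:

```python
import numpy as np

# Made-up rank-deficient model matrix: columns 2 and 3 are identical, so only
# the sum of the corresponding parameters is determined by the data.
M = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 3.0, 3.0],
              [1.0, 4.0, 4.0]])
d = np.array([2.0, 3.0, 4.0, 5.0])

U, s, Vt = np.linalg.svd(M, full_matrices=False)
print(s)                      # one singular value is (numerically) zero

# Invert only the singular values above a threshold; the parameter combination
# belonging to the vanishing singular value remains undetermined.
tol = 1e-10
s_inv = np.zeros_like(s)
s_inv[s > tol] = 1.0 / s[s > tol]
p_est = Vt.T @ (s_inv * (U.T @ d))   # minimum-norm least squares solution
print(p_est)
```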

An overdetermined linear equation system that has no unique solution is not just a mathematical curiosity. It is rather a common problem in image processing. We have encountered it already, for example in motion determination with the aperture problem (Section 14.3.2).

 

17.6.5 Derivation of the Least Squares Solution‡

In this section we provide a derivation of the solution of the overdetermined discrete linear inverse problem (Eq. (17.51)) that minimizes the L2 norm of the error vector. To this end, we compute the solution explicitly by minimizing the L2 norm of the error vector e (Eq. (17.52)):


$$
\|\mathbf{e}\|_2^2 = \sum_{q'=1}^{Q} \left( d_{q'} - \sum_{p'=1}^{P} m_{q'p'}\, p_{p'} \right) \left( d_{q'} - \sum_{p''=1}^{P} m_{q'p''}\, p_{p''} \right).
$$


Factorizing the sum and interchanging the two summations yields

 

$$
\|\mathbf{e}\|_2^2 =
\underbrace{\sum_{p'=1}^{P} \sum_{p''=1}^{P} p_{p'}\, p_{p''} \sum_{q'=1}^{Q} m_{q'p'}\, m_{q'p''}}_{A}
\; - \;
\underbrace{2 \sum_{p'=1}^{P} p_{p'} \sum_{q'=1}^{Q} m_{q'p'}\, d_{q'}}_{B}
\; + \;
\sum_{q'=1}^{Q} d_{q'}\, d_{q'}. \qquad (17.59)
$$

We find a minimum for this expression by computing the partial derivatives with respect to the parameters pk that are to be optimized. Only the expressions A and B in Eq. (17.59) depend on pk:


$$
\begin{aligned}
\frac{\partial A}{\partial p_k}
&= \sum_{p'=1}^{P} \sum_{p''=1}^{P} \left( \delta_{k-p''}\, p_{p'} + \delta_{k-p'}\, p_{p''} \right) \sum_{q'=1}^{Q} m_{q'p'}\, m_{q'p''} \\
&= \sum_{p'=1}^{P} p_{p'} \sum_{q'=1}^{Q} m_{q'p'}\, m_{q'k} + \sum_{p''=1}^{P} p_{p''} \sum_{q'=1}^{Q} m_{q'k}\, m_{q'p''} \\
&= 2 \sum_{p'=1}^{P} p_{p'} \sum_{q'=1}^{Q} m_{q'p'}\, m_{q'k},
\end{aligned}
$$

$$
\frac{\partial B}{\partial p_k} = 2 \sum_{q'=1}^{Q} m_{q'k}\, d_{q'}.
$$

We add both derivatives and set them equal to zero:

$$
\frac{\partial \|\mathbf{e}\|_2^2}{\partial p_k}
= 2 \sum_{p'=1}^{P} p_{p'} \sum_{q'=1}^{Q} m_{q'k}\, m_{q'p'} - 2 \sum_{q'=1}^{Q} m_{q'k}\, d_{q'} = 0.
$$



In order to express the sums as matrix-matrix and matrix-vector multiplications, we substitute the matrix M at two places by its transpose M^T:

$$
\sum_{p'=1}^{P} p_{p'} \sum_{q'=1}^{Q} m^T_{kq'}\, m_{q'p'} - \sum_{q'=1}^{Q} m^T_{kq'}\, d_{q'} = 0
$$

and finally obtain the matrix equation

$$
\underbrace{\mathbf{M}^T}_{P \times Q}\underbrace{\mathbf{M}}_{Q \times P}\;\underbrace{\mathbf{p}_{\text{est}}}_{P}
\;=\;
\underbrace{\mathbf{M}^T}_{P \times Q}\;\underbrace{\mathbf{d}}_{Q}. \qquad (17.60)
$$

This equation can be solved if the square and symmetric P × P matrix M^T M is invertible. Then

$$
\mathbf{p}_{\text{est}} = \left( \mathbf{M}^T \mathbf{M} \right)^{-1} \mathbf{M}^T \mathbf{d}. \qquad (17.61)
$$

The matrix (M^T M)^{-1} M^T is known as the generalized inverse M^{-g} of M.
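In NumPy the generalized inverse corresponds to the Moore-Penrose pseudo-inverse; a minimal sketch, reusing the hypothetical regression data from above:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
M = np.column_stack([np.ones_like(x), x])

# Generalized inverse computed explicitly as (M^T M)^{-1} M^T ...
M_gen = np.linalg.inv(M.T @ M) @ M.T
# ... and via the numerically more robust pseudo-inverse.
M_pinv = np.linalg.pinv(M)

print(np.allclose(M_gen, M_pinv))   # True for a full-rank M
print(M_gen @ y)                    # p_est as in Eq. (17.61)
```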

 

