2.5.3 Measures for Fast Algorithms‡
According to the number of arithmetic operations required, there are many other fast Fourier transform algorithms that are still more effective. Most of them are based on polynomial algebra and number theory. An in-depth discussion of these algorithms is given by Blahut [10]. However, the mere number of arithmetic operations is not the only measure of an efficient algorithm. We must also consider a number of other factors.

Access to the data requires additional operations. Consider the simple example of the addition of two vectors. There, besides the addition, the following operations are performed: the addresses of the appropriate elements must be calculated; the two elements are read into registers; and the result of the addition is written back to memory. Depending on the architecture of the hardware used, these five operations constitute a significant overhead which may take much more time than the addition itself. Consequently, an algorithm with a complicated scheme for accessing the elements of a vector might add considerable overhead to the arithmetic operations. In effect, a simpler algorithm with more arithmetic operations but less addressing overhead may be faster.

Another factor for rating algorithms is the amount of storage space needed. This includes not only the space for the code but also the storage space required for intermediate results or for tables of constants. For example, a so-called in-place FFT algorithm, which can perform the Fourier transform on an image without using an intermediate storage area for the image, is very advantageous. Often there is a trade-off between storage space and speed. Many integer FFT algorithms, for example, use precalculated tables for the required constants.

To a large extent, the efficiency of algorithms depends on the computer architecture used to implement them. If multiplication is performed either in software or by a microcoded instruction, it is much slower than addition or memory access. In this case, the aim of fast algorithms is to reduce the number of multiplications, even at the cost of more additions or a more complex memory access pattern. Such a strategy makes no sense on some modern high-speed architectures where pipelined floating-point addition and multiplication take just one clock cycle. The faster the operations on the processor, the more memory access becomes the bottleneck. Fast algorithms must therefore consider effective memory access schemes. It is crucial that as many computations as possible can be performed on one and the same set of data. In this way, these data can be kept in a fast intermediate storage area, known as the memory cache, and no direct access to the much slower general memory (RAM) is required.
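The table idea can be made concrete with a minimal sketch (our own illustration in Python/NumPy, not code from the text; it uses NumPy's unnormalized DFT convention): the phase factors of a fixed-length DFT are computed once and afterwards only looked up.

```python
import numpy as np

N = 512
# Precalculated table of the N phase factors w_N^{-n} = exp(-2*pi*i*n/N),
# trading table storage against repeated evaluation of exp() in the transform.
twiddle = np.exp(-2j * np.pi * np.arange(N) / N)

def dft_with_table(g):
    """Direct O(N^2) DFT that only looks up precalculated phase factors (illustration only)."""
    n = np.arange(N)
    # w_N^{-nv} = twiddle[(n*v) mod N]
    return np.array([np.sum(g * twiddle[(n * v) % N]) for v in range(N)])

g = np.random.rand(N)
print(np.allclose(dft_with_table(g), np.fft.fft(g)))  # True
```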
The FFT algorithm is a classic example of a fast algorithm. The computational savings are enormous. For a 512-element vector, only 1536 complex multiplications are needed instead of the 262 144 required by the direct calculation according to Eq. (2.29). The number of multiplications is thus reduced by a factor of 170. Using the FFT algorithm, the discrete Fourier transform can no longer be regarded as a computationally expensive operation, since only a few operations are necessary per element of the vector. For a vector with 512 elements, only 3 complex multiplications and 8 complex additions, corresponding to 12 real multiplications and 24 real additions, need to be computed per element.
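These figures can be checked with a few lines of arithmetic (a sketch only; the per-element counts of 3 complex multiplications and 8 complex additions are taken from the text above):

```python
N = 512
direct_mults = N * N     # direct DFT: one complex multiplication per matrix element
fft_mults = 3 * N        # FFT: 3 complex multiplications per element (figure from the text)
fft_adds = 8 * N         # FFT: 8 complex additions per element (figure from the text)
print(direct_mults, fft_mults, fft_adds)   # 262144 1536 4096
print(direct_mults / fft_mults)            # ~170.7, i.e. a factor of about 170
```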
2.5.4 Radix-4 Decimation-in-Time FFT‡
Another partition often used is the radix-4 FFT algorithm. We can decompose a vector into four components:

$$
\hat g_v \;=\; \sum_{n=0}^{N/4-1} g_{4n}\, w_N^{-4nv}
\;+\; w_N^{-v}\sum_{n=0}^{N/4-1} g_{4n+1}\, w_N^{-4nv}
\;+\; w_N^{-2v}\sum_{n=0}^{N/4-1} g_{4n+2}\, w_N^{-4nv}
\;+\; w_N^{-3v}\sum_{n=0}^{N/4-1} g_{4n+3}\, w_N^{-4nv}.
$$
Figure 2.24: Signal flow diagram of the radix-2 decimation-in-frequency FFT algorithm for N = 8.

Denoting the N/4-point DFTs of the four interleaved subsequences by ${}^{0}\hat g$, ${}^{1}\hat g$, ${}^{2}\hat g$, and ${}^{3}\hat g$, and exploiting the periodicity of these partial transforms together with the symmetry of the kernel $w_N$, the four quarters of the transformed vector are, in matrix notation, given by
$$
\begin{bmatrix} \hat g_v \\ \hat g_{v+N/4} \\ \hat g_{v+N/2} \\ \hat g_{v+3N/4} \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -\mathrm{i} & -1 & \mathrm{i} \\ 1 & -1 & 1 & -1 \\ 1 & \mathrm{i} & -1 & -\mathrm{i} \end{bmatrix}
\begin{bmatrix} {}^{0}\hat g_v \\ w_N^{-v}\,{}^{1}\hat g_v \\ w_N^{-2v}\,{}^{2}\hat g_v \\ w_N^{-3v}\,{}^{3}\hat g_v \end{bmatrix}.
$$
The number of additions per group of four output values can be reduced from 12 to 8 by factorizing the transformation matrix:

$$
\begin{bmatrix} \hat g_v \\ \hat g_{v+N/4} \\ \hat g_{v+N/2} \\ \hat g_{v+3N/4} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & -\mathrm{i} \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & \mathrm{i} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{bmatrix}
\begin{bmatrix} {}^{0}\hat g_v \\ w_N^{-v}\,{}^{1}\hat g_v \\ w_N^{-2v}\,{}^{2}\hat g_v \\ w_N^{-3v}\,{}^{3}\hat g_v \end{bmatrix}.
$$
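As a numerical cross-check, the following minimal NumPy sketch (our own code, not from the text; the name radix4_dit_step is made up) performs one radix-4 combination step on the four interleaved partial transforms, using the unnormalized DFT convention of numpy.fft, which matches the formulas above. The unfactorized butterfly is used for clarity; the factorized form performs the same combination with 8 instead of 12 complex additions.

```python
import numpy as np

def radix4_dit_step(g):
    """One radix-4 decimation-in-time combination step (illustration only)."""
    N = len(g)
    # N/4-point DFTs of the subsequences g[4n], g[4n+1], g[4n+2], g[4n+3]
    G0, G1, G2, G3 = (np.fft.fft(g[k::4]) for k in range(4))
    v = np.arange(N // 4)
    w = np.exp(-2j * np.pi / N)                       # w_N^{-1}
    # partial transforms multiplied by their phase factors
    a, b, c, d = G0, w**v * G1, w**(2 * v) * G2, w**(3 * v) * G3
    ghat = np.empty(N, dtype=complex)
    ghat[0:N//4]      = a + b + c + d                 # ghat_v
    ghat[N//4:N//2]   = a - 1j*b - c + 1j*d           # ghat_{v+N/4}
    ghat[N//2:3*N//4] = a - b + c - d                 # ghat_{v+N/2}
    ghat[3*N//4:N]    = a + 1j*b - c - 1j*d           # ghat_{v+3N/4}
    return ghat

g = np.random.rand(512) + 1j * np.random.rand(512)
print(np.allclose(radix4_dit_step(g), np.fft.fft(g)))  # True
```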
2.5.5 Radix-2 Decimation-in-Frequency FFT‡

The decimation-in-frequency FFT is another example of a Cooley-Tukey algorithm. This time, we break the N-dimensional input vector into its first N/2 and its last N/2 components. This partition splits the output vector into its even and its odd components:
$$
\hat g_{2v} \;=\; \sum_{n=0}^{N/2-1}\left(g_n + g_{n+N/2}\right) w_{N/2}^{-nv},
$$

$$
\hat g_{2v+1} \;=\; \sum_{n=0}^{N/2-1} w_N^{-n}\left(g_n - g_{n+N/2}\right) w_{N/2}^{-nv}.
$$
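A minimal NumPy sketch of this decomposition step (our own code; dif_split is a made-up name, and the unnormalized DFT convention of numpy.fft, matching the equations above, is assumed):

```python
import numpy as np

def dif_split(g):
    """One radix-2 decimation-in-frequency step (illustration only).

    Returns the even- and odd-indexed Fourier coefficients of g, each
    computed from one N/2-point DFT as in the equations above.
    """
    N = len(g)
    n = np.arange(N // 2)
    first, second = g[:N // 2], g[N // 2:]
    ghat_even = np.fft.fft(first + second)                                  # ghat_{2v}
    ghat_odd = np.fft.fft(np.exp(-2j * np.pi * n / N) * (first - second))   # ghat_{2v+1}
    return ghat_even, ghat_odd

g = np.random.rand(8) + 1j * np.random.rand(8)
even, odd = dif_split(g)
print(np.allclose(even, np.fft.fft(g)[0::2]), np.allclose(odd, np.fft.fft(g)[1::2]))  # True True
```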
2.5.6 Multidimensional FFT Algorithms‡

Generally, there are two possible ways to develop fast algorithms for multidimensional discrete Fourier transforms. Firstly, we can decompose the multidimensional DFT into 1-D DFTs and use fast algorithms for them. Secondly, we can generalize the approaches of the 1-D FFT to higher dimensions. In this section, we show examples of both approaches.
Decomposition into 1-D Transforms. A two-dimensional DFT can be broken up into two one-dimensional DFTs because of the separability of the kernel. In the 2-D case, Eq. (2.37), we obtain

$$
\hat g_{u,v} \;=\; \frac{1}{MN}\sum_{m=0}^{M-1}\left[\sum_{n=0}^{N-1} g_{m,n}\,\exp\!\left(-\frac{2\pi\mathrm{i}\,nv}{N}\right)\right]\exp\!\left(-\frac{2\pi\mathrm{i}\,mu}{M}\right).
$$
The inner summation forms M 1-D DFTs of the rows, the outer N 1-D DFTs of the columns, i.e., the 2-D FFT is computed as M row transformations followed by N column transformations:
$$
\text{Row transform:}\qquad \tilde g_{m,v} \;=\; \frac{1}{N}\sum_{n=0}^{N-1} g_{m,n}\,\exp\!\left(-\frac{2\pi\mathrm{i}\,nv}{N}\right),
$$

$$
\text{Column transform:}\qquad \hat g_{u,v} \;=\; \frac{1}{M}\sum_{m=0}^{M-1} \tilde g_{m,v}\,\exp\!\left(-\frac{2\pi\mathrm{i}\,mu}{M}\right).
$$

In an analogous way, a W-dimensional DFT can be composed of W one-dimensional DFTs.
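As an illustration, here is a minimal NumPy sketch of this row-column scheme (our own code; the helper name dft2_rowcol is made up). It applies the 1/N and 1/M normalization of the equations above, so the result is compared against numpy.fft.fft2 divided by MN.

```python
import numpy as np

def dft2_rowcol(g):
    """2-D DFT computed as M row transforms followed by N column transforms."""
    M, N = g.shape
    g_tilde = np.fft.fft(g, axis=1) / N      # row transform: 1-D DFT of each of the M rows
    g_hat = np.fft.fft(g_tilde, axis=0) / M  # column transform: 1-D DFT of each of the N columns
    return g_hat

g = np.random.rand(8, 16)
print(np.allclose(dft2_rowcol(g), np.fft.fft2(g) / g.size))  # True
```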
Multidimensional Decomposition. A decomposition is also directly possible in multidimensional spaces. We will demonstrate such algorithms with the simple case of a 2-D radix-2 decimation-in-time algorithm.
The image matrix is decomposed into four partitions by taking every other pixel in each direction (Fig. 2.25). Denoting the $(M/2)\times(N/2)$-point DFT of the partition that contains the pixels $g_{2m+p,\,2n+q}$ by ${}^{p,q}\hat g$, the transform of the full image is assembled according to

$$
\begin{bmatrix} \hat g_{u,v} \\ \hat g_{u,v+N/2} \\ \hat g_{u+M/2,v} \\ \hat g_{u+M/2,v+N/2} \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} {}^{0,0}\hat g_{u,v} \\ w_N^{-v}\,{}^{0,1}\hat g_{u,v} \\ w_M^{-u}\,{}^{1,0}\hat g_{u,v} \\ w_M^{-u} w_N^{-v}\,{}^{1,1}\hat g_{u,v} \end{bmatrix}.
$$
Figure 2.25: Decomposition of an image matrix into four partitions for the 2-D radix-2 FFT algorithm.
The superscripts in front of $\hat g$ denote the corresponding partial transformation. The 2-D radix-2 algorithm is thus very similar to the 1-D radix-4 algorithm. In the same manner as for the 1-D radix-4 algorithm (Section 2.5.4), we can reduce the number of additions from 12 to 8 by factorizing the matrix:
$$
\begin{bmatrix} \hat g_{u,v} \\ \hat g_{u,v+N/2} \\ \hat g_{u+M/2,v} \\ \hat g_{u+M/2,v+N/2} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix}
\begin{bmatrix} {}^{0,0}\hat g_{u,v} \\ w_N^{-v}\,{}^{0,1}\hat g_{u,v} \\ w_M^{-u}\,{}^{1,0}\hat g_{u,v} \\ w_M^{-u} w_N^{-v}\,{}^{1,1}\hat g_{u,v} \end{bmatrix}.
$$
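A minimal NumPy sketch of this 2-D combination step (our own code; radix2_2d_step is a made-up name, the partitions are taken as every other pixel in each direction, and numpy's unnormalized DFT convention is assumed):

```python
import numpy as np

def radix2_2d_step(g):
    """One 2-D radix-2 decimation-in-time combination step (illustration only)."""
    M, N = g.shape
    # DFTs of the four partitions (even/odd rows x even/odd columns)
    G00 = np.fft.fft2(g[0::2, 0::2])
    G01 = np.fft.fft2(g[0::2, 1::2])
    G10 = np.fft.fft2(g[1::2, 0::2])
    G11 = np.fft.fft2(g[1::2, 1::2])
    u = np.arange(M // 2)[:, None]
    v = np.arange(N // 2)[None, :]
    wu = np.exp(-2j * np.pi * u / M)   # w_M^{-u}
    wv = np.exp(-2j * np.pi * v / N)   # w_N^{-v}
    a, b, c, d = G00, wv * G01, wu * G10, wu * wv * G11
    ghat = np.empty((M, N), dtype=complex)
    ghat[:M//2, :N//2] = a + b + c + d     # ghat_{u,v}
    ghat[:M//2, N//2:] = a - b + c - d     # ghat_{u,v+N/2}
    ghat[M//2:, :N//2] = a + b - c - d     # ghat_{u+M/2,v}
    ghat[M//2:, N//2:] = a - b - c + d     # ghat_{u+M/2,v+N/2}
    return ghat

g = np.random.rand(8, 16)
print(np.allclose(radix2_2d_step(g), np.fft.fft2(g)))  # True
```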
2.5.7 Transformation of Real Images‡

So far, we have only discussed the Fourier transform of complex-valued signals. The same algorithms can also be used for real-valued signals. They are then less efficient, however, because the Fourier transform of a real-valued signal is Hermitian (Section 2.3.5) and thus only half of the Fourier coefficients are independent. This corresponds to the fact that half of the signal, namely the imaginary part, is zero.
One way to exploit this redundancy is to transform two real vectors x and y of length N at once by combining them into a single complex vector z = x + iy; the real and imaginary parts of z then correspond to the Hermitian and anti-Hermitian parts of its transform. Thus the Fourier transforms of the two real vectors are given by

$$
\hat x_v = \frac{1}{2}\left(\hat z_v + \hat z^{*}_{N-v}\right), \qquad
\mathrm{i}\,\hat y_v = \frac{1}{2}\left(\hat z_v - \hat z^{*}_{N-v}\right). \tag{2.90}
$$
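A minimal NumPy sketch of this trick, following Eq. (2.90) (our own code; fft_two_real is a made-up name, and numpy's unnormalized DFT convention is assumed):

```python
import numpy as np

def fft_two_real(x, y):
    """Transform two real vectors with a single complex FFT (illustration only)."""
    z_hat = np.fft.fft(x + 1j * y)
    z_conj_rev = np.conj(np.roll(z_hat[::-1], 1))   # zhat*_{N-v}, indices taken modulo N
    x_hat = 0.5 * (z_hat + z_conj_rev)              # Eq. (2.90), first relation
    y_hat = -0.5j * (z_hat - z_conj_rev)            # Eq. (2.90), second relation, divided by i
    return x_hat, y_hat

x, y = np.random.rand(512), np.random.rand(512)
x_hat, y_hat = fft_two_real(x, y)
print(np.allclose(x_hat, np.fft.fft(x)), np.allclose(y_hat, np.fft.fft(y)))  # True True
```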
2.6 Further Readings‡
The classical textbook on the Fourier transform, and still one of the best, is Bracewell [11]. An excellent source for various transforms is the "Handbook on Transforms" by Poularikas [141]. For the basics of linear algebra, especially unitary transforms, the reader is referred to one of the modern textbooks on linear algebra, e.g., Meyer [125], Anton [4], or Lay [106]. It is still worthwhile to read the historical article of Cooley and Tukey [22] about the discovery of the first fast Fourier transform algorithm. The monograph of Blahut [10] covers a variety of fast algorithms for the Fourier transform.