Chapter 11

The Radix-4 and the Class of Radix-$2^s$ FFTs

©2000 by CRC Press LLC

The divide-and-conquer paradigm introduced in Chapter 3 is not restricted to dividing a problem into two subproblems. In fact, as explained earlier in the book and in Appendix B, the recurrence equation

\[
T(N) = \begin{cases} \alpha\, T(N/\alpha) + bN & \text{if } N = \alpha^k > 1, \\ \gamma & \text{if } N = 1, \end{cases}
\]

represents the arithmetic cost of an algorithm which solves the original problem of size $N$ by combining the results from recursively solving $\alpha$ problems of size $N/\alpha$. In this chapter, the cases for $\alpha = 4$ as well as $\alpha = 2^s$ are considered.

11.1 The Radix-4 DIT FFTs

The DFT of a time series consisting of $N = 4^n$ discrete samples is considered in this section. Since $N = 4^n = 2^{2n}$, any version of the radix-2 FFTs introduced in Sections 3.1 and 3.2 can certainly be used to compute the transform. The reason it is worthwhile to develop a radix-4 implementation instead of simply using the radix-2 FFTs is that the arithmetic cost can be further reduced, and this advantage is carried over to the design of parallel FFTs. In fact, both radix-2 and radix-4 FFTs are special cases in the class of radix-$2^s$ FFTs.

The radix-4 DIT FFT [5, 70, 8] is derived from equation (3.1), which defines the discrete Fourier transform of a complex time series. With the help of the identities $\omega_N^{N/4} = -j$ and $\omega_N^{4} = \omega_{N/4}$, equation (3.1) can be rewritten in terms of four partial sums;
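that is,

\[
\begin{aligned}
X_r &= \sum_{\ell=0}^{N-1} x_\ell\, \omega_N^{r\ell}, \qquad r = 0, 1, \ldots, N-1, \\
&= \sum_{k=0}^{N/4-1} x_{4k}\, \omega_N^{r(4k)} + \sum_{k=0}^{N/4-1} x_{4k+1}\, \omega_N^{r(4k+1)} + \sum_{k=0}^{N/4-1} x_{4k+2}\, \omega_N^{r(4k+2)} + \sum_{k=0}^{N/4-1} x_{4k+3}\, \omega_N^{r(4k+3)} \\
&= \sum_{k=0}^{N/4-1} x_{4k}\, \omega_N^{4rk} + \omega_N^{r} \sum_{k=0}^{N/4-1} x_{4k+1}\, \omega_N^{4rk} + \omega_N^{2r} \sum_{k=0}^{N/4-1} x_{4k+2}\, \omega_N^{4rk} + \omega_N^{3r} \sum_{k=0}^{N/4-1} x_{4k+3}\, \omega_N^{4rk}.
\end{aligned}
\tag{11.1}
\]

By decimating the time series into four sets, namely the set $\{y_k \mid y_k = x_{4k},\ 0 \le k \le N/4 - 1\}$, the set $\{z_k \mid z_k = x_{4k+1},\ 0 \le k \le N/4 - 1\}$, the set $\{g_k \mid g_k = x_{4k+2},\ 0 \le k \le N/4 - 1\}$, and the set $\{h_k \mid h_k = x_{4k+3},\ 0 \le k \le N/4 - 1\}$, the four subproblems with period of $N/4$ can be defined after the appropriate twiddle factor $\omega_{N/4} = \omega_N^{4}$ is identified. The four subproblems are

\[
Y_r = \sum_{k=0}^{N/4-1} x_{4k}\, \omega_N^{4rk} = \sum_{k=0}^{N/4-1} x_{4k}\, \omega_{N/4}^{rk} = \sum_{k=0}^{N/4-1} y_k\, \omega_{N/4}^{rk}, \qquad r = 0, 1, \ldots, N/4 - 1. \tag{11.2}
\]

\[
Z_r = \sum_{k=0}^{N/4-1} x_{4k+1}\, \omega_N^{4rk} = \sum_{k=0}^{N/4-1} x_{4k+1}\, \omega_{N/4}^{rk} = \sum_{k=0}^{N/4-1} z_k\, \omega_{N/4}^{rk}, \qquad r = 0, 1, \ldots, N/4 - 1. \tag{11.3}
\]

\[
G_r = \sum_{k=0}^{N/4-1} x_{4k+2}\, \omega_N^{4rk} = \sum_{k=0}^{N/4-1} x_{4k+2}\, \omega_{N/4}^{rk} = \sum_{k=0}^{N/4-1} g_k\, \omega_{N/4}^{rk}, \qquad r = 0, 1, \ldots, N/4 - 1. \tag{11.4}
\]

\[
H_r = \sum_{k=0}^{N/4-1} x_{4k+3}\, \omega_N^{4rk} = \sum_{k=0}^{N/4-1} x_{4k+3}\, \omega_{N/4}^{rk} = \sum_{k=0}^{N/4-1} h_k\, \omega_{N/4}^{rk}, \qquad r = 0, 1, \ldots, N/4 - 1. \tag{11.5}
\]

The size of each subproblem is thus $N/4$, which is equal to the number of input data points or the number of computed output data points in one period. After these four subproblems are each recursively solved, the solution to the original problem of size $N$ can be obtained according to (11.1) for $r = 0, 1, \ldots, N-1$. Since the series $Y_r$ has a period of $N/4$, $Y_r = Y_{r+N/4} = Y_{r+N/2} = Y_{r+3N/4}$, and the same applies to the series $Z_r$, $G_r$, and $H_r$. The output $X_r$ may be expressed in terms of $Y_r$, $Z_r$, $G_r$, and $H_r$ for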
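$r = 0, 1, \ldots, N/4 - 1$ as shown below.

\[
\begin{aligned}
X_r &= Y_r + \omega_N^{r} Z_r + \omega_N^{2r} G_r + \omega_N^{3r} H_r, && (11.6) \\
X_{r+N/4} &= Y_r + \omega_N^{r+N/4} Z_r + \omega_N^{2(r+N/4)} G_r + \omega_N^{3(r+N/4)} H_r, && (11.7) \\
X_{r+N/2} &= Y_r + \omega_N^{r+N/2} Z_r + \omega_N^{2(r+N/2)} G_r + \omega_N^{3(r+N/2)} H_r, && (11.8) \\
X_{r+3N/4} &= Y_r + \omega_N^{r+3N/4} Z_r + \omega_N^{2(r+3N/4)} G_r + \omega_N^{3(r+3N/4)} H_r. && (11.9)
\end{aligned}
\]

By noting that the twiddle factors $\omega_N^{N/4} = \omega_4 = e^{-j2\pi/4} = -j$, $\omega_N^{N/2} = (-j)^2 = -1$, and $\omega_N^{3N/4} = (-j)^3 = j$, the four equations above can be simplified to

\[
\begin{aligned}
X_r &= Y_r + \omega_N^{r} Z_r + \omega_N^{2r} G_r + \omega_N^{3r} H_r, && (11.10) \\
X_{r+N/4} &= Y_r - j\,\omega_N^{r} Z_r - \omega_N^{2r} G_r + j\,\omega_N^{3r} H_r, && (11.11) \\
X_{r+N/2} &= Y_r - \omega_N^{r} Z_r + \omega_N^{2r} G_r - \omega_N^{3r} H_r, && (11.12) \\
X_{r+3N/4} &= Y_r + j\,\omega_N^{r} Z_r - \omega_N^{2r} G_r - j\,\omega_N^{3r} H_r, && (11.13)
\end{aligned}
\]

where $r = 0, 1, \ldots, N/4 - 1$.

If the radix-4 algorithm is implemented based on equations (11.10), (11.11), (11.12), and (11.13), one step of the radix-4 algorithm will require more arithmetic operations than two steps of the radix-2 algorithm, because some partial results are computed more than once. However, if such partial results can be identified and computed only once, one step of the radix-4 algorithm can require fewer arithmetic operations than two steps of the radix-2 algorithm, and the total cost of the radix-4 algorithm can be lower than that of the radix-2 algorithm. The four recurrent partial results are shown below inside each pair of parentheses.

\[
\begin{aligned}
X_r &= \left(Y_r + \omega_N^{2r} G_r\right) + \left(\omega_N^{r} Z_r + \omega_N^{3r} H_r\right), && (11.14) \\
X_{r+N/4} &= \left(Y_r - \omega_N^{2r} G_r\right) - j\left(\omega_N^{r} Z_r - \omega_N^{3r} H_r\right), && (11.15) \\
X_{r+N/2} &= \left(Y_r + \omega_N^{2r} G_r\right) - \left(\omega_N^{r} Z_r + \omega_N^{3r} H_r\right), && (11.16) \\
X_{r+3N/4} &= \left(Y_r - \omega_N^{2r} G_r\right) + j\left(\omega_N^{r} Z_r - \omega_N^{3r} H_r\right), && (11.17)
\end{aligned}
\]

where $r = 0, 1, \ldots, N/4 - 1$. The computation represented by (11.14), (11.15), (11.16), and (11.17) can now be represented by the two stages of butterfly computation in Figure 11.1.

Analyzing the arithmetic cost

To determine the arithmetic cost of the radix-4 FFT algorithm, observe that $\omega_N^{2r} G_r$, $\omega_N^{r} Z_r$, and $\omega_N^{3r} H_r$ need to be computed before the four partial sums can be obtained. Since the size of each subproblem is $N/4$, $3N/4$ complex multiplications and $N$ complex additions are performed during the first stage of butterfly computation. The second stage of butterfly computation involves no multiplication by the twiddle factors, so only $N$ complex additions are needed. Thus, $3N/4$ complex multiplications and $2N$ complex additions are required to implement the butterfly computation in Figure 11.1.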
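The two-stage butterfly of (11.14) to (11.17) can be illustrated with a short recursive sketch. This is not the book's implementation: the function names `naive_dft` and `radix4_dit_fft` are hypothetical, the twiddle factors are recomputed rather than tabulated, and the convention $\omega_N = e^{-j2\pi/N}$ is assumed. The point is only to show that the four shared partial results are each computed once per $r$.

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT with omega_N = exp(-2j*pi/N); used only as a reference."""
    n = len(x)
    return [
        sum(x[l] * cmath.exp(-2j * cmath.pi * r * l / n) for l in range(n))
        for r in range(n)
    ]


def radix4_dit_fft(x):
    """Recursive radix-4 DIT FFT sketch; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    # Decimate in time: y_k = x_{4k}, z_k = x_{4k+1}, g_k = x_{4k+2}, h_k = x_{4k+3}.
    Y = radix4_dit_fft(x[0::4])
    Z = radix4_dit_fft(x[1::4])
    G = radix4_dit_fft(x[2::4])
    H = radix4_dit_fft(x[3::4])
    quarter = n // 4
    X = [0j] * n
    for r in range(quarter):
        w1 = cmath.exp(-2j * cmath.pi * r / n)  # omega_N^r
        wz = w1 * Z[r]        # omega_N^r   * Z_r
        wg = w1 * w1 * G[r]   # omega_N^2r  * G_r
        wh = w1 ** 3 * H[r]   # omega_N^3r  * H_r
        # First stage: the four partial results shared by (11.14)-(11.17).
        a, b = Y[r] + wg, Y[r] - wg
        c, d = wz + wh, wz - wh
        # Second stage: additions plus trivial multiplications by -j or +j.
        X[r] = a + c
        X[r + quarter] = b - 1j * d
        X[r + 2 * quarter] = a - c
        X[r + 3 * quarter] = b + 1j * d
    return X
```

Comparing `radix4_dit_fft(x)` against `naive_dft(x)` for a length-16 or length-64 input is a quick way to check the signs in the second stage.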
Figure 11.1 The radix-4 DIT FFT butterflies.

Recall again that the arithmetic cost of computer algorithms is measured by the number of real arithmetic operations. As counted earlier in the book, one complex addition incurs two real additions, and one complex multiplication (with precomputed intermediate results involving the real and imaginary parts of a twiddle factor) incurs three real multiplications and three real additions. Accordingly, $9N/4$ real multiplications and $25N/4$ real additions are required per step of the radix-4 DIT FFT. Thus, one step of the radix-4 DIT FFT algorithm requires $17N/2$ flops in total.

Since the objective of developing the radix-4 algorithm is to minimize the essential real operations, a careful analysis of the cost should exclude the trivial multiplications by $\omega_N^{0} = 1$ and by powers of $\omega_4$ (which equal $\pm 1$ or $\pm j$), since they will certainly not be done in an efficient implementation. Furthermore, note that the cost of multiplication by a twiddle factor which is an odd power of $\omega_8 = (1 - j)/\sqrt{2}$ is less than the cost of a complex multiplication, because every odd power of $\omega_8$ has the form $(\pm 1 \pm j)/\sqrt{2}$. For example, $(a + jb)\,\omega_8 = \left((a + b) + j(b - a)\right)/\sqrt{2}$, which costs two real additions and two real multiplications. These special factors are identified from the computation of $\omega_N^{2r} G_r$, $\omega_N^{r} Z_r$, and $\omega_N^{3r} H_r$ for $r = 0, 1, \ldots, N/4 - 1$ below.

Table 11.1 Special cases of twiddle-factor multiplication in the radix-4 algorithm. Blank entries are ordinary twiddle-factor multiplications.

\[
\begin{array}{l|cccc}
 & r = 0 & r = N/16 & r = N/8 & r = 3N/16 \\ \hline
\omega_N^{2r} G_r & 1 \times G_r = G_r & \omega_8 G_r & \omega_4 G_r = -jG_r & \omega_8^{3} G_r \\
\omega_N^{r} Z_r & 1 \times Z_r = Z_r & & \omega_8 Z_r & \\
\omega_N^{3r} H_r & 1 \times H_r = H_r & & \omega_8^{3} H_r &
\end{array}
\]

Thus, there are eight special cases: the four cases involving multiplication by $1$ and $-j$ are trivial, and the other four cases involving multiplication by an odd power of $\omega_8$ are to be treated specially. The total number of nontrivial complex multiplications is thus reduced to $3N/4 - 8$. Since only 4 flops are needed to compute each of $\omega_8 G_r$, $\omega_8^{3} G_r$,
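$\omega_8 Z_r$, and $\omega_8^{3} H_r$, the total flop count becomes

\[
6\left(\frac{3N}{4} - 8\right) + 4 \times 4 + 2 \times 2N = \frac{17}{2}N - 32.
\]

Because these special factors also occur in every subsequent step, the savings can be incorporated in setting up the recurrence equation. For completeness, assuming that the problem size is $N = 4^n$, these special factors are identified for $r = 0, 1, \ldots, N/4^{i+1} - 1$ in the $i$th step for $i = 0, 1, \ldots, n-2$ in Table 11.2.

Table 11.2 The recurrent special twiddle-factor multiplications in the radix-4 algorithm. The table body did not survive extraction; the layout below is reconstructed from Table 11.1, writing $M = N/4^{i}$ for the subproblem size in the $i$th step, with $0 \le r \le M/4 - 1$.

\[
\begin{array}{l|cccc}
 & r = 0 & r = M/16 & r = M/8 & r = 3M/16 \\ \hline
\omega_M^{2r} G_r & 1 \times G_r = G_r & \omega_8 G_r & \omega_4 G_r = -jG_r & \omega_8^{3} G_r \\
\omega_M^{r} Z_r & 1 \times Z_r = Z_r & & \omega_8 Z_r & \\
\omega_M^{3r} H_r & 1 \times H_r = H_r & & \omega_8^{3} H_r &
\end{array}
\]

To set up the recurrence equation, the boundary condition for $N = 4$ is needed. Recall that when $N = 4$, the twiddle factors are the four fourth roots of unity, namely $1$, $-1$, $j$, and $-j$, so the first stage of butterfly computation involves no nontrivial complex multiplications. Therefore, when $N = 4$, only 8 complex additions or 16 real arithmetic operations are required. The cost of the radix-4 FFT algorithm can now be represented by the following recurrence:

\[
T(N) = \begin{cases} 4\,T\!\left(\dfrac{N}{4}\right) + \dfrac{17}{2}N - 32 & \text{if } N = 4^n > 4, \\[2mm] 16 & \text{if } N = 4. \end{cases} \tag{11.20}
\]

Solving (11.20) (see Appendix B), one obtains

\[
T(N) = 4\tfrac{1}{4}\, N \log_2 N - \frac{43}{6} N + \frac{32}{3}. \tag{11.21}
\]

As a check, $N = 16$ gives $T(16) = 4 \times 16 + 136 - 32 = 168$ from (11.20), and $272 - 114\tfrac{2}{3} + 10\tfrac{2}{3} = 168$ from (11.21). The derivation above confirms similar results given in [70, 1981] and [6, 1996]. Therefore, compared to the arithmetic cost of $T(N) = 5N \log_2 N$ of the radix-2 algorithm in (3.10), the saving by the radix-4 algorithm is 15 percent. It will be shown in Chapter 12 that the split-radix algorithm can further reduce the arithmetic cost to $T(N) = 4N \log_2 N + \Theta(N)$, which represents a saving of 20 percent compared to the radix-2 algorithm, 5 percentage points more than the radix-4 algorithm achieves.

11.2 The Radix-4 DIF FFTs

A radix-4 DIF FFT algorithm can be derived from recursively decimating the frequency series into four subsets, i.e., the set denoted by $Y_k = X_{4k}$ for $0 \le k \le N/4 - 1$, the set denoted by $Z_k = X_{4k+1}$ for $0 \le k \le N/4 - 1$, the set denoted by $G_k = X_{4k+2}$ for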
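$0 \le k \le N/4 - 1$, and the set denoted by $H_k = X_{4k+3}$ for $0 \le k \le N/4 - 1$ as shown below. The derivation again begins with the DFT definition from (3.1).

\[
\begin{aligned}
X_r &= \sum_{\ell=0}^{N-1} x_\ell\, \omega_N^{r\ell}, \qquad r = 0, 1, \ldots, N-1, \\
&= \sum_{\ell=0}^{N/4-1} x_\ell\, \omega_N^{r\ell} + \sum_{\ell=N/4}^{N/2-1} x_\ell\, \omega_N^{r\ell} + \sum_{\ell=N/2}^{3N/4-1} x_\ell\, \omega_N^{r\ell} + \sum_{\ell=3N/4}^{N-1} x_\ell\, \omega_N^{r\ell} \\
&= \sum_{\ell=0}^{N/4-1} \left( x_\ell + x_{\ell+N/4}\, \omega_N^{rN/4} + x_{\ell+N/2}\, \omega_N^{rN/2} + x_{\ell+3N/4}\, \omega_N^{3rN/4} \right) \omega_N^{r\ell}.
\end{aligned}
\tag{11.22}
\]

The four subproblems can thus be constructed by substituting $r = 4k$, $r = 4k+1$, $r = 4k+2$, and $r = 4k+3$ into the equation above, using $\omega_N^{N/4} = -j$, $\omega_N^{N/2} = -1$, $\omega_N^{3N/4} = j$, and $\omega_N^{4k\ell} = \omega_{N/4}^{k\ell}$.

\[
\begin{aligned}
Y_k = X_{4k} &= \sum_{\ell=0}^{N/4-1} \left( x_\ell + x_{\ell+N/4} + x_{\ell+N/2} + x_{\ell+3N/4} \right) \omega_{N/4}^{k\ell} \\
&= \sum_{\ell=0}^{N/4-1} y_\ell\, \omega_{N/4}^{k\ell}, \qquad k = 0, 1, \ldots, N/4 - 1.
\end{aligned}
\tag{11.23}
\]

\[
\begin{aligned}
Z_k = X_{4k+1} &= \sum_{\ell=0}^{N/4-1} \left( x_\ell - j\,x_{\ell+N/4} - x_{\ell+N/2} + j\,x_{\ell+3N/4} \right) \omega_N^{\ell}\, \omega_{N/4}^{k\ell} \\
&= \sum_{\ell=0}^{N/4-1} z_\ell\, \omega_{N/4}^{k\ell}, \qquad k = 0, 1, \ldots, N/4 - 1.
\end{aligned}
\tag{11.24}
\]

\[
\begin{aligned}
G_k = X_{4k+2} &= \sum_{\ell=0}^{N/4-1} \left( x_\ell - x_{\ell+N/4} + x_{\ell+N/2} - x_{\ell+3N/4} \right) \omega_N^{2\ell}\, \omega_{N/4}^{k\ell} \\
&= \sum_{\ell=0}^{N/4-1} g_\ell\, \omega_{N/4}^{k\ell}, \qquad k = 0, 1, \ldots, N/4 - 1.
\end{aligned}
\tag{11.25}
\]

\[
\begin{aligned}
H_k = X_{4k+3} &= \sum_{\ell=0}^{N/4-1} \left( x_\ell + j\,x_{\ell+N/4} - x_{\ell+N/2} - j\,x_{\ell+3N/4} \right) \omega_N^{3\ell}\, \omega_{N/4}^{k\ell} \\
&= \sum_{\ell=0}^{N/4-1} h_\ell\, \omega_{N/4}^{k\ell}, \qquad k = 0, 1, \ldots, N/4 - 1.
\end{aligned}
\tag{11.26}
\]

To form these four subproblems using two stages of butterfly computation, the partial sums identified above are first rearranged to facilitate the butterfly computation as shown below.

\[
\begin{aligned}
y_\ell &= \left(x_\ell + x_{\ell+N/2}\right) + \left(x_{\ell+N/4} + x_{\ell+3N/4}\right), && 0 \le \ell \le N/4 - 1. && (11.27) \\
z_\ell &= \left[\left(x_\ell - x_{\ell+N/2}\right) - j\left(x_{\ell+N/4} - x_{\ell+3N/4}\right)\right] \omega_N^{\ell}, && 0 \le \ell \le N/4 - 1. && (11.28) \\
g_\ell &= \left[\left(x_\ell + x_{\ell+N/2}\right) - \left(x_{\ell+N/4} + x_{\ell+3N/4}\right)\right] \omega_N^{2\ell}, && 0 \le \ell \le N/4 - 1. && (11.29) \\
h_\ell &= \left[\left(x_\ell - x_{\ell+N/2}\right) + j\left(x_{\ell+N/4} - x_{\ell+3N/4}\right)\right] \omega_N^{3\ell}, && 0 \le \ell \le N/4 - 1. && (11.30)
\end{aligned}
\]

The computation represented by (11.27), (11.28), (11.29), and (11.30) can now be represented by the two stages of butterfly computation in Figure 11.2.

Figure 11.2 The radix-4 DIF FFT butterflies.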
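Equations (11.27) to (11.30) translate almost line for line into code. The sketch below is an illustration under stated assumptions, not the book's program: the names `naive_dft` and `radix4_dif_fft` are hypothetical, $\omega_N = e^{-j2\pi/N}$ is assumed, and the outputs are written back in natural order through strided assignment instead of being left in the digit-reversed order that an in-place DIF implementation would produce.

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT with omega_N = exp(-2j*pi/N); used only as a reference."""
    n = len(x)
    return [
        sum(x[l] * cmath.exp(-2j * cmath.pi * r * l / n) for l in range(n))
        for r in range(n)
    ]


def radix4_dif_fft(x):
    """Recursive radix-4 DIF FFT sketch; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    quarter = n // 4
    y, z, g, h = [], [], [], []
    for l in range(quarter):
        w1 = cmath.exp(-2j * cmath.pi * l / n)  # omega_N^l
        # First stage: sums and differences of samples N/2 apart.
        a = x[l] + x[l + 2 * quarter]
        b = x[l] - x[l + 2 * quarter]
        c = x[l + quarter] + x[l + 3 * quarter]
        d = x[l + quarter] - x[l + 3 * quarter]
        # Second stage plus twiddle factors, as in (11.27)-(11.30).
        y.append(a + c)
        z.append((b - 1j * d) * w1)
        g.append((a - c) * w1 * w1)
        h.append((b + 1j * d) * w1 ** 3)
    # Decimation in frequency: Y_k = X_{4k}, Z_k = X_{4k+1}, and so on.
    X = [0j] * n
    X[0::4] = radix4_dif_fft(y)
    X[1::4] = radix4_dif_fft(z)
    X[2::4] = radix4_dif_fft(g)
    X[3::4] = radix4_dif_fft(h)
    return X
```

Note that the twiddle factors are applied after the butterflies here, whereas the DIT form applies them before.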
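11.3 The Class of Radix-$2^s$ DIT and DIF FFTs

The techniques used to develop the radix-2 and the radix-4 FFT algorithms can be generalized to develop the entire class of radix-$2^s$ FFTs. Setting $q = 2^s$, a radix-$q$ DIT FFT algorithm may be developed from decomposing (3.1) into $q$ partial sums:

\[
\begin{aligned}
X_r &= \sum_{\ell=0}^{N-1} x_\ell\, \omega_N^{r\ell}, \qquad r = 0, 1, \ldots, N-1, \\
&= \sum_{u=0}^{q-1} \sum_{k=0}^{N/q-1} x_{kq+u}\, \omega_N^{r(kq+u)} \\
&= \sum_{u=0}^{q-1} \omega_N^{ur} \sum_{k=0}^{N/q-1} x_{kq+u}\, \omega_N^{rkq} \\
&= \sum_{u=0}^{q-1} \omega_N^{ur} \sum_{k=0}^{N/q-1} x_{kq+u}\, \omega_{N/q}^{rk}.
\end{aligned}
\tag{11.31}
\]

Observe that (3.3) and (11.1) are special cases of the equation above when $q = 2$ and $q = 4$. According to (11.31), the time series can be decimated into $q = 2^s$ sets so that each of the $q$ partial sums represented by $\sum_{k=0}^{N/q-1} x_{kq+u}\, \omega_{N/q}^{rk}$ for $u = 0, 1, \ldots, q-1$ can be recursively computed independently of the others. Each partial sum represents the DFT of a subproblem of size $N/q$. The output frequencies are computed as $q$ separate segments, and each segment denoted by $X_{\ell+\lambda N/q}$ has $N/q$ consecutive elements indexed by $\ell$, where $0 \le \ell \le N/q - 1$ and $0 \le \lambda \le q - 1$. By substituting $r = \ell + \lambda N/q$ in (11.31), one obtains the equation for computing the output frequencies in each of the $q$ segments. The equation for the $\lambda$th segment is shown below, where $0 \le \lambda \le q - 1$.

\[
\begin{aligned}
X_{\ell+\lambda N/q} &= \sum_{u=0}^{q-1} \omega_N^{u(\ell+\lambda N/q)} \sum_{k=0}^{N/q-1} x_{kq+u}\, \omega_{N/q}^{k(\ell+\lambda N/q)} \\
&= \sum_{u=0}^{q-1} \omega_N^{u\ell}\, \omega_q^{u\lambda} \sum_{k=0}^{N/q-1} x_{kq+u}\, \omega_{N/q}^{k\ell}, \qquad \ell = 0, 1, \ldots, N/q - 1.
\end{aligned}
\tag{11.32}
\]

Of course, to minimize the arithmetic cost, the computation of the frequency segments should be reorganized to avoid redundant computation, as demonstrated earlier in the derivation of the radix-4 DIT FFT algorithm. Observe also that by substituting $q = 2$, $\lambda = 0, 1$ in (11.32), one obtains (3.7) and (3.8) for computing the two frequency segments in the radix-2 DIT FFT algorithm; on the other hand, by substituting $q = 4$, $\lambda = 0, 1, 2, 3$ in (11.32), one obtains the four equations (11.6), (11.7), (11.8), and (11.9) for computing the four frequency segments in the radix-4 DIT FFT algorithm.

To develop a radix-$q$ DIF FFT algorithm, one would simply decimate the frequency series $X_r$ into $q$ sets with each set containing

\[
\left\{ Y_k^{(u)} \;\middle|\; Y_k^{(u)} = X_{kq+u},\ 0 \le k \le N/q - 1 \right\}
\]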
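for $u = 0, 1, \ldots, q-1$. Each of the $q$ subproblems is thus of size $N/q$, and is defined by substituting $r = kq + u$ in the equation derived below, as shown for $q = 2$ and $q = 4$ in developing the radix-2 and radix-4 DIF FFT algorithms in Sections 3.2 and 11.2. For completeness, a brief derivation, which illuminates the generalization from the radix-2 and radix-4 algorithms, is provided below.

\[
\begin{aligned}
X_r &= \sum_{\ell=0}^{N-1} x_\ell\, \omega_N^{r\ell}, \qquad r = 0, 1, \ldots, N-1, \\
&= \sum_{\lambda=0}^{q-1} \sum_{\ell=0}^{N/q-1} x_{\ell+\lambda N/q}\, \omega_N^{r(\ell+\lambda N/q)} \\
&= \sum_{\ell=0}^{N/q-1} \left\{ \sum_{\lambda=0}^{q-1} x_{\ell+\lambda N/q}\, \omega_N^{r\lambda N/q} \right\} \omega_N^{r\ell} \\
&= \sum_{\ell=0}^{N/q-1} \left\{ \sum_{\lambda=0}^{q-1} x_{\ell+\lambda N/q}\, \omega_q^{\lambda r} \right\} \omega_N^{r\ell}.
\end{aligned}
\tag{11.33}
\]

The $q$ subproblems can thus be constructed by substituting $r = kq + u$ in (11.33) for $u = 0, 1, \ldots, q-1$.

\[
\begin{aligned}
Y_k^{(u)} = X_{kq+u} &= \sum_{\ell=0}^{N/q-1} \left\{ \sum_{\lambda=0}^{q-1} x_{\ell+\lambda N/q}\, \omega_q^{\lambda(kq+u)} \right\} \omega_N^{(kq+u)\ell} \\
&= \sum_{\ell=0}^{N/q-1} \left\{ \left( \sum_{\lambda=0}^{q-1} x_{\ell+\lambda N/q}\, \omega_q^{\lambda(kq+u)} \right) \omega_N^{u\ell} \right\} \omega_{N/q}^{k\ell} \\
&= \sum_{\ell=0}^{N/q-1} y_\ell^{(u)}\, \omega_{N/q}^{k\ell}, \qquad k = 0, 1, \ldots, N/q - 1.
\end{aligned}
\tag{11.34}
\]

Note that the $N/q$ input data points to each subproblem are labeled by $y_\ell^{(u)}$ for $u = 0, 1, \ldots, q-1$. To show that the radix-2 and radix-4 DIF FFTs are special cases when $q = 2$ and $q = 4$, the generalized formula for forming each of the $q = 2^s$ subproblems is explicitly identified from (11.34) and displayed once again below. Observe that when $q = 2$, $y_\ell$ in (3.13) and $z_\ell$ in (3.15) correspond to $y_\ell^{(0)}$ and $y_\ell^{(1)}$ in the generalized formula; when $q = 4$, $y_\ell$, $z_\ell$, $g_\ell$, and $h_\ell$ in equations (11.23) to (11.26) correspond to $y_\ell^{(0)}$, $y_\ell^{(1)}$, $y_\ell^{(2)}$, and $y_\ell^{(3)}$ in this generalized formula.

\[
y_\ell^{(u)} = \left( \sum_{\lambda=0}^{q-1} x_{\ell+\lambda N/q}\, \omega_q^{\lambda(kq+u)} \right) \omega_N^{u\ell} = \left( \sum_{\lambda=0}^{q-1} x_{\ell+\lambda N/q}\, \omega_q^{\lambda u} \right) \omega_N^{u\ell}, \qquad \ell = 0, 1, \ldots, N/q - 1. \tag{11.35}
\]

The second form follows because $\omega_q^{\lambda k q} = 1$.

Since it is known that a radix-$q$ FFT for $s > 2$ is less efficient than the probably optimal split-radix algorithm, which recursively applies both radix-2 and radix-4 algorithms to solve each subproblem [86], further details on higher radix algorithms are omitted here, and readers are referred to [5, 8] for details about the popular radix-8 and radix-16 FFT algorithms.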
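The generalized formula (11.35) can be checked numerically with a small sketch. The function names `naive_dft` and `radix_q_dif_fft` are hypothetical, and the inner sum over $\lambda$ is evaluated directly as a $q$-point DFT without removing the redundant work that a tuned radix-8 or radix-16 butterfly would share, so this illustrates the decomposition only, not its best arithmetic cost. The convention $\omega_N = e^{-j2\pi/N}$ is assumed.

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT with omega_N = exp(-2j*pi/N); used only as a reference."""
    n = len(x)
    return [
        sum(x[l] * cmath.exp(-2j * cmath.pi * r * l / n) for l in range(n))
        for r in range(n)
    ]


def radix_q_dif_fft(x, q):
    """Recursive radix-q DIF FFT sketch; len(x) must be a power of q."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % q != 0:
        raise ValueError("length must be a power of q")
    m = n // q  # subproblem size N/q
    X = [0j] * n
    for u in range(q):
        y_u = []
        for l in range(m):
            # Inner sum of (11.35): sum over lambda of x_{l + lambda*N/q} * omega_q^{lambda*u}.
            inner = sum(
                x[l + lam * m] * cmath.exp(-2j * cmath.pi * lam * u / q)
                for lam in range(q)
            )
            # Twiddle factor omega_N^{u*l}.
            y_u.append(inner * cmath.exp(-2j * cmath.pi * u * l / n))
        # The DFT of y^{(u)} yields the outputs X_{kq+u}.
        X[u::q] = radix_q_dif_fft(y_u, q)
    return X
```

Calling the function with `q=2` or `q=4` reproduces the radix-2 and radix-4 DIF decompositions, and `q=8` on a length-64 input exercises the radix-8 case.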
