Fast Fourier Transform: Theory and Algorithms Lecture Vladimir Stojanović 6.973 Communication System Design Spring 006 Massachusetts Institute of Technology
Discrete Fourier Transform A review Definition {X k } is periodic Since {X k } is sampled, {x n } must also be periodic From a physical point of view, both are repeated with period N Requires O(N ) operations 6.973 Communication System Design
Fast Fourier Transform History Twiddle factor FFTs (non-coprime sub-lengths) 105 Gauss Predates even Fourier s work on transforms! 1903 Runge 1965 Cooley-Tukey 19 Duhamel-Vetterli (split-radix FFT) FFTs w/o twiddle factors (coprime sub-lengths) 1960 Good s mapping application of Chinese Remainder Theorem ~100 A.D. 1976 Rader prime length FFT 1976 Winograd s Fourier Transform (WFTA) 1977 Kolba and Parks (Prime Factor Algorithm PFA) 6.973 Communication System Design 3
Divide and conquer Divide and conquer always has less computations Suppose all I l sets have same number of elements N 1 so, N=N 1 *N, r=n Each inner-most sum takes N 1 multiplications The outer sum will need N multiplications per output point N *N for the whole sum (for all output points) Hence, total number of multiplications 6.973 Communication System Design
Generalizations The inner-most sum has to represent a Only possible if the subset (possibly permuted) Has the same periodicity as the initial sequence All main classes of FFTs can be cast in the above form Both sums have same periodicity (Good s mapping) No permutations (i.e. twiddle factors) All the subsets have same number of elements m=n/r (m,r)=1 i.e. m and r are coprime If not, then inner sum is one stap of radix-r FFT If r=3, subsets with N/, N/ and N/ elements Split-radix algorithm 6.973 Communication System Design 5
The cost of mapping The goal for divide and conquer Different types balance mapping with subproblem cost E.g. in radix- subproblems are trivial (only sum and differences) Mapping requires twiddle factors (large number of multiplies) E.g. in prime-factor algorithm Subproblems are s with coprime lengths (costly) Mapping trivial (no arithmetic operations) 6.973 Communication System Design 6
FFTs with twiddle factors Reintroduced by Cooley-Tukey 65 Start from general divide and conquer Keep periodicity compatible with periodicity of the input sequence Use decimation almost N 1 s of size N 6.973 Communication System Design 7
Cooley-Tukey FFT contd. can be taken mod N Y = Y n, k n, k 1 1 1. N 1 s of length N. N multplications with twiddle factors 3. N s of length N 1 6.973 Communication System Design
Step 1: Evaluate N 1 s of length N Step : N multiplications with twiddle factors Step 3: Evaluate N s of length N 1 Vector x i mapped to matrix x n1,n (N 1 xn ) Compute N 1 s of length N on each row Point-to-point multiply with twiddle factors Compute N s of length N1 on the columns 6.973 Communication System Design 9
-D view of Cooley-Tukey mapping N=15 (N 1 =3, N =5) x 0 x 1... x 13 x 1 x 0 x 1 x 3 x 6 x 9 x 1 x x x 5 x 7 x 10 x 13 x x 11 x 1 x 0 x 1 x 3 x x 6 x 7 x x 1 x 9 x 13 x 10 x 1 x 11 x 0 x 1 x 3 x x 1 x 9 x 6-5 -5 W jm -3-3 -3-3 -3 x 0 x 13 x x 9 x 1 x x 5 x x5-5 x 5 x 11 x 1 x 10 Figure by MIT OpenCourseWare. Cannot exchange the order of s Because of twiddle multiply Different mapping for N1=5, N=3 Not just transpose x 0 x 5 x 10 x 1 x 6 x 11 x x 7 x 1 x 3 x x 13 x x x 9 1 Figure by MIT OpenCourseWare. 6.973 Communication System Design 10
Radix and radix algorithms Lengths as powers of or are most popular Assume N= n N 1 =, N = n-1 (divides input sequence into even and odd samples decimation in time DIT) Butterfly (sum or difference followed or preceeded by a twiddle factor multiply) X m and X N/+m outputs of N/ -pt s on outputs of, N/-pt s weighted with twiddle factors 6.973 Communication System Design 11
DIT radix- implementations Several different ways Reorder the input data Input samples for inner s in subsequent locations Results in bit-reversed input, in-order output DIT Selectively compute s on evens and odds Perform in-place computation Output in bit-reversed order (X3 in position six (011->110)) X 0 X 1 X X X 5 X 6 x i+1 w X 7 X 7 Division into even and odd numbered sequences N = { x i } N = { } of N / w1 w3 Multiplication by twiddle factors of X 0 X X 1 X 5 X X 6 Figure by MIT OpenCourseWare. Which type is this implementation? 6.973 Communication System Design 1
Decimation in frequency (DIF) radix- implementation If reverse the role of N 1 and N, get DIF N 1 =N/, N = N / -1 n X = 1 k k1 W 1 N / n 1 =0 (x n 1 +x N / +n1 ), N / -1 n X k1 +1 = 1 k W 1 N / n 1 =0 Wn 1(x N n1 -x N / +n1 ). X 0 X 0 X 1 X X X 5 X 6 w1 w N = X X X 6 X 1 X 5 X 7 X 7 w3 { x k } N = { x k+1 } of Multiplication by twiddle factors of N / 6.973 Communication System Design 13 Figure by MIT OpenCourseWare.
Duality DIT<->DIF X 0 X 1 N = X 0 X X 0 X 0 X 1 N = X X { x i } X 1 X 5 X w1 { x k } X X 6 X X 5 N = w1 X X 6 X X 5 w N = X 1 X 6 X 7 { x i+1 } w w3 X 7 X 6 X 7 X 7 w3 { x k+1 } X 5 Division of into even and N / odd numbered sequences Multiplication by twiddle factors of of Multiplication by twiddle factors of N / Which one is DIT (DIF)? How can we get one from another? 6.973 Communication System Design Figure by MIT OpenCourseW. are. 1
Complexity of radix- FFTs of length N replaced by two length-n/ At the cost of N complex multiplications (twiddle) And N complex additions (pt s) Iterate the scheme log N-1 times Obtain trivial transforms (length ) of the length-n/ s Twiddle multiplies (W Ni ) Complex multiply 3 real mult + 3 real add If i is multiple of N/, no arithmetic operation required (why?) butterflies (one general, 3 special cases) 6.973 Communication System Design 15
N= n, N 1 =, N =N/ s of length N/ 3N/ twiddle multiplies N/ s of length Cost of length- No multiplication Only 16 real additions Radix- Reduces the number of stages to log N x 0 x x 15 x 0 X 0 X 1 x X 1 X x 13 X x 1 x 15 1 X 15 X 0 X 1 X 1 X 15 Figure by MIT OpenCourseWare. Radix- can reduce number of operations even more 6.973 Communication System Design 16
Mixed-radix and Split-radix Mixed-radix Diferent radices in different stages Split-radix Different radices in the same stage Simultaneously on different parts of the transform Can achieve lowest number of adds and multiplies for length n inputs x 0 x x 15 x 0 X 0 x x x 1 / x 0 X 0 X 1 x X 1 X x 13 X x 1 x 15 1 X 15 X 0 X 1 X 1 X 15 X 1 X 1 X 13 X 15 R R S-R Figure by MIT OpenCourseWare. 6.973 Communication System Design 17
Split-radix (DIF SRFFT) Look at DIF radix- Xk1 don t have twiddles N / -1 n X k1 = 1 k W 1 N / n 1 =0 (x n +x 1 N / +n1 ), N / -1 n X k1 +1 = 1 k W 1 N / n 1 =0 Wn 1(x N n1 -x N / +n1 ). X 0 X 0 X 1 X X X 5 X 6 w1 w N = X X X 6 X 1 X 5 X 7 X 7 of Even samples X k in DIF should be computed separately from other samples With same algorithm (recursively) as the original sequence No general rule for odd samples w3 Multiplication by twiddle factors Radix- is more efficient than radix- Higher radices are inefficient { x k } N = { x k+1 } of N / X 1 N / -1 n X k1 = W 1 k 1 N / n 1 =0 (x n +x 1 N / +n1 ), N / -1 n X k1 +1 = W 1 k 1 N / n 1 =0 Wn 1 N X[(x n1 -x N / +n1 ) +j(x n1 +N / -x n 1 +3N / )], N / -1 n X k1 +3 = W 1 k 1 n 1 =0 N / W3n 1 N Figure by MIT OpenCourseWare. X[(x n1 +x N / +n1 ) -j(x n1 +N / -x n 1 +3N / )]. X 0 X 0 X X 1 X X 1 / X 13 X 15 Figure by MIT OpenCourseWare.
Split-radix (DIT SRFFT) Dual to DIF SRFFT Considers separately subsets {x i }, {x i+1 } and {x i+3 } Redundancy in X k, X k+n/, X k+n/, X k+3n/ computation 6.973 Communication System Design 19
FFTs without twiddle factors Divide and conquer requirements N-long computed from s with lengths that are factors of N (allows the inner sum to be a ) Provided that subsets I l guarantee periodic x i When N factors into co-prime factors N=N 1 *N Starting from any x i form subset with compatible periodicity (the periodicity of the subset divides the periodicity of the set) {x + n n = 1,..., N 1} or { x n = 1,..., N 1} i N n 1 1 i N 1 + 1 Both subsets have only one common point x i Allows a rearrangement of the input (periodic) vector into a matrix with a periodicity in both dimensions (rows and columns), both periodicities being comatible with the initial one 6.973 Communication System Design 0
Good s mapping FFTs without twiddle factors all based on the same mapping Turns original transform into a set of small s with coprime lengths {x i N + n n = 1,..., N 1} or {x n = 1,..., 1 i+ N n 1 1 N 1 1} 0 1 3 5 6 7 9 10 11 1 13 1 equivalent to Good's mapping 0 5 10 3 13 6 11 1 9 1 1 7 This mapping is one-to-one if N1 and N are coprime All congruences modulo N1 obtained For a given congruence modulo N and vice versa Figure by MIT OpenCourseW. are. 6.973 Communication System Design 1
Just another arrangement of CRT Chinese Remainder Theorem (CRT) If we know the residue of some number k modulo two coprime numbers N 1 and N k N k 1 N It is possible to reconstruct Let k N = k 1 k N = k 1 k = Ntk + Then NN 11 N 1 N 1 k NN 1 t k 0 1 3 5 6 7 9 10 11 1 13 1 t 1 N 1 = 1 and t N N = 1 N 1 t 1 multiplicative inverse of N 1 mod N t multiplicative inverse of N mod N 1 t 1, t always exist since N 1, N coprime (N 1,N )=1 Good's mapping 0 5 10 3 13 6 11 1 9 1 1 7 What are t 1, t for N 1 =3, N =5? CRT mapping 0 10 5 6 1 11 1 7 3 13 9 1 Reversing N 1 and N Results in transposed mapping 6.973 Figure by MIT OpenCourseW. are.
Impact on Formulating the true multi-dimensional transform N - 1 ik X k = Σ x i W N, k = 0,..., N - 1, i = 0 k = Ntk + N t k NN 11 1 1 N N 1-1 N - 1 X N1 t 1 k + N r k 1 = Σ Σ x n1 N + n N 1 W N n 1 = 0 n = 0 (n 1 N + N 1 n )(N 1 t 1 k + N t k 1 ) x 3 x 6 x 9 x 1 5 X 9 X N W N = W N1 N t (N WN1 = W t ) N1 N1 = W N1 N 1-1 N - 1 X N1 t 1 k + N t k 1 = Σ Σ x n1 N + n N 1 W N1 W N n 1 = 0 n = 0 n 1 k n k, x 0 x 5 x 10 3 X 1 X X X 11 X 5 X' k1 k = X N1 t 1 k + N t k 1 x' n1. n = x n 1 N + n N 1 Figure by MIT OpenCourseWare. N 1-1 N - 1 n 1 k 1 X' k1 k = Σ Σ x' n1 n W N1 W N n 1 = 0 n = 0 n k Figure by MIT OCW. True bidimensional transform! (no extra twiddle factors) 6.973 Communication System Design 3
Using convolution to compute s All sub s are prime length Rader showed that prime-length s can be computed as a result of cyclic convolution E.g. length 5 Permute last two rows and columns Cyclic correlation (a convolution with a reversed sequence) This is a general result 6.973 Communication System Design
Example Results in smallest number of multiplies 6.973 Communication System Design 5
Prime Factor Algorithm X = F 1 x F T. Good's mapping CRT mapping x 3 x 6 x 1 x 9 X 9 5 X x 0 X 1 x 5 3 X 11 X X x 10 X 5 Figure by MIT OpenCourseWare. Efficient small s are a key to the feasibility of this algorithm 6.973 Communication System Design 6
Winograd s Fourier Transform Algorithm x 0 x 3 x 6 x 9 x 1 x 5 x 10 X 9 X X 5 X 11 X X X 1 input additions input additions point wise output additions output additions N = 3 N = 5 multiplication N = 5 N = 3 Figure by MIT OpenCourseWare. B 1 xb only involves additions D diagonal (so point multiply) Winograd transform has many more additions than twiddle FFTs 6.973 Communication System Design 7