Lecture 5: Fixed Point vs Floating Point

Objectives:
- Understand fixed point representations
- Understand scaling, overflow and rounding in fixed point
- Understand Q-format
- Understand TMS320C67xx floating point representations
- Understand the relationship between the two in the C6x architecture

Reference: "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg, ACM Computing Surveys 23, 5 (March 1991).

Lecture 5 - Fixed point vs Floating point 5-1

Q-Format number representation

An N-bit fixed point, 2's complement number is given by:

$$x = -b_{N-1}2^{N-1} + b_{N-2}2^{N-2} + \dots + b_1 2^{1} + b_0 2^{0}$$

with the imaginary binary point to the right of $b_0$.

- Difficult to work with due to possible overflow and scaling problems
- Often normalise the number to some fractional representation (e.g. between ±1):

$$x = -b_{N-1}2^{0} + b_{N-2}2^{-1} + \dots + b_1 2^{-(N-2)} + b_0 2^{-(N-1)}$$

with the imaginary binary point immediately to the right of the sign bit $b_{N-1}$.

Q-format notation

- Q-format representation: if N = 16, the 15-bit fractional representation is called Q15 format
- Rule: Qm + Qm → Qm; Qm × Qn → Q(m+n)
- Assume 16-bit data format: Q15 × Q15 → Q30

[Figure: two 16-bit Q15 operands (bits 15-0) multiplied to give a 32-bit Q30 product (bits 31-16 and 15-0)]

How to store a Q30 number to 16-bit memory?

- Storing a Q30 number to 16-bit memory requires rounding or truncation
- Rounding: let r be the most significant of the bits to be discarded; if r = 0, round down, if r = 1, round up
- Rounding is done by adding a '1' at the position of r (bit 14, i.e. 4000h, when 15 bits are discarded) before truncating

    MPY   A3,A4,A6     ; A3 x A4 -> A6
    NOP                ; delay slot
    ADDK  4000h,A6     ; rounding add
    SHR   A6,15,A6     ; truncate bottom 15 bits
    STH   A6,*A7       ; A6 -> mem[A7]
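The multiply, rounding add, shift and store sequence above can also be sketched in C. This is only an illustration under stated assumptions: the function name `q15_mul_round` is hypothetical (it is not from the lecture), `>>` on a negative `int32_t` is assumed to be an arithmetic shift (true for most compilers, including TI's, but implementation-defined in ISO C), and, like the assembly, the sketch does not saturate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiply two Q15 values and return a rounded
 * Q15 result, following the MPY / ADDK 4000h / SHR 15 / STH steps. */
static int16_t q15_mul_round(int16_t a, int16_t b)
{
    /* -1.0 x -1.0 = +1.0 is not representable in Q15; this sketch
     * does not saturate, so that single case is excluded. */
    assert(!(a == INT16_MIN && b == INT16_MIN));

    int32_t p = (int32_t)a * (int32_t)b;  /* Q15 x Q15 -> Q30 in 32 bits */
    p += 0x4000;                          /* rounding add at bit 14 */
    p >>= 15;                             /* discard the bottom 15 bits
                                             (assumes arithmetic shift) */
    return (int16_t)p;                    /* keep the low 16 bits, as STH does */
}
```

For example, `q15_mul_round(16384, 16384)` (0.5 × 0.5) returns 8192, which is 0.25 in Q15. Without the rounding add, `q15_mul_round(1, 16384)` would truncate to 0 instead of rounding up to 1.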
Avoid overflow with SADD

- Safe add routine in C to avoid overflow
- SADD: saturation add instruction
  - Always clips to the max (or min) possible value
  - Sets bit 9 of the CSR register to indicate that saturation has occurred

Single Precision Floating Point number

- Easy (and lazy) way of dealing with the scaling problem
- 32-bit single precision floating point:

    bit 31    bits 30-23     bits 22-0
    s         8-bit exp      23-bit frac

$$x = (-1)^{s} \times 2^{(exp - 127)} \times 1.frac$$

$$1.175 \times 10^{-38} < |x| < 3.4 \times 10^{38}$$

- MSB is the sign bit (same as fixed point)
- 8-bit exponent in bias-127 integer format (i.e., add 127 to the true exponent)
- 23 bits represent only the fractional part of the mantissa. The MSB of the mantissa is ALWAYS 1, therefore it is not stored

Double Precision Floating Point number

- 64-bit double precision floating point, held in a register pair:

    Odd register (e.g. A5)                      Even register (e.g. A4)
    bit 31    bits 30-20     bits 19-0          bits 31-0
    s         11-bit exp     52-bit frac (upper 20 bits, then lower 32 bits)

$$x = (-1)^{s} \times 2^{(exp - 1023)} \times 1.frac$$

$$2.2 \times 10^{-308} < |x| < 1.7 \times 10^{308}$$

- MSB is the sign bit (same as fixed point)
- 11-bit exponent in bias-1023 integer format (i.e., add 1023 to the true exponent)
- 52 bits represent only the fractional part of the mantissa. The MSB of the mantissa is ALWAYS 1, therefore it is not stored
FP Examples

Convert 5.75 to SP FP
- 5.75 to binary: +1.0111000... × 2^2
- Exponent in bias-127 is 127 + 2 = 129 = 1000 0001b
- The fractional part is .0111000... after we drop the hidden 1 bit
- Answer: 0 10000001 0111000...0 = 40B8 0000 (hex)

Convert 0.1 to DP FP
- 0.1 to binary: 1.1001 1001 (1001 repeats) × 2^-4
- Exponent in bias-1023 is 1023 - 4 = 1019 = 011 1111 1011b
- The fractional part is .1001 1001 ... 1001 1010 after we drop the hidden 1 bit and round
- Answer: 0 01111111011 1001 1001 ... 1001 1010 = 3FB9 9999 9999 999A (hex)

Problems of Q-format

- Wrong Q-format representation will give totally wrong results
- Even correct use of Q-format notation may reduce precision
- For this example, the Q12 result is totally wrong (a 16-bit Q12 number has only 4 integer bits including the sign, so 54.38916 does not fit), and the Q8 result is imprecise:

    Q12     7.50195      0111.1000 0000 1000
    Q12   × 7.25         0111.0100 0000 0000
    Q24    54.38916      0011 0110.0110 0011 1010
    Q12     6.38916
    Q8     54.38281

Data types used by C6x DSPs

[Table: data types used by C6x DSPs]

Special SP numbers

The IEEE floating point standard has a set of special numbers:

    Special   Sign (s)   Exponent (e)   Fraction (f)   Hex Value      Decimal
    +0        0          0              0              0x0000 0000    0.0
    -0        1          0              0              0x8000 0000    -0.0
    1         0          127            0              0x3F80 0000    1.0
    2         0          128            0              0x4000 0000    2.0
    +Inf      0          255            0              0x7F80 0000    +∞
    -Inf      1          255            0              0xFF80 0000    -∞
    NaN       x          255            Nonzero        0x7FFF FFFF    not a number
    LFPN      0          254            All 1's        0x7F7F FFFF    3.40282347e+38
    SFPN      0          1              All 0's        0x0080 0000    1.17549435e-38
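The conversion examples and the special-number table above can be checked on a host machine by looking at the raw bits of a `float` or `double`. The sketch below assumes the host uses IEEE 754 single and double precision (true for the C67x and for ordinary PCs); the helper names `sp_bits`, `dp_bits`, `sp_sign`, `sp_exp` and `sp_frac` are hypothetical and introduced only for this illustration.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits and double is 64 bits, IEEE 754. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Return the raw bit pattern of a single precision number. */
static uint32_t sp_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);   /* well-defined way to reinterpret bits */
    return u;
}

/* Return the raw bit pattern of a double precision number. */
static uint64_t dp_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Split a single precision bit pattern into its three fields. */
static uint32_t sp_sign(uint32_t u) { return u >> 31; }           /* bit 31     */
static uint32_t sp_exp(uint32_t u)  { return (u >> 23) & 0xFFu; } /* bits 30-23 */
static uint32_t sp_frac(uint32_t u) { return u & 0x7FFFFFu; }     /* bits 22-0  */

/* Values taken from the slides:
 *   sp_bits(5.75f)   -> 0x40B80000, exponent field 129
 *   dp_bits(0.1)     -> 0x3FB999999999999A
 *   sp_bits(FLT_MAX) -> 0x7F7FFFFF (LFPN)
 *   sp_bits(FLT_MIN) -> 0x00800000 (SFPN)
 */
```

Subtracting the bias from the exponent field recovers the true exponent: for 5.75, `sp_exp(sp_bits(5.75f)) - 127` gives 2, matching 1.0111 × 2^2.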
Special DP numbers

Double precision floating point special numbers:

    Special   Exponent (e)   Fraction (f)   Hex Value                  Decimal
    +0        0              0              0x0000 0000 0000 0000      0.0
    -0        0              0              0x8000 0000 0000 0000      -0.0
    1         1023           0              0x3FF0 0000 0000 0000      1.0
    2         1024           0              0x4000 0000 0000 0000      2.0
    +Inf      2047           0              0x7FF0 0000 0000 0000      +∞
    -Inf      2047           0              0xFFF0 0000 0000 0000      -∞
    NaN       2047           Nonzero        0x7FFF FFFF FFFF FFFF      not a number
    LFPN      2046           All 1's        0x7FEF FFFF FFFF FFFF      1.7976931348623157e+308
    SFPN      1              All 0's        0x0010 0000 0000 0000      2.2250738585072014e-308

TMS320C67x Internal System Architecture

[Figure: block diagram showing External Memory, Internal Memory, Peripherals and Internal Buses around the CPU; register file A (A0-A15/31) serves units .D1, .M1, .L1, .S1 and register file B (B0-B15/31) serves units .D2, .M2, .L2, .S2]

Four functional units for each datapath

[Figure: the four functional units .S, .L, .D and .M]

Mapping of instructions to functional units

.S Unit: ADD, ADDK, ADD2, AND, B, CLR, EXT, MV, MVC, MVK, MVKH, NOT, OR, SET, SHL, SHR, SSHL, SUB, SUB2, XOR, ZERO, ABSSP, ABSDP, CMPGTSP, CMPEQSP, CMPLTSP, CMPGTDP, CMPEQDP, CMPLTDP, RCPSP, RCPDP, RSQRSP, RSQRDP, SPDP

.D Unit: ADD, ADDAB (B/H/W), LDB (B/H/W), LDDW, MV, STB (B/H/W), SUB, SUBAB (B/H/W), ZERO

.L Unit: ABS, ADD, AND, CMPEQ, CMPGT, CMPLT, LMBD, MV, NORM, NOT, OR, SADD, SAT, SSUB, SUB, SUBC, XOR, ZERO, ADDSP, ADDDP, SUBSP, SUBDP, INTSP, INTDP, SPINT, DPINT, SPTRUNC, DPTRUNC, DPSP

.M Unit: MPY, MPYH, MPYLH, MPYHL, SMPY, SMPYH, MPYSP, MPYDP, MPYI, MPYID

No Unit Used: NOP, IDLE
Detailed internal datapaths

[Figure: detailed internal datapaths of the CPU, showing data path A and data path B]