Representation of Floating Point Numbers in

Similar documents

Divide: Paper & Pencil. Computer Architecture ALU Design : Division and Floating Point. Divide algorithm. DIVIDE HARDWARE Version 1

Binary Division. Decimal Division. Hardware for Binary Division. Simple 16-bit Divider Circuit

This Unit: Floating Point Arithmetic. CIS 371 Computer Organization and Design. Readings. Floating Point (FP) Numbers

ECE 0142 Computer Organization. Lecture 3 Floating Point Representations

Today. Binary addition Representing negative numbers. Andrew H. Fagg: Embedded Real- Time Systems: Binary Arithmetic

CHAPTER 5 Round-off errors

The string of digits in the binary number system represents the quantity

Levent EREN A-306 Office Phone: INTRODUCTION TO DIGITAL LOGIC

1. Give the 16 bit signed (twos complement) representation of the following decimal numbers, and convert to hexadecimal:

MATH-0910 Review Concepts (Haugen)

2010/9/19. Binary number system. Binary numbers. Outline. Binary to decimal

LSN 2 Number Systems. ECT 224 Digital Computer Fundamentals. Department of Engineering Technology

Measures of Error: for exact x and approximation x Absolute error e = x x. Relative error r = (x x )/x.

Computer Science 281 Binary and Hexadecimal Review

Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML. By Prashanth Tilleti

Binary Number System. 16. Binary Numbers. Base 10 digits: Base 2 digits: 0 1

M6800. Assembly Language Programming

Lecture 8: Binary Multiplication & Division

Attention: This material is copyright Chris Hecker. All rights reserved.

To convert an arbitrary power of 2 into its English equivalent, remember the rules of exponential arithmetic:

This 3-digit ASCII string could also be calculated as n = (Data[2]-0x30) +10*((Data[1]-0x30)+10*(Data[0]-0x30));

CS321. Introduction to Numerical Methods

RN-coding of Numbers: New Insights and Some Applications

Recall the process used for adding decimal numbers. 1. Place the numbers to be added in vertical format, aligning the decimal points.

Oct: 50 8 = 6 (r = 2) 6 8 = 0 (r = 6) Writing the remainders in reverse order we get: (50) 10 = (62) 8

Binary Representation. Number Systems. Base 10, Base 2, Base 16. Positional Notation. Conversion of Any Base to Decimal.

Numerical Matrix Analysis

RN-Codings: New Insights and Some Applications

Solution for Homework 2

How do you compare numbers? On a number line, larger numbers are to the right and smaller numbers are to the left.

Paramedic Program Pre-Admission Mathematics Test Study Guide

Arithmetic in MIPS. Objectives. Instruction. Integer arithmetic. After completing this lab you will:

Systems I: Computer Organization and Architecture

Z80 Instruction Set. Z80 Assembly Language

What Every Computer Scientist Should Know About Floating-Point Arithmetic

AN617. Fixed Point Routines FIXED POINT ARITHMETIC INTRODUCTION. Thi d t t d ith F M k Design Consultant

CSI 333 Lecture 1 Number Systems

Multiplying and Dividing Signed Numbers. Finding the Product of Two Signed Numbers. (a) (3)( 4) ( 4) ( 4) ( 4) 12 (b) (4)( 5) ( 5) ( 5) ( 5) ( 5) 20

Session 29 Scientific Notation and Laws of Exponents. If you have ever taken a Chemistry class, you may have encountered the following numbers:

Chapter 7D The Java Virtual Machine

COMPSCI 210. Binary Fractions. Agenda & Reading

Number Representation

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

26 Integers: Multiplication, Division, and Order

3.1. RATIONAL EXPRESSIONS

Lecture 2. Binary and Hexadecimal Numbers

PREPARATION FOR MATH TESTING at CityLab Academy

Binary Representation

1. Convert the following base 10 numbers into 8-bit 2 s complement notation 0, -1, -12

Data Storage 3.1. Foundations of Computer Science Cengage Learning

MULTIPLICATION AND DIVISION OF REAL NUMBERS In this section we will complete the study of the four basic operations with real numbers.

EE 261 Introduction to Logic Circuits. Module #2 Number Systems

CS201: Architecture and Assembly Language

Integer Operations. Overview. Grade 7 Mathematics, Quarter 1, Unit 1.1. Number of Instructional Days: 15 (1 day = 45 minutes) Essential Questions

Binary Numbering Systems

Lecture 11: Number Systems

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Chapter 4 -- Decimals

Numeral Systems. The number twenty-five can be represented in many ways: Decimal system (base 10): 25 Roman numerals:

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

Unit 1 Number Sense. In this unit, students will study repeating decimals, percents, fractions, decimals, and proportions.

Floating point package user s guide By David Bishop (dbishop@vhdl.org)

Goals. Unary Numbers. Decimal Numbers. 3,148 is s 100 s 10 s 1 s. Number Bases 1/12/2009. COMP370 Intro to Computer Architecture 1

Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs

Number and codes in digital systems

HOMEWORK # 2 SOLUTIO

THE EXACT DOT PRODUCT AS BASIC TOOL FOR LONG INTERVAL ARITHMETIC

Elementary Number Theory and Methods of Proof. CSE 215, Foundations of Computer Science Stony Brook University

Binary Adders: Half Adders and Full Adders

Chapter 8 Integers 8.1 Addition and Subtraction

Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test

Microcontroller Basics A microcontroller is a small, low-cost computer-on-a-chip which usually includes:

Addition Methods. Methods Jottings Expanded Compact Examples = 15

3. Convert a number from one number system to another

COMBINATIONAL CIRCUITS

Copy in your notebook: Add an example of each term with the symbols used in algebra 2 if there are any.

CDA 3200 Digital Systems. Instructor: Dr. Janusz Zalewski Developed by: Dr. Dahai Guo Spring 2012

Q-Format number representation. Lecture 5 Fixed Point vs Floating Point. How to store Q30 number to 16-bit memory? Q-format notation.

Numbering Systems. InThisAppendix...

3 cups ¾ ½ ¼ 2 cups ¾ ½ ¼. 1 cup ¾ ½ ¼. 1 cup. 1 cup ¾ ½ ¼ ¾ ½ ¼. 1 cup. 1 cup ¾ ½ ¼ ¾ ½ ¼

A new binary floating-point division algorithm and its software implementation on the ST231 processor

Numerical Analysis I

A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions

2011, The McGraw-Hill Companies, Inc. Chapter 3

FAST INVERSE SQUARE ROOT

NUMBER SYSTEMS. 1.1 Introduction

THUMB Instruction Set

Solve addition and subtraction word problems, and add and subtract within 10, e.g., by using objects or drawings to represent the problem.

Reduced Instruction Set Computer (RISC)

An Introduction to Assembly Programming with the ARM 32-bit Processor Family

Useful Number Systems

Comp 255Q - 1M: Computer Organization Lab #3 - Machine Language Programs for the PDP-8

plc numbers Encoded values; BCD and ASCII Error detection; parity, gray code and checksums

The Little Man Computer

8 Primes and Modular Arithmetic

Quick Reference ebook

MACHINE INSTRUCTIONS AND PROGRAMS

Fast Logarithms on a Floating-Point Device

Transcription:

Representation of Floating Point Numbers in Single Precision IEEE 754 Standard Value = N = (-1) S X 2 E-127 X (1.M) 0 < E < 255 Actual exponent is: e = E - 127 sign 1 8 23 S E M exponent: excess 127 binary integer added mantissa: sign + magnitude, normalized binary significand with a hidden integer bit: 1.M Example: 0 = 0 00000000 0... 0-1.5 = 1 01111111 10... 0 Magnitude of numbers that can be represented is in the range: 2-126 (1.0) to 2 127 (2-2 -23 ) Which is approximately: 1.8 x 10-38 to 3.40 x 10 38 #1 lec #17 Winter99 1-27-2000

Representation of Floating Point Numbers in Double Precision IEEE 754 Standard Value = N = (-1) S X 2 E-1023 X (1.M) 0 < E < 2047 Actual exponent is: e = E - 1023 sign 1 11 52 S E M exponent: excess 1023 binary integer added mantissa: sign + magnitude, normalized binary significand with a hidden integer bit: 1.M Example: 0 = 0 00000000000 0... 0-1.5 = 1 01111111111 10... 0 Magnitude of numbers that can be represented is in the range: 2-1022 (1.0) to 2 1023 (2-2 - 52 ) Which is approximately: 2.23 x 10-308 to 1.8 x 10 308 #2 lec #17 Winter99 1-27-2000

IEEE 754 Format Parameters Single Precision Double Precision p (bits of precision) 24 53 Unbiased exponent e max 127 1023 Unbiased exponent e min -126-1022 Exponent bias 127 1023 #3 lec #17 Winter99 1-27-2000

IEEE 754 Special Number Representation Single Precision Double Precision Number Represented Exponent Significand Exponent Significand 0 0 0 0 0 0 nonzero 0 nonzero Denormalized number 1 1 to 254 anything 1 to 2046 anything Floating Point Number 255 0 2047 0 Infinity 2 255 nonzero 2047 nonzero NaN (Not A Number) 3 1 May be returned as a result of underflow in multiplication 2 Positive divided by zero yields infinity 3 Zero divide by zero yields NaN not a number #4 lec #17 Winter99 1-27-2000

Floating Point Conversion Example The decimal number.75 10 is to be represented in the IEEE 754 32-bit single precision format: Hidden -2345.125 10 = 0.11 2 (converted to a binary number) = 1.1 x 2-1 (normalized a binary number) The mantissa is positive so the sign S is given by: S = 0 The biased exponent E is given by E = e + 127 E = -1 + 127 = 126 10 = 01111110 2 Fractional part of mantissa M: M =.10000000000000000000000 (in 23 bits) The IEEE 754 single precision representation is given by: 0 01111110 10000000000000000000000 S E M 1 bit 8 bits 23 bits #5 lec #17 Winter99 1-27-2000

Floating Point Conversion Example The decimal number -2345.125 10 is to be represented in the IEEE 754 32-bit single precision format: -2345.125 10 = -100100101001.001 2 (converted to binary) Hidden = -1.00100101001001 x 2 11 (normalized binary) The mantissa is negative so the sign S is given by: S = 1 The biased exponent E is given by E = e + 127 E = 11 + 127 = 138 10 = 10001010 2 Fractional part of mantissa M: M =.00100101001001000000000 (in 23 bits) The IEEE 754 single precision representation is given by: 1 10001010 00100101001001000000000 S E M 1 bit 8 bits 23 bits #6 lec #17 Winter99 1-27-2000

Basic Floating Point Addition Algorithm Assuming that the operands are already in the IEEE 754 format, performing floating point addition: Result = X + Y = (Xm x 2 Xe ) + (Ym x 2 Ye ) involves the following steps: (1) Align binary point: Initial result exponent: the larger of Xe, Ye Compute exponent difference: Ye - Xe If Ye > Xe Right shift Xm that many positions to form Xm 2 Xe-Ye If Xe > Ye Right shift Ym that many positions to form Ym 2 Ye-Xe (2) Compute sum of aligned mantissas: i.e Xm2 Xe-Ye + Ym or Xm + Xm2 Ye-Xe (3) If normalization of result is needed, then a normalization step follows: Left shift result, decrement result exponent (e.g., if result is 0.001xx ) or Right shift result, increment result exponent (e.g., if result is 10.1xx ) Continue until MSB of data is 1 (NOTE: Hidden bit in IEEE Standard) (4) Check result exponent: If larger than maximum exponent allowed return exponent overflow If smaller than minimum exponent allowed return exponent underflow (5) If result mantissa is 0, may need to set the exponent to zero by a special step to return a proper zero. #7 lec #17 Winter99 1-27-2000

(1) (2) Start Compare the exponents of the two numbers shift the smaller number to the right until its exponent matches the larger exponent Add the significands (mantissas) Simplified Floating Point Addition Flowchart (3) Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent (4) Overflow or Underflow? Generate exception or return error (5) If mantissa = 0 set exponent to 0 Done #8 lec #17 Winter99 1-27-2000

Floating Point Addition Example Add the following two numbers represented in the IEEE 754 single precision format: X = 2345.125 10 represented as: 0 10001010 00100101001001000000000 to Y =.75 10 represented as: 0 01111110 10000000000000000000000 (1) Align binary point: Xe > Ye initial result exponent = Ye = 10001010 = 138 10 Xe - Ye = 10001010-01111110 = 00000110 = 12 10 Shift Ym 12 10 postions to the right to form Ym 2 Ye-Xe = Ym 2-12 = 0.00000000000110000000000 (2) Add mantissas: Xm + Ym 2-12 = 1.00100101001001000000000 (3) Normailzed? Yes + 0.00000000000110000000000 = 1. 00100101001111000000000 (4) Overflow? No. Underflow? No (5) zero result? No Result 0 10001010 00100101001111000000000 #9 lec #17 Winter99 1-27-2000

IEEE 754 Single precision Addition Notes If the exponents differ by more than 24, the smaller number will be shifted right entirely out of the mantissa field, producing a zero mantissa. The sum will then equal the larger number. Such truncation errors occur when the numbers differ by a factor of more than 2 24, which is approximately 1.6 x 10 7. Thus, the precision of IEEE single precision floating point arithmetic is approximately 7 decimal digits. Negative mantissas are handled by first converting to 2's complement and then performing the addition. After the addition is performed, the result is converted back to sign-magnitude form. When adding numbers of opposite sign, cancellation may occur, resulting in a sum which is arbitrarily small, or even zero if the numbers are equal in magnitude. Normalization in this case may require shifting by the total number of bits in the mantissa, resulting in a large loss of accuracy. Floating point subtraction is achieved simply by inverting the sign bit and performing addition of signed mantissas as outlined above. #10 lec #17 Winter99 1-27-2000

Basic Floating Point Subtraction Algorithm Assuming that the operands are already in the IEEE 754 format, performing floating point addition: Result = X - Y = (Xm x 2 Xe ) - (Ym x 2 Ye ) involves the following steps: (1) Align binary point: Initial result exponent: the larger of Xe, Ye Compute exponent difference: Ye - Xe If Ye > Xe Right shift Xm that many positions to form Xm 2 Xe-Ye If Xe > Ye Right shift Ym that many positions to form Ym 2 Ye-Xe (2) Subtract the aligned mantissas: i.e Xm2 Xe-Ye - Ym or Xm - Xm2 Ye-Xe (3) If normalization of result is needed, then a normalization step follows: Left shift result, decrement result exponent (e.g., if result is 0.001xx ) or Right shift result, increment result exponent (e.g., if result is 10.1xx ) Continue until MSB of data is 1 (NOTE: Hidden bit in IEEE Standard) (4) Check result exponent: If larger than maximum exponent allowed return exponent overflow If smaller than minimum exponent allowed return exponent underflow (5) If result mantissa is 0, may need to set the exponent to zero by a special step to return a proper zero. #11 lec #17 Winter99 1-27-2000

(1) (2) Start Compare the exponents of the two numbers shift the smaller number to the right until its exponent matches the larger exponent Subtract the mantissas Simplified Floating Point Subtraction Flowchart (3) Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent (4) Overflow or Underflow? Generate exception or return error (5) If mantissa = 0 set exponent to 0 Done #12 lec #17 Winter99 1-27-2000

Basic Floating Point Multiplication Algorithm Assuming that the operands are already in the IEEE 754 format, performing floating point multiplication: Result = R = X * Y = (-1) Xs (Xm x 2 Xe ) * (-1) Ys (Ym x 2 Ye ) involves the following steps: (1) If one or both operands is equal to zero, return the result as zero, otherwise: (2) Compute the sign of the result Xs XOR Ys (3) Compute the mantissa of the result: Multiply the mantissas: Xm * Ym Round the result to the allowed number of mantissa bits (4) Compute the exponent of the result: Result exponent = biased exponent (X) + biased exponent (Y) - bias (5) Normalize if needed, by shifting mantissa right, incrementing result exponent. (6) Check result exponent for overflow/underflow: If larger than maximum exponent allowed return exponent overflow If smaller than minimum exponent allowed return exponent underflow #13 lec #17 Winter99 1-27-2000

(1) Start Is one/both operands =0? Simplified Floating Point Multiplication Flowchart Set the result to zero: exponent = 0 (2) (3) (4) (5) Compute sign of result: Xs XOR Ys Multiply the mantissas Round or truncate the result mantissa Compute exponent: biased exp.(x) + biased exp.(y) - bias Normalize mantissa if needed (6) Generate exception or return error Overflow or Underflow? Done #14 lec #17 Winter99 1-27-2000

Floating Point Multiplication Example Multiply the following two numbers represented in the IEEE 754 single precision format: X = -18 10 represented as: 1 10000011 00100000000000000000000 and Y = 9.5 10 represented as: 0 10000010 00110000000000000000000 (1) Value of one or both operands = 0? No, continue with step 2 (2) Compute the sign: S = Xs XOR Ys = 1 XOR 0 = 1 (3) Multiply the mantissas: The product of the 24 bit mantissas is 48 bits with two bits to the left of the binary point: Truncate to 24 bits: (01).0101011000000.000000 hidden (1).01010110000000000000000 (4) Compute exponent of result: Xe + Ye - 127 10 = 1000 0011 + 1000 0010-0111111 = 1000 0110 (5) Result mantissa needs normalization? No (6) Overflow? No. Underflow? No Result 1 10000110 01010101100000000000000 #15 lec #17 Winter99 1-27-2000

IEEE 754 Single precision Multiplication Notes Rounding occurs in floating point multiplication when the mantissa of the product is reduced from 48 bits to 24 bits. The least significant 24 bits are discarded. Overflow occurs when the sum of the exponents exceeds 127, the largest value which is defined in bias-127 exponent representation. When this occurs, the exponent is set to 128 (E = 255) and the mantissa is set to zero indicating + or - infinity. Underflow occurs when the sum of the exponents is more negative than - 126, the most negative value which is defined in bias-127 exponent representation. When this occurs, the exponent is set to -127 (E = 0). If M = 0, the number is exactly zero. If M is not zero, then a denormalized number is indicated which has an exponent of -127 and a hidden bit of 0. The smallest such number which is not zero is 2-149. This number retains only a single bit of precision in the rightmost bit of the mantissa. #16 lec #17 Winter99 1-27-2000

Basic Floating Point Division Algorithm Assuming that the operands are already in the IEEE 754 format, performing floating point multiplication: Result = R = X / Y = (-1) Xs (Xm x 2 Xe ) / (-1) Ys (Ym x 2 Ye ) involves the following steps: (1) If the divisor Y is zero return Infinity, if both are zero return NaN (2) Compute the sign of the result Xs XOR Ys (3) Compute the mantissa of the result: The dividend mantissa is extended to 48 bits by adding 0's to the right of the least significant bit. When divided by a 24 bit divisor Ym, a 24 bit quotient is produced. (4) Compute the exponent of the result: Result exponent = [biased exponent (X) - biased exponent (Y)] + bias (5) Normalize if needed, by shifting mantissa left, decrementing result exponent. (6) Check result exponent for overflow/underflow: If larger than maximum exponent allowed return exponent overflow If smaller than minimum exponent allowed return exponent underflow #17 lec #17 Winter99 1-27-2000

IEEE 754 Error Rounding In integer arithmetic, the result of an operation is well-defined: Either the exact result is obtained or overflow occurs and the result cannot be represented. In floating point arithmetic, rounding errors occur as a result of the limited precision of the mantissa. For example, consider the average of two floating point numbers with identical exponents, but mantissas which differ by 1. Although the mathematical operation is well-defined and the result is within the range of representable numbers, the average of two adjacent floating point values cannot be represented exactly. The IEEE FPS defines four rounding rules for choosing the closest floating point when a rounding error occurs: RN - Round to Nearest. Break ties by choosing the least significant bit = 0. RZ - Round toward Zero. Same as truncation in sign-magnitude. RP - Round toward Positive infinity. RM - Round toward Minus infinity. Same as truncation in integer 2's complement arithmetic. RN is generally preferred and introduces less systematic error than the other rules. #18 lec #17 Winter99 1-27-2000

Floating Point Error Rounding Observations The absolute error introduced by rounding is the actual difference between the exact value and the floating point representation. The size of the absolute error is proportional to the magnitude of the number. For numbers in single Precision IEEE 754 format, the absolute error is less than 2-24. The largest absolute rounding error occurs when the exponent is 127 and is approximately 10 31 since 2-24.2 127 = 10 31 The relative error is the absolute error divided by the magnitude of the number which is approximated. For normalized floating point numbers, the relative error is approximately 10-7 Rounding errors affect the outcome of floating point computations in several ways: Exact comparison of floating point variables often produces incorrect results. Floating variables should not be used as loop counters or loop increments. Operations performed in different orders may give different results. On many computers, a+b may differ from b+a and (a+b)+c may differ from a+(b+c). Errors accumulate over time. While the relative error for a single operation in single precision floating point is about 10-7, algorithms which iterate many times may experience an accumulation of errors which is much larger. #19 lec #17 Winter99 1-27-2000

68000 FLOATING POINT ADD/SUBTRACT (FFPADD/FFPSUB) Subroutine ************************************************************* * FFPADD/FFPSUB * * FAST FLOATING POINT ADD/SUBTRACT * * * * FFPADD/FFPSUB - FAST FLOATING POINT ADD AND SUBTRACT * * * * INPUT: * * FFPADD * * D6 - FLOATING POINT ADDEND * * D7 - FLOATING POINT ADDER * * FFPSUB * * D6 - FLOATING POINT SUBTRAHEND * * D7 - FLOATING POINT MINUEND * * * * OUTPUT: * * D7 - FLOATING POINT ADD RESULT * * * * CONDITION CODES: * * N - RESULT IS NEGATIVE * * Z - RESULT IS ZERO * * V - OVERFLOW HAS OCCURED * * C - UNDEFINED * * X - UNDEFINED * (C) COPYRIGHT 1980 BY MOTOROLA INC. * #20 lec #17 Winter99 1-27-2000

* REGISTERS D3 THRU D5 ARE VOLATILE * * * * CODE SIZE: 228 BYTES STACK WORK AREA: 0 BYTES * * * * NOTES: * * 1) ADDEND/SUBTRAHEND UNALTERED (D6). * * 2) UNDERFLOW RETURNS ZERO AND IS UNFLAGGED. * * 3) OVERFLOW RETURNS THE HIGHEST VALUE WITH THE * * CORRECT SIGN AND THE 'V' BIT SET IN THE CCR. * * * * TIME: (8 MHZ NO WAIT STATES ASSUMED) * * * * COMPOSITE AVERAGE 20.625 MICROSECONDS * * * * ADD: ARG1=0 7.75 MICROSECONDS * * ARG2=0 5.25 MICROSECONDS * * * * LIKE SIGNS 14.50-26.00 MICROSECONDS * * AVERAGE 18.00 MICROSECONDS * * UNLIKE SIGNS 20.13-54.38 MICROCECONDS * * AVERAGE 22.00 MICROSECONDS * * * * SUBTRACT: ARG1=0 4.25 MICROSECONDS * * ARG2=0 9.88 MICROSECONDS * * * * LIKE SIGNS 15.75-27.25 MICROSECONDS * * AVERAGE 19.25 MICROSECONDS * * UNLIKE SIGNS 21.38-55.63 MICROSECONDS * * AVERAGE 23.25 MICROSECONDS * * #21 lec #17 Winter99 1-27-2000

************************ * SUBTRACT ENTRY POINT * ************************ FFPSUB MOVE.B D6,D4 TEST ARG1 BEQ.S FPART2 RETURN ARG2 IF ARG1 ZERO EOR.B #$80,D4 INVERT COPIED SIGN OF ARG1 BMI.S FPAMI1 BRANCH ARG1 MINUS * + ARG1 MOVE.B D7,D5 COPY AND TEST ARG2 BMI.S FPAMS BRANCH ARG2 MINUS BNE.S FPALS BRANCH POSITIVE NOT ZERO BRA.S FPART1 RETURN ARG1 SINCE ARG2 IS ZERO ******************* * ADD ENTRY POINT * ******************* FFPADD MOVE.B D6,D4 TEST ARGUMENT1 BMI.S FPAMI1 BRANCH IF ARG1 MINUS BEQ.S FPART2 RETURN ARG2 IF ZERO * + ARG1 MOVE.B D7,D5 TEST ARGUMENT2 BMI.S FPAMS BRANCH IF MIXED SIGNS BEQ.S FPART1 ZERO SO RETURN ARGUMENT1 #22 lec #17 Winter99 1-27-2000

* +ARG1 +ARG2 * -ARG1 -ARG2 FPALS SUB.B D4,D5 TEST EXPONENT MAGNITUDES BMI.S FPA2LT BRANCH ARG1 GREATER MOVE.B D7,D4 SETUP STRONGER S+EXP IN D4 * ARG1EXP <= ARG2EXP CMP.B #24,D5 OVERBEARING SIZE BCC.S FPART2 BRANCH YES, RETURN ARG2 MOVE.L D6,D3 COPY ARG1 CLR.B D3 LSR.L D5,D3 CLEAN OFF SIGN+EXPONENT SHIFT TO SAME MAGNITUDE MOVE.B #$80,D7 FORCE CARRY IF LSB-1 ON ADD.L D3,D7 ADD ARGUMENTS BCS.S FPA2GC BRANCH IF CARRY PRODUCED FPARSR MOVE.B D4,D7 RESTORE SIGN/EXPONENT RTS RETURN TO CALLER #23 lec #17 Winter99 1-27-2000

* ADD SAME SIGN OVERFLOW NORMALIZATION FPA2GC ROXR.L #1,D7 ADD.B #1,D4 SHIFT CARRY BACK INTO RESULT ADD ONE TO EXPONENT BVS.S FPA2OS BRANCH OVERFLOW BCC.S FPARSR BRANCH IF NO EXPONENT OVERFLOW FPA2OS MOVEQ #-1,D7 CREATE ALL ONES SUB.B #1,D4 MOVE.B D4,D7 BACK TO HIGHEST EXPONENT+SIGN REPLACE IN RESULT * OR.B #$02,CCR SHOW OVERFLOW OCCURRED DC.L RTS $003C0002 ****ASSEMBLER ERROR**** RETURN TO CALLER * RETURN ARGUMENT1 FPART1 MOVE.L D6,D7 MOVE.B D4,D7 RTS MOVE IN AS RESULT MOVE IN PREPARED SIGN+EXPONENT RETURN TO CALLER * RETURN ARGUMENT2 FPART2 TST.B D7 RTS TEST FOR RETURNED VALUE RETURN TO CALLER #24 lec #17 Winter99 1-27-2000

* -ARG1EXP > -ARG2EXP * +ARG1EXP > +ARG2EXP FPA2LT CMP.B #-24,D5? ARGUMENTS WITHIN RANGE BLE.S FPART1 NOPE, RETURN LARGER NEG.B D5 MOVE.L D6,D3 CLR.B D7 LSR.L D5,D7 CHANGE DIFFERENCE TO POSITIVE SETUP LARGER VALUE CLEAN OFF SIGN+EXPONENT SHIFT TO SAME MAGNITUDE MOVE.B #$80,D3 FORCE CARRY IF LSB-1 ON ADD.L D3,D7 ADD ARGUMENTS BCS.S FPA2GC BRANCH IF CARRY PRODUCED MOVE.B D4,D7 RTS RESTORE SIGN/EXPONENT RETURN TO CALLER * -ARG1 FPAMI1 MOVE.B D7,D5 TEST ARG2'S SIGN BMI.S FPALS BRANCH FOR LIKE SIGNS BEQ.S FPART1 IF ZERO RETURN ARGUMENT1 #25 lec #17 Winter99 1-27-2000

* -ARG1 +ARG2 * +ARG1 -ARG2 FPAMS MOVEQ #-128,D3 CREATE A CARRY MASK ($80) * ARG1 <= ARG2 EOR.B D3,D5 SUB.B D4,D5 BEQ.S FPAEQ STRIP SIGN OFF ARG2 S+EXP COPY COMPARE MAGNITUDES BRANCH EQUAL MAGNITUDES BMI.S FPATLT BRANCH IF ARG1 LARGER CMP.B #24,D5 COMPARE MAGNITUDE DIFFERENCE BCC.S FPART2 BRANCH ARG2 MUCH BIGGER MOVE.B D7,D4 MOVE.B D3,D7 MOVE.L D6,D3 FPAMSS CLR.B D3 LSR.L D5,D3 SUB.L D3,D7 ARG2 S+EXP DOMINATES SETUP CARRY ON ARG2 COPY ARG1 CLEAR EXTRANEOUS BITS ADJUST FOR MAGNITUDE SUBTRACT SMALLER FROM LARGER BMI.S FPARSR RETURN FINAL RESULT IF NO OVERFLOW #26 lec #17 Winter99 1-27-2000

* MIXED SIGNS NORMALIZE FPANOR MOVE.B D4,D5 FPANRM CLR.B D7 SUB.B #1,D4 SAVE CORRECT SIGN CLEAR SUBTRACT RESIDUE MAKE UP FOR FIRST SHIFT CMP.L #$00007FFF,D7? SMALL ENOUGH FOR SWAP BHI.S FPAXQN BRANCH NOPE SWAP.W D7 SHIFT LEFT 16 BITS REAL FAST SUB.B #16,D4 MAKE UP FOR 16 BIT SHIFT FPAXQN ADD.L D7,D7 DBMI EOR.B D4,D5 SHIFT UP ONE BIT D4,FPAXQN DECREMENT AND BRANCH IF POSITIVE? SAME SIGN BMI.S FPAZRO BRANCH UNDERFLOW TO ZERO MOVE.B D4,D7 RESTORE SIGN/EXPONENT BEQ.S FPAZRO RETURN ZERO IF EXPONENT UNDERFLOWED RTS RETURN TO CALLER * EXPONENT UNDERFLOWED - RETURN ZERO FPAZRO MOVEQ.L #0,D7 CREATE A TRUE ZERO RTS RETURN TO THE CALLER #27 lec #17 Winter99 1-27-2000

* ARG1 > ARG2 FPATLT CMP.B #-24,D5? ARG1 >> ARG2 BLE.S FPART1 RETURN IT IF SO NEG.B D5 ABSOLUTIZE DIFFERENCE MOVE.L D7,D3 MOVE ARG2 AS LOWER VALUE MOVE.L D6,D7 SETUP ARG1 AS HIGH MOVE.B #$80,D7 SETUP ROUNDING BIT BRA.S FPAMSS PERFORM THE ADDITION * EQUAL MAGNITUDES FPAEQ MOVE.B D7,D5 SAVE ARG1 SIGN EXG.L D5,D4 MOVE.B D6,D7 SUB.L D6,D7 SWAP ARG2 WITH ARG1 S+EXP INSURE SAME LOW BYTE OBTAIN DIFFERENCE BEQ.S FPAZRO RETURN ZERO IF IDENTICAL BPL.S FPANOR BRANCH IF ARG2 BIGGER NEG.L D7 CORRECT DIFFERENCE TO POSITIVE MOVE.B D5,D4 USE ARG2'S SIGN+EXPONENT BRA.S FPANRM AND GO NORMALIZE END * (C) COPYRIGHT 1980 BY MOTOROLA INC. * #28 lec #17 Winter99 1-27-2000