CDA 3101: Introduction to Computer Hardware and Organization. Supplementary Notes


Charles N. Winton
Department of Computer and Information Sciences
University of North Florida
Jacksonville, FL

Levels of organization of a computer system:
a) Electronic circuit level
b) Logic level - combinational logic*, sequential logic*, register-transfer logic*
c) Programming level - microcode programming*, machine/assembly language programming, high-level language programming
d) Computer systems level - systems hardware* (basic hardware architecture and organization - memory, CPU (ALU, control unit), I/O, bus structures), systems software, application systems
* topics discussed in these notes

Objectives:
Understand computer organization and component logic
  Boolean algebra and truth table logic
  Integer arithmetic and implementation algorithms
  IEEE floating point standard and floating point algorithms
  Register construction
  Memory construction and organization
  Register transfer logic
  CPU organization
  Machine language instruction implementation
Develop a foundation for
  Computer architecture
  Microprocessor interfacing
  System software

Sections:
  combinational logic
  sequential logic
  computer architecture

Contents

Section I - Logic Level: Combinational Logic
  Table of binary operations
  Graphical symbols for logic gates
  Representing data
  2's complement representation
  Gray code
  Boolean algebra
  Canonical forms
  Σ and Π notations
  NAND-NOR conversions
  Circuit analysis
  Circuit simplification: K-maps
  Circuit design
  Gray to binary decoder
  BCD to 7-segment display decoder
  Arithmetic circuits
  AOI gates
  Decoders/demultiplexers
  Multiplexers
  Comparators
  Quine-McCluskey procedure

Section II - Logic Level: Sequential Logic
  Set-Reset (SR) latches
  Edge-triggered flip-flops
  An aside about electricity (Ohm's Law, resistor values, batteries, AC)
  D-latches and D flip-flops
  T flip-flops and JK flip-flops
  Excitation controls
  Registers
  Counters
  Sequential circuit design - finite state automata
  Counter design
  Moore and Mealy circuits
  Circuit analysis
  Additional counters
  Barrel shifter
  Glitches and hazards
  Constructing memory
  International Unit Prefixes (base 10)
  Circuit implementation using ROMs
  Hamming Code

Section III - Computer Systems Level
  Representing numeric fractions
  IEEE 754 Floating Point Standard
  Register transfer logic
  Register transfer language (RTL)
  UNF RTL
  Signed multiply architecture and algorithm
  Booth's method
  Restoring and non-restoring division
  Implementing floating point using UNF RTL
  Computer organization
  Control unit
  Arithmetic and Logic unit
  CPU registers
  Single bus CPU organization
  Microcode signals
  Microprograms
  Branching
  Microcode programming
  Other machine language instructions
  Index register
  Simplified Instructional Computer (SIC)
  Architectural enhancements
  CPU-memory synchronization
  Inverting microcode
  Vertical microcode
  Managing the CPU and peripheral devices
  The Z

Logic level: Combinational Logic

Combinational logic is characterized by functional specifications using only binary-valued inputs and binary-valued outputs.

[Figure: block labeled "combinational logic" with r input variables X and s output variables Z, where Z = f(X) (Z is a function of X)]

Remark: for given values of r and s, the number of possible functions is finite since both the domain and the range of functions are finite, of size $2^r$ and $2^s$ respectively (this is because the r input variables and the s output variables assume only the binary values 0 and 1). Although finite, it is worth noting that in practice the number of functions is usually quite large.

For example, for r = 5 input variables and s = 1 output variable, the domain consists of the $2^5 = 32$ possible input combinations of the two binary input values 0 and 1. To specify a function, each of these 32 possible input combinations must be assigned a value in the range, which consists of the two binary output values 0 and 1. This yields $2^{32}$ (over 4 billion) such functions of 5 variables!

In general, with r input variables and s output variables, the domain consists of the $k = 2^r$ combinations of the binary input values. The range consists of the $j = 2^s$ combinations of the binary output values. To specify a function, each of the k input combinations must be assigned one of the j possible values in the range. Since there are $j^k$ possible ways to do this, there are $j^k$ functions having r inputs and s outputs. Each such function corresponds to a logic circuit having r (binary-valued) inputs and s (binary-valued) outputs.

When r = 2 input variables and s = 1 output variable, there are $2^4 = 16$ possible functions (circuits), each having the basic appearance

[Figure: gate f with inputs X and Y and output Z = f(X,Y)]

Recall that functions of 2 variables are called binary operations. For the usual algebra of numbers these include the familiar operations of addition, subtraction, multiplication, and division and as many more as we might care to define.
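The counting argument above can be checked with a few lines of Python. This is only a sketch; the function name `count_functions` is made up for illustration, and the variables follow the text's convention that k is the number of input combinations and j the number of output combinations.

```python
def count_functions(r: int, s: int) -> int:
    """Number of distinct functions with r binary inputs and s binary outputs."""
    k = 2 ** r  # size of the domain: input combinations
    j = 2 ** s  # size of the range: output combinations
    # Each of the k input combinations is independently assigned one of j values.
    return j ** k


print(count_functions(2, 1))  # the binary operations on two inputs
print(count_functions(5, 1))  # functions of 5 variables with one output
```

Even a modest circuit with 5 inputs and 2 outputs has `count_functions(5, 2)` possibilities, a number with 20 decimal digits.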

For circuit logic, the input variables are restricted to the values 0 and 1, so there are only 4 possible input combinations of X and Y, yielding exactly 16 possible binary operations. The corresponding logic circuits provide fundamental building blocks for more complex logic circuits. Such fundamental circuits are termed logic gates. Since there are only 16 of them, they can be listed out - see overleaf. They are named for ease of reference and to reflect common terminology.

It should be noted that some of the binary operations are "degenerate." In particular, Zero(X,Y) and One(X,Y) depend on neither X nor Y to determine their output; X(X,Y) and NOT X(X,Y) have output determined strictly by X; Y(X,Y) and NOT Y(X,Y) have output determined strictly by Y. The X and NOT X operations (or Y and NOT Y, for that matter) are usually thought of as unary operations (functions of 1 variable) rather than degenerate binary operations. As unary operations they are respectively termed the "identity" and the "complement".

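TABLE OF BINARY OPERATIONS

                      X Y = 00   01   10   11
Zero                        0    0    0    0
AND                         0    0    0    1
Inhibit X on Y=1            0    0    1    0
X                           0    0    1    1
Inhibit Y on X=1            0    1    0    0
Y                           0    1    0    1
XOR                         0    1    1    0
OR                          0    1    1    1
NOR                         1    0    0    0
COINC                       1    0    0    1
NOT Y                       1    0    1    0
Y → X                       1    0    1    1
NOT X                       1    1    0    0
X → Y                       1    1    0    1
NAND                        1    1    1    0
One                         1    1    1    1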
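The table can be regenerated mechanically, since the 16 operations are just the 16 ways of filling in a 4-row output column. The sketch below assumes the operations are listed in ascending order of their output column read as a 4-bit number (rows XY = 00, 01, 10, 11); the labels "Y -> X" and "X -> Y" are an assumed reading of the two implication entries.

```python
NAMES = [
    "Zero", "AND", "Inhibit X on Y=1", "X", "Inhibit Y on X=1", "Y",
    "XOR", "OR", "NOR", "COINC", "NOT Y", "Y -> X", "NOT X", "X -> Y",
    "NAND", "One",
]

# Input rows in the order XY = 00, 01, 10, 11.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def operation_outputs(index: int) -> tuple:
    """Outputs of operation number `index` (0..15), one per input row."""
    return tuple(int(bit) for bit in format(index, "04b"))


for index, name in enumerate(NAMES):
    print(f"{name:18s}", *operation_outputs(index))
```

Reading a row of the printout against `INPUTS` gives the operation's truth table; for instance the XOR row is 1 exactly where X and Y differ.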

The complement (or NOT) is designated by an overbar; e.g., $\bar{X}$ is the complement of X.

The other most commonly employed binary operations for combinational logic also have notational designations; e.g.,
  AND is designated by $\cdot$, e.g., $X \cdot Y$
  OR is designated by $+$, e.g., $X + Y$
  NAND is designated by $\uparrow$, e.g., $X \uparrow Y$
  NOR is designated by $\downarrow$, e.g., $X \downarrow Y$
  XOR is designated by $\oplus$, e.g., $X \oplus Y$
  COINCIDENCE is designated by $\odot$, e.g., $X \odot Y$.

Note that if we form the simple composite function $\bar{f}$ (NOT f, or the complement of f), then $\bar{f}(X) = \overline{f(X)}$ and $\bar{\bar{f}} = f$. Moreover,
  $X \uparrow Y = \overline{X \cdot Y}$ (NAND = NOT AND) - Sheffer stroke
  $X \downarrow Y = \overline{X + Y}$ (NOR = NOT OR) - Peirce arrow
  $X \odot Y = \overline{X \oplus Y}$ (COINC = complement of XOR)
In particular, NAND and AND, OR and NOR, XOR and COINC are respectively complementary in the sense that each is respectively the complement of the other.

Rather than use a general graphical "logic gate" designation

[Figure: gate f with inputs X and Y and output Z = f(X,Y)]

ANSI (American National Standards Institute) has standardized on the following graphical symbols for the most commonly used logic gates.

[Figure: ANSI gate symbols for AND ($\cdot$), NAND ($\uparrow$), XOR ($\oplus$), OR ($+$), NOR ($\downarrow$), COINC ($\odot$), and NOT]
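The complementary pairs can be confirmed by exhaustive evaluation, since each gate has only four input combinations. This is a small sketch; the Python function names are chosen here for illustration and model each gate on the integers 0 and 1.

```python
from itertools import product


def NOT(x):
    return 1 - x


def AND(x, y):
    return x & y


def OR(x, y):
    return x | y


def XOR(x, y):
    return x ^ y


def NAND(x, y):
    return 0 if (x == 1 and y == 1) else 1


def NOR(x, y):
    return 1 if (x == 0 and y == 0) else 0


def COINC(x, y):
    return 1 if x == y else 0


def complementary(f, g):
    """True if g is the complement of f on every input combination."""
    return all(g(x, y) == NOT(f(x, y)) for x, y in product((0, 1), repeat=2))


print(complementary(AND, NAND), complementary(OR, NOR), complementary(XOR, COINC))
```

Because complementing twice returns the original function, `complementary(NAND, AND)` holds as well.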

Composite functions such as f(g(x)) can be easily represented using these symbols; e.g., consider the composite

f(A,B,C,D) = ((AB) C)u((A C) D)

This is easily represented as a 3-level circuit diagrammed by:

[Figure: 3-level circuit diagram with inputs A, B, C, D and output f(A,B,C,D)]

The level of a circuit is the maximal number of gates an input signal has to travel through to establish the circuit output. Normally, both an input signal and its inverse are assumed to be available, so the NOT gate on B does not count as a 4th level for the circuit. Note that the behavior of the above circuit can be totally determined by evaluating its behavior for each possible input combination (we'll return to determining its values later):

[Truth table: columns A, B, C, D, f(A,B,C,D)]

Note that this table provides an exhaustive specification of the logic circuit more compactly given by the above algebraic expression for f. Its form corresponds to the "truth" tables used in symbolic logic. For small circuits, the truth table form of specifying a logic function is often used.

The inputs to a logic circuit typically represent data values encoded in a binary format as a sequence of 0's and 1's. The encoding scheme may be selected to facilitate manipulation of the data. For example, if the data is numeric, it is normally encoded to facilitate performing arithmetic operations. If the data is alphabetic

characters, it may be encoded to facilitate operations such as sorting. There are also encoding schemes to specifically facilitate effective use of the underlying hardware. A single input line is normally used to provide a single data bit of information to a logic circuit, representing the binary values of 0 or 1. At the hardware level, 0 and 1 are typically represented by voltage levels; e.g., 0 by voltage L ("low") and 1 by voltage H ("high"). For the TTL (Transistor-Transistor Logic) technology, H = +5 V and L = 0 V (H is also referenced as $V_{cc}$, the collector supply voltage, and L as GND or "ground").

Representing Data

There are three fundamental types of data that must be considered:
  logical data (the discrete truth values - True and False)
  numeric data (the integers and real numbers)
  character data (the members of a defined finite alphabet)

Logical data representation: There is no imposed standard for representing logical data in computer hardware and software systems, but a single data bit is normally used to represent a logical data item in the context of logic circuits, with "True" represented by 1 and "False" by 0. This is the representation implicitly employed in the earlier discussion of combinational logic circuits, which are typically implementations of logic functions described via the mechanisms of symbolic logic. If the roles of 0 and 1 are reversed (0 representing True and 1 representing False), then the term negative logic is used to emphasize the change in representation for logical data.

Numeric data: The two types of numeric data, integers and real numbers, are represented very differently. The representation in each case must deal with the fact that a computing environment is inherently finite.

Integers: When integers are displayed for human consumption we use a "base" representation. This requires us to establish characters which represent the base digits.
Since we have ten fingers, the natural human base is ten and the Arabic characters 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 are used to represent the base digits. Since logic circuits deal with binary inputs (0 or 1), the natural base in this context is two. Rather than invent new characters, the first two base ten characters (0 and 1)

are used to represent the base two digits. Any integer can be represented in any base, so long as we have a clear understanding of which base is being used and know what characters represent its digits. For example, $19_{10}$ indicates a base ten representation of nineteen. In base two it is represented by $10011_2$. When dealing with different bases, it is important to be able to convert from the representation in one base to that of the other. Note that it is easy to convert from base 2 to base 10, since each base 2 digit can be thought of as indicating the presence or absence of a power of 2.

$10011_2 = 1 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 16 + 2 + 1 = 19_{10}$

A conversion from base 10 to base 2 is more difficult but still straightforward. It can be handled "bottom-up" by repeated division by 2 until a quotient of 0 is reached, the remainders determining the powers of 2 that are present:
  19/2 = 9 R 1 ($2^0$ is present)
  9/2 = 4 R 1 ($2^1$ is present)
  4/2 = 2 R 0 ($2^2$ is not present)
  2/2 = 1 R 0 ($2^3$ is not present)
  1/2 = 0 R 1 ($2^4$ is present)

The conversion can also be handled "top-down" by iteratively subtracting out the highest power of 2 present until a difference of 0 is reached:
  19 - 16 = 3 (1) ($16 = 2^4$ is present so remove 16)
  no 8's (0) ($8 = 2^3$ is not present in what's left)
  no 4's (0) ($4 = 2^2$ is not present)
  3 - 2 = 1 (1) ($2 = 2^1$ is present so remove 2)
  1 - 1 = 0 (1) ($1 = 2^0$ is present in what's left)

Bases which are powers of 2 are particularly useful for representing binary data since it is easy to convert to and from among them. The most commonly used are base 8 (octal), which uses as base digits 0, 1, 2, 3, 4, 5, 6, 7, and base 16 (hexadecimal), which uses as base digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, where A, B, C, D, E, F are the base digits for ten, eleven, twelve, thirteen, fourteen, and fifteen. An n-bit binary item can easily be viewed in the context of any of base 2, base 8, or base 16 simply by appropriately grouping the bits; for example, the 28 bit binary item

[28-bit binary item shown grouped into 4-bit chunks, each chunk labeled with its hex digit; the surviving digits are C D 5 C 4]

is easily seen to be a base 16 value when the bits are grouped as indicated (using a calculator that handles base conversions, you can determine the base ten value; note that such calculators are typically limited to ten base 2 digits, but handle 8 hexadecimal digits, effectively extending the range of the calculator to 32 bits when the hexadecimal digits are viewed as 4-bit chunks).

Since it is easier to read a string of hexadecimal (hex) digits than a string of 0's and 1's, and the conversion to and from base 16 is so straightforward, digital information of many bits is frequently displayed using hex digits (or sometimes octal, particularly for older equipment).

Since digital circuits generally are viewed as processing binary data, a natural way to encode integers for use by such circuits is to use fixed blocks of n bits each; in particular, 32-bit integers are commonly used (i.e., n = 32). In general, an n-bit quantity may be viewed as naturally representing one of the $2^n$ integers in the range $[0, 2^n - 1]$ in its base 2 form.

For example, for n = 5, there are $2^5 = 32$ such numbers. The 5-bit representations of these numbers in base 2 form are
  $00000_2 = 0$
  $00001_2 = 1$
  ...
  $11111_2 = 31$

Note that as listed, the representation does not provide for negative numbers. One strategy to provide for negative numbers is to mimic the "sign-magnitude" approach normally used in the everyday base 10 representation of integers. For example, -273 explicitly exhibits as separate entries the sign and the magnitude of the number. A sign-magnitude representation strategy could use the first bit to represent the sign (0 for +, 1 for -). While perhaps satisfactory for everyday paper and pencil use, this strategy has awkward characteristics that weigh against it. First of all, the operation of subtraction is algorithmically vexing even for base 10 paper and pencil exercises.
For example, the subtraction problem 23 - 34 is typically handled not by subtracting 34 from 23, but by first subtracting 23 from 34, exactly the opposite of what the problem is asking for! Even worse, 0 is represented twice (e.g., when n = 5, 0 is represented by both 00000 and 10000). Conceptually, the subtraction problem above can be viewed as the addition problem $23_{10} + (-34_{10})$. However, adding the corresponding sign-magnitude

representations as base 2 quantities will yield an incorrect result in many cases. Since numeric data is typically manipulated computationally, the representation strategy should facilitate, rather than complicate, the circuitry designed to handle the data manipulation. For these reasons, when n bits are used, the resulting $2^n$ binary combinations are viewed as representing the integers modulo $2^n$, which inherently provides for negative integers and well-defined arithmetic (modulo $2^n$).

The last statement needs some explanation. First observe that, in considering the number line, truncation of the binary representation for any non-negative integer i to n bits results in $i \bmod 2^n$. Note that an infinite number of non-negative integers (precisely $2^n$ apart from each other) truncate to a given particular value in the range $[0, 2^n - 1]$; i.e., there are $2^n$ such groupings, corresponding to $0, 1, 2, ..., 2^n - 1$. Negative integers can be included in each grouping simply by taking integers $2^n$ apart without regard to sign. These groupings are called the "residue classes modulo $2^n$". Knowing any member of a residue class is equivalent to knowing all of them (just adjust up or down by multiples of $2^n$ to find the others, or for non-negative integers truncate the base 2 representation at n bits to find the value in the range $[0, 2^n - 1]$). In other words, the $2^n$ residue classes represented by $0, 1, 2, ..., 2^n - 1$ provide a (finite) algebraic system that inherits its algebraic properties from the (infinite) integers, which justifies the viewpoint that this is a natural way to represent integer data in the context of a finite environment. Note that negative integers are implicitly provided for algebraically, since each algebraic entity (residue class) has an inverse under addition.
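The residue-class view can be tried out directly in Python, whose `%` operator returns a non-negative result for a positive modulus. The sketch below uses hypothetical helper names and sample values chosen only for illustration.

```python
def residue(i: int, n: int) -> int:
    """Representative of i's residue class modulo 2**n, in the range 0 .. 2**n - 1."""
    return i % (2 ** n)


def truncate(i: int, n: int) -> int:
    """Keep only the low n bits of the base 2 form of a non-negative integer."""
    bits = format(i, "b")
    return int(bits[-n:], 2)


def additive_inverse(i: int, n: int) -> int:
    """The representative that sums with i to 0 modulo 2**n."""
    return residue(-i, n)


n = 5
print(residue(44, n), truncate(44, n))  # truncation agrees with mod 2**n
print(residue(-20, n), residue(12, n), residue(76, n))  # members of one class
print(residue(12 + additive_inverse(12, n), n))  # an element plus its inverse
```

Integers that differ by a multiple of $2^n$ (here -20, 12, and 76 for n = 5) all reduce to the same representative, which is what lets one n-bit pattern stand for the whole class.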
For example, with n = 5, adding the mod $2^5$ residue classes for 7 and 25 yields
  $[25_{10}] + [7_{10}] = [32_{10}] = [0_{10}]$, so $[25_{10}] = [-7_{10}]$
Returning to the computing practice point of view of identifying the residue classes with the 5-bit representations of 0, 1, 2, ..., 31 in base 2 form, the calculation becomes
  $11001_2 + 00111_2 = 100000_2$, which is $00000_2$ (truncated to 5 bits).
The evident extension of this observation is that n-bit base 2 addition conforms exactly to addition modulo $2^n$, a fact that lends itself to circuit implementation.

Again referring to the number line, consider for n = 5 the following table exhibiting in base 10 the 32 residue classes modulo $2^5$. Each residue class is matched to the 5 bit

representation corresponding to its base 10 value in the range 0, 1, 2, ..., 31:

  residue class                              5-bit representation
  {..., -32, 0, 32, ...} = [0]               00000
  {..., -31, 1, 33, ...} = [1]               00001
  {..., -30, 2, 34, ...} = [2]               00010
  ...
  {..., -17, 15, 47, ...} = [15]             01111
  {..., -16, 16, 48, ...} = [16] = [-16]     10000
  ...
  {..., -2, 30, 62, ...} = [30] = [-2]       11110
  {..., -1, 31, 63, ...} = [31] = [-1]       11111

Evidently, the 5-bit representations with a leading 0 viewed as base 2 integers best represent the integers 0, 1, ..., 15. The 5-bit representations with a leading 1 best represent -16, -15, ..., -2, -1. This representation is called the 5-bit 2's complement representation. It provides for 0, 15 positive integers, and 16 negative integers.

Since data normally originates in sign-magnitude form, an easy means is needed to convert to/from the sign-magnitude form. An examination of the table leads to the conclusion that finding the magnitude for a negative value in 5-bit 2's complement form can be accomplished by subtracting from 32 ($100000_2$) and truncating the result. In general, this follows from the mod $2^5$ residue class equivalences,
  $-[i] = [-i] + [0] = [-i] + [32] = [-i + 32] = [32 - i]$
which demonstrates that subtracting from 32 and truncating the result will always result in the representation for -i. -i is called the 2's complement of i.

One way to subtract from 32 is to subtract from $11111_2$ (which is 31) and then add 1 (all in base 2). This is equivalent to inverting each bit and then adding 1 (in base 2) to the overall result. There is nothing special in this discussion that requires 5 bits; i.e., the same rationale is equally applicable to an n-bit environment. Hence, in general, to find the 2's complement of an integer represented in n-bit 2's complement form, invert its bits and add 1 (in base 2).

Example 1: Determine the 8-bit 2's complement representation of $-37_{10}$.
First, the magnitude of $-37_{10}$ is given by $37_{10} = 100101_2$, which is 00100101 in 8-bit 2's complement form.
The representation for $-37_{10}$ is then given by the 2's complement of $37_{10}$, obtained by inverting the bits of the 8-bit representation of the magnitude and adding 1; i.e.,

  $-37_{10}$ = 11011010 + 1 = 11011011 in 8-bit 2's complement form.

Example 2: Determine the (base 10) value of the 9 bit 2's complement integers
  i = 000011011
  j = 111011010
  s = i + j
For i, since the lead bit is 0, the sign is + and the magnitude of the number is directly given by its representation as a base 2 integer; i.e., $i = 27_{10}$.
For j, since the lead bit is 1, the number is negative, so its magnitude is given by -j. Inverting j's bits and adding 1 gives
  000100101 + 1 = 000100110 = $38_{10}$ = -j (j's magnitude); i.e., $j = -38_{10}$.
i + j (which we now know is $-11_{10}$) can be computed directly using ordinary base 2 addition modulo $2^9$; i.e.,
  i:       000011011 = $27_{10}$
  j:     + 111011010 = $-38_{10}$
  i + j:   111110101 = $-11_{10}$

Example 2 illustrates that only circuitry for base 2 addition needs to be developed to perform addition and subtraction on integers represented in n-bit 2's complement form.

Historically, a variation closely related to n-bit 2's complement, namely, n-bit 1's complement, has also been used for integer representation in computing devices. The 1's complement of an n-bit block of 0's and 1's is obtained by inverting each bit. For this representation, arithmetic still requires only addition, but whenever there is a carry out of the sign position (and no overflow has occurred), 1 must be added to the result (a so-called "end-around carry", something easily achieved at the hardware level). For example, in 8-bit 1's complement
  $38_{10}$ =     00100110
  $-27_{10}$ =  + 11100100
              1 00001010
            +          1   (end-around carry of carry-out)
  $11_{10}$ =     00001011
Note that the end-around carry is only used when working in 1's complement.

Integers do not have to be represented in n-bit blocks. Another representation format is Binary Coded Decimal (BCD), where each

decimal digit of the base 10 representation of the number is separately represented using its 4-bit binary (base 2) form. The 4-bit forms are
  0 = 0000
  1 = 0001
  2 = 0010
  ...
  9 = 1001
so in BCD, 27 is represented in 8 bits by 0010 0111, and a three-digit number is likewise represented in 12 bits, one 4-bit group per decimal digit.

BCD is obviously a base 10 representation strategy. It has the advantage of being close to a character representation form (discussed below). When used in actual implementation, it is employed in sign-magnitude form (the best known of which is IBM's packed decimal form, which maintains the sign in conjunction with the last digit to accommodate the fact that the number of bits varies from number to number). Since there is no clear choice as to how to represent the sign, we will not address the sign-magnitude form further in the context of discussing BCD.

It is possible to build BCD arithmetic circuitry, but it is more complex than that used for 2's complement. The arithmetic difficulties associated with BCD can easily be seen by considering what happens when two decimal digits are added whose sum exceeds 9. For example, adding 9 and 4 using ordinary base 2 yields
    1001 = 9
  + 0100 = 4
    1101 = 13
which differs from 0001 0011, which is 13 in BCD. Achieving the correct BCD result from the base 2 result requires adding a correction (+6 = $0110_2$); e.g.,
  1101 + 0110 = 1 0011 = 13 in BCD.
In general, a correction of 6 is required whenever the sum of the two digits exceeds 9. Hence, the circuitry has to allow for the fact that

sometimes a correction factor is required and sometimes not. Since a BCD representation is normally handled using sign-magnitude, subtraction is an added problem to cope with.

Real numbers: Real numbers are normally represented in a format deriving from the idea of the decimal expansion, which is used in paper and pencil calculations to provide rational approximations to real numbers (this is termed a "floating point" representation, since the base point separating the integer part from the fractional part may shift as operations are performed on the number). There is a defined standard for representing real numbers, the IEEE 754 Floating Point Standard, whose discussion will be deferred until later due to its complexity.

An alternate representation for real numbers is to fix the number of allowed places after the base point (a so-called "fixed point" representation) and use integer arithmetic. Since the number of places is fixed, the base point does not need to be explicitly represented (i.e., it is an "implied base point"). The result of applying arithmetic operations such as multiplication and division typically requires the use of additional (hidden) positions after the base point to accurately represent the result, since a fixed point format truncates any additional positions resulting from multiplication or division. For this reason precision is quickly lost, further limiting the practicality of using this format.

Character representation: Character data is defined by a finite set, its alphabet, which provides the character domain. The characters of the alphabet are represented as binary combinations of 0's and 1's. If 7 (ordered) bits are used, then the 7 bits provide 128 different combinations of 0's and 1's. Thus 7 bits provide encodings for an alphabet of up to 128 characters. If 8 bits are employed, then the alphabet may have as many as 256 characters.
There are two defined standards in use in this country for representing character data:
  ASCII (American Standard Code for Information Interchange)
  EBCDIC (Extended Binary Coded Decimal Interchange Code).
ASCII has a 7-bit base definition, and an 8-bit extended version providing additional graphics characters. (table page 2)

In each case the standard prescribes an alphabet and its representation. Both standards have representation formats that make conversion from character form to BCD easy (for each character representing a decimal digit, the last 4 bits are its BCD representation). The representation is chosen so that when viewed in numeric ascending order, the corresponding characters follow the desired ordering for the defining alphabet, which means a numeric sort procedure can also be used for character sorting needs. Since character strings typically encompass many bits, character data is usually represented using hex digits rather than binary.
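Both claims about the two standards (digit characters carry their BCD value in the last 4 bits, and numeric order matches alphabetic order) can be checked with Python's built-in codecs. This sketch assumes the `cp037` codec as a representative EBCDIC code page; EBCDIC has several national variants, and the helper names are invented for illustration.

```python
def hex_view(text: str, encoding: str) -> str:
    """Character codes of `text` displayed as hex digits, one byte per group."""
    return text.encode(encoding).hex(" ").upper()


def low_nibble_is_bcd(encoding: str) -> bool:
    """True if the last 4 bits of every digit character equal the digit's value."""
    return all(str(d).encode(encoding)[0] % 16 == d for d in range(10))


def letters_sort_numerically(encoding: str) -> bool:
    """True if the codes for A..Z are already in ascending numeric order."""
    codes = [ch.encode(encoding)[0] for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"]
    return codes == sorted(codes)


for name in ("ascii", "cp037"):
    print(name, hex_view("A7", name), low_nibble_is_bcd(name), letters_sort_numerically(name))
```

The two standards disagree on the actual code values (and on whether digits sort before or after letters), but both keep the properties described above.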

For example, the text string "CDA 3101" is represented by
  C3 C4 C1 40 F3 F1 F0 F1 in EBCDIC
and
  43 44 41 20 33 31 30 31 in ASCII (or ASCII-8).

Since characters are the most easily understood measure for data capacity, an 8-bit quantity is termed a byte of storage and data storage capacities are given in bytes rather than bits or some other measure. $2^{10}$ = 1024 bytes is called a K-byte, $2^{20}$ = 1,048,576 bytes is called a megabyte, $2^{30}$ bytes is called a gigabyte, $2^{40}$ bytes is called a terabyte, and so forth.

Other representation schemes: BCD is an example of a weighted representation scheme that utilizes the natural weighting of the binary representation of a number; i.e.,
  $w_3 d_3 + w_2 d_2 + w_1 d_1 + w_0 d_0$
where the digits $d_i$ are just 0 or 1 and the weights are $w_3 = 8$, $w_2 = 4$, $w_1 = 2$, $w_0 = 1$. Since only 10 of the possible 16 combinations are used, $d_3$ is 0 for all but 2 cases (8 and 9). A variation uses $w_3 = 2$ to form what is known as "2421 BCD", with $d_3 = 0$ for 0, 1, 2, 3, 4 and $d_3 = 1$ for 5, 6, 7, 8, 9. A major advantage over regular BCD is that the code is "self-complementing" in the sense that flipping the bits produces the 9's complement.

Example: subtraction by using addition
A subtraction such as 654 - 470 is awkward because of the need to borrow. The computation can be done by using addition if you think in terms of
  654 + (999 - 470) - 999 = 654 + 529 - 999 = 1183 - 999 = 183 + 1 = 184
529 is called the "9's complement" of 470, so the algorithm to do a subtraction A - B is
  1. form the 9's complement (529) of the subtrahend B (470)
  2. add it to the minuend A (654)
  3. discard the carry and add 1 (corresponding to the end-around carry of 1's complement)
Note that no subtraction circuitry is needed, but the technique does need an easy way to get the 9's complement.

With 2421 BCD, 470 = 0100 1101 0000 and the 9's complement of 470 is 529 = 1011 0010 1111. Addition is still complicated, as can be seen by adding 6 + 5, which is 1100 + 1011 = 0111 carry 1 (i.e., ordinary binary addition fails).

A final BCD code, "excess-3 BCD", is also self-complementing.
It is simply ordinary BCD + 3, so for the above example, with excess-3, 470 = 0111 1010 0011 and the 9's complement of 470 is 529 = 1000 0101 1100.
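The self-complementing property and the subtract-by-adding procedure can be sketched in Python. The helper names and the sample operands (802 and 356) are chosen here for illustration, and the subtraction routine assumes A is strictly greater than B so that a carry is always produced.

```python
# 2421 BCD code words for the decimal digits 0..9 (weights 2, 4, 2, 1).
BCD_2421 = {0: "0000", 1: "0001", 2: "0010", 3: "0011", 4: "0100",
            5: "1011", 6: "1100", 7: "1101", 8: "1110", 9: "1111"}
DECODE_2421 = {bits: digit for digit, bits in BCD_2421.items()}


def excess3(digit: int) -> str:
    """Excess-3 code word: ordinary 4-bit BCD plus 3."""
    return format(digit + 3, "04b")


def flip(bits: str) -> str:
    """Invert every bit of a code word."""
    return "".join("1" if b == "0" else "0" for b in bits)


def nines_complement(value: int, digits: int) -> int:
    """9's complement obtained purely by flipping the bits of each 2421 digit."""
    text = str(value).zfill(digits)
    return int("".join(str(DECODE_2421[flip(BCD_2421[int(ch)])]) for ch in text))


def subtract_by_adding(a: int, b: int, digits: int) -> int:
    """Compute a - b (assuming a > b) with addition and an end-around carry."""
    total = a + nines_complement(b, digits)
    carry, rest = divmod(total, 10 ** digits)  # separate the carry-out digit
    return rest + carry                        # discard the carry and add it back in


print(nines_complement(356, 3))
print(subtract_by_adding(802, 356, 3))
```

The same `flip` works for excess-3: `flip(excess3(d))` equals `excess3(9 - d)` for every digit d, which is what "self-complementing" means.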

The lesson to learn is that codes must be formulated to represent data in a computer, and different representations are employed for different purposes; e.g.,
  2's complement is a number representation that facilitates arithmetic in base 2
  BCD is another number representation that facilitates translation of numbers to decimal character form but complicates arithmetic
  ASCII represents characters in a manner that facilitates uppercase/lowercase adjustment and ease of conversion of decimal characters
  Other schemes such as "2421 BCD" and "excess-3 BCD" seek to improve decimal arithmetic by facilitating use of 9's complement to avoid subtraction

Sometimes representation schemes are designed to facilitate other tasks, such as representing graphical data elements or for tracking. For example, Gray Code is commonly used for identifying sectors on a rotating disk. Gray code is defined recursively by using the rule:
  to form the n+1 bit representation from the n-bit representation
    preface the n-bit representation by 0
    append to this the n-bit representation in reverse order prefaced by 1
Hence, the 1, 2, and 3-bit representations are
  1-bit: 0, 1
  2-bit: 00, 01, 11, 10
  3-bit: 000, 001, 011, 010, 110, 111, 101, 100

Consider three concentric disks shaded as follows:

[Figure: three concentric bands divided into shaded and unshaded segments]
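The recursive Gray code rule translates almost word for word into Python. This is a minimal sketch; the function names are illustrative, and code words are kept as strings so that the prefixing step is visible.

```python
def gray_code(n: int) -> list:
    """n-bit Gray code built by the reflect-and-prefix rule."""
    if n == 1:
        return ["0", "1"]
    previous = gray_code(n - 1)
    # Preface the (n-1)-bit list by 0, then append the reversed list prefaced by 1.
    return ["0" + word for word in previous] + ["1" + word for word in reversed(previous)]


def bits_changed(a: str, b: str) -> int:
    """Number of positions in which two equal-length code words differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


codes = gray_code(3)
print(codes)
# Successive code words, including the wrap-around from last to first,
# differ in exactly one position.
print([bits_changed(codes[i], codes[(i + 1) % len(codes)]) for i in range(len(codes))])
```

The wrap-around case matters for a rotating disk, where the last sector is adjacent to the first.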

The shading provides a gray code identification for 8 distinct wedge-shaped sections on the disk. As the disk rotates from one section to the next, no more than one digit position (represented by shaded and unshaded segments) changes, simplifying the task of determining the id of the next section when going from one section to the next. Note that this is a characteristic of the gray code. In contrast, note that in regular binary for the transition from 3 to 4, 011 to 100, all 3 digits change, which means hardware tracking the change if this representation were used would potentially face arbitrary intermediate patterns in the transition from section 3 to section 4, complicating the process of determining that 4 is the id of the next section (e.g., something such as a delay would have to be added to the control circuitry to allow the transition to stabilize). For a disk such as above, a row of 3 reflectance sensors, one for each concentric band, can be used to track the transitions.

Boolean algebra:

Boolean algebra is the algebra of circuits, the algebra of sets, and the algebra of truth table logic. A Boolean algebra has two fundamental elements, a "zero" and a "one," whose properties are described below. For circuits "zero" is designated by 0 or L (for low voltage) and "one" by 1 or H (for high voltage). For sets, "zero" is the empty set and "one" is the set universe. For truth table logic, "zero" is designated by F (for false) and "one" by T (for true).

Just as the algebraic properties of numbers are described in terms of fundamental operations (addition and multiplication), the algebraic properties of a Boolean algebra are described in terms of basic Boolean operations. For circuits, the basic Boolean operations are ones we've already discussed: AND ($\cdot$), OR ($+$), and complement (overbar). For sets the corresponding operations are intersection ($\cap$), union ($\cup$), and set complement. For truth table logic they are AND ($\wedge$), OR ($\vee$), and NOT (~).
Recall that AND and OR are binary operations (an operation requiring two arguments), while complement is a unary operation (an operation requiring one argument).
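The three interpretations can be put side by side in Python, each with its own "zero", "one", and three operations. This is only an illustrative sketch: the `BooleanAlgebra` tuple, the three-element universe, and the function `complement_check` are all assumptions made for the example.

```python
from collections import namedtuple

BooleanAlgebra = namedtuple("BooleanAlgebra", "zero one and_ or_ not_")

UNIVERSE = frozenset({"a", "b", "c"})  # hypothetical set universe

circuits = BooleanAlgebra(0, 1,
                          lambda x, y: x & y,
                          lambda x, y: x | y,
                          lambda x: 1 - x)
sets = BooleanAlgebra(frozenset(), UNIVERSE,
                      lambda x, y: x & y,        # intersection
                      lambda x, y: x | y,        # union
                      lambda x: UNIVERSE - x)    # set complement
logic = BooleanAlgebra(False, True,
                       lambda x, y: x and y,
                       lambda x, y: x or y,
                       lambda x: not x)


def complement_check(algebra, element) -> bool:
    """An element OR its complement gives "one"; an element AND its complement gives "zero"."""
    c = algebra.not_(element)
    return (algebra.or_(element, c) == algebra.one
            and algebra.and_(element, c) == algebra.zero)


print(complement_check(circuits, 1),
      complement_check(sets, frozenset({"a"})),
      complement_check(logic, False))
```

The same check passes in all three settings, which is the sense in which one algebra describes circuits, sets, and truth table logic.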

For circuits, also recall that
  the multiplication symbol $\cdot$ is used for AND
  the addition symbol $+$ is used for OR
  the symbol for complement is an overbar; i.e., $\bar{X}$ designates the complement of X.
The utilization of $\cdot$ for AND and $+$ for OR is due to the fact that these Boolean operations have algebraic properties similar to (but definitely not the same as) those of multiplication and addition for ordinary numbers. Basic properties for Boolean algebras (using the circuit operation symbols, rather than those for sets or for symbolic logic) are as follows:

1. Commutative property: $+$ and $\cdot$ are commutative operations; e.g.,
   $X + Y = Y + X$ and $X \cdot Y = Y \cdot X$
   In contrast to operations such as subtraction and division, a commutative operation has a left-right symmetry, permitting us to ignore the order of the operation's operands.

2. Associative property: $+$ and $\cdot$ are associative operations; e.g.,
   $X + (Y + Z) = (X + Y) + Z$ and $X \cdot (Y \cdot Z) = (X \cdot Y) \cdot Z$
   Non-associative operations (such as subtraction and division) tend to cause difficulty precisely because they are non-associative. The property of associativity permits selective omission of parentheses, since the order in which the operation is applied has no effect on the outcome; i.e., we can just as easily write $X + Y + Z$ as $X + (Y + Z)$ or $(X + Y) + Z$ since the result is the same whether we first evaluate $X + Y$ or $Y + Z$.

3. Distributive property: $\cdot$ distributes over $+$ and $+$ distributes over $\cdot$; e.g.,
   $X \cdot (Y + Z) = (X \cdot Y) + (X \cdot Z)$ and also $X + (Y \cdot Z) = (X + Y) \cdot (X + Z)$
   With the distributive property we see a strong departure from the algebra of ordinary numbers, which definitely does not have the property of $+$ distributing over $\cdot$. The distributive property illustrates a strong element of symmetry that occurs in Boolean algebras, a characteristic known as duality.

4. Zero and one: there is an element zero (0) and an element one (1) such that for every X,
   $X + 1 = 1$ and $X \cdot 0 = 0$

5. Identity: 0 is an identity for $+$ and 1 is an identity for $\cdot$; e.g., $X + 0 = X$ and $X \cdot 1 = X$ for every $X$.

6. Complement property: every element $X$ has a complement $\overline{X}$ such that $X + \overline{X} = 1$ and $X \cdot \overline{X} = 0$. The complement of 0 is 1 and vice-versa; it can be shown that in general complements are unique; i.e., each element has exactly one complement.

7. Involution property (rule of double complements): for each $X$, $\overline{\overline{X}} = X$.

8. Idempotent property: for every element $X$, $X + X = X$ and $X \cdot X = X$.

9. Absorption property: for every $X$ and $Y$, $X + (X \cdot Y) = X$ and $X + (\overline{X} \cdot Y) = X + Y$. In the first form, the term $X \cdot Y$ is absorbed in its entirety under OR with $X$; in the second form, only the factor $\overline{X}$ is absorbed under OR with $X$, leaving $Y$.

10. DeMorgan property: for every $X$ and $Y$, $\overline{X \cdot Y} = \overline{X} + \overline{Y}$ and $\overline{X + Y} = \overline{X} \cdot \overline{Y}$. The DeMorgan property describes the relationship between AND and OR, which, with the rule of double complements, allows expressions to be converted from use of ANDs to use of ORs and vice-versa; e.g.,

$$X + Y = \overline{\overline{X + Y}} = \overline{\overline{X} \cdot \overline{Y}} \qquad\qquad X \cdot Y = \overline{\overline{X \cdot Y}} = \overline{\overline{X} + \overline{Y}}$$

Some of these properties can be proven from others (i.e., they do not constitute a minimal defining set of properties for Boolean algebras); for example, the idempotent rule $X + X = X$ can be obtained by the manipulation $X + X = X + (X \cdot 1) = X$ by the absorption property.

The DeMorgan property provides rules for using NANDs and NORs (where NAND stands for "NOT AND" and NOR stands for "NOT OR"). The operation NAND (sometimes called the Sheffer stroke) is denoted by

$X \uparrow Y = \overline{X \cdot Y}$ and the operation NOR (sometimes called the Peirce arrow) is denoted by $X \downarrow Y = \overline{X + Y}$.

Utilizing the rule of double complements and the DeMorgan property, any expression can be written in terms of the complement operation and $\cdot$, or the complement operation and $+$. Moreover, since the complement can be written in terms of either $\uparrow$ or $\downarrow$; i.e., $\overline{X} = X \uparrow X = X \downarrow X$, any Boolean expression can be written solely in terms of $\uparrow$ or solely in terms of $\downarrow$. This observation is particularly significant for a circuit whose function is represented by a Boolean expression, since this property of Boolean algebra implies that the circuit construction can be accomplished using as basic circuit elements only NAND circuits or only NOR circuits.

Note that properties such as commutative and associative are also characteristic of the algebra of numbers, but others, such as the idempotent and DeMorgan properties, are not; i.e., Boolean algebra, the algebra of circuits, has behaviors quite different from what we are used to with numbers. Just as successfully working with numbers requires gaining an understanding of their algebraic properties, working with circuits requires gaining an understanding of Boolean algebra. Just as we often omit writing the times symbol in formulas for numbers, we may omit the AND symbol in Boolean formulas.

Examples:

1. There is no cancellation; i.e., $XY = XZ$ does not imply that $Y = Z$ (if it did, the idempotent property $XX = X = X \cdot 1$ would imply that $X = 1$!).

2. Complements are unique. To see this, just assume that $Y$ is also a complement for $X$; i.e., $X + Y = 1$ and $XY = 0$. AND the 1st equation through with $\overline{X}$ to get $X\overline{X} + Y\overline{X} = \overline{X}$. Since $X\overline{X} = 0$, this reduces to $Y\overline{X} = \overline{X}$. Similarly, since $X + \overline{X} = 1$ and $XY = 0$, $XY + \overline{X}Y = Y$ reduces to $\overline{X}Y = Y$. Putting the last two lines together we have $\overline{X} = Y$.

3. The list of properties is not minimal. For example, given that the properties other than the idempotent property are true, it can be shown that the idempotent property is also true as follows: $X + \overline{X} = 1$, so using the distributive property, $XX + X\overline{X} = X$, which in turn leads to $XX = X$ since $X\overline{X} = 0$. A similar argument can be used to show that $X + X = X$.

Given that the properties other than the absorption property are true, it can be shown that the absorption property is also true as follows. Since $1 + Y = 1$, $X + XY = X(1 + Y) = X$, the 1st absorption criterion. Starting from $X + \overline{X} = 1$ we get $XY + \overline{X}Y = Y$. Adding $X$ to both sides we get $X + XY + \overline{X}Y = X + Y$. By the first absorption criterion this reduces to $X + \overline{X}Y = X + Y$, which is the 2nd absorption criterion.

The DeMorgan property has great impact on circuit equations, since it provides the formula for converting from OR to NAND and from AND to NOR.

The above proofs are by logical deduction. For a 2-element Boolean algebra, proof can be done exhaustively by examining all cases; e.g., we can verify DeMorgan by means of a "truth table":

$$\begin{array}{cc|cc|c|c|c}
X & Y & \overline{X} & \overline{Y} & \overline{X} \cdot \overline{Y} & X + Y & \overline{X + Y} \\ \hline
0 & 0 & 1 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 0
\end{array}$$

This is called a "brute force" method for verifying the equation $\overline{X + Y} = \overline{X} \cdot \overline{Y}$ because it exhaustively checks every case using the definition of the AND, OR and NOT operations.

Since AND and OR are associative, we can write $X \cdot Y \cdot Z$ and $X + Y + Z$ unparenthesized. It can be shown that

$$\overline{X \cdot Y \cdot Z} = \overline{X} + \overline{Y} + \overline{Z} \qquad\text{and}\qquad \overline{X + Y + Z} = \overline{X} \cdot \overline{Y} \cdot \overline{Z}$$

This leads to the "generalized DeMorgan property":

$$\overline{X_1 \cdot X_2 \cdots X_n} = \overline{X_1} + \overline{X_2} + \cdots + \overline{X_n} \qquad\qquad \overline{X_1 + X_2 + \cdots + X_n} = \overline{X_1} \cdot \overline{X_2} \cdots \overline{X_n}$$

which is often useful for circuits of more than 2 variables. There are multi-input NAND gates to take advantage of this property.

WARNING: NAND and NOR are not associative.

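Consider the truth table:

$$\begin{array}{ccc|cc|c|c|c}
X & Y & Z & X \uparrow Y & Y \uparrow Z & X \uparrow (Y \uparrow Z) & (X \uparrow Y) \uparrow Z & \overline{X \cdot Y \cdot Z} \\ \hline
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 1 & 1 & 0
\end{array}$$

It is evident that $X \uparrow (Y \uparrow Z) \ne (X \uparrow Y) \uparrow Z \ne \overline{X \cdot Y \cdot Z}$. Similarly, $X \downarrow (Y \downarrow Z) \ne (X \downarrow Y) \downarrow Z \ne \overline{X + Y + Z}$. This means that care must be taken in grouping the NAND ($\uparrow$) and NOR ($\downarrow$) operators in algebraic expressions!

The other two common binary operations, XOR ($\oplus$) and COINC ($\odot$), are both associative:

$$\begin{array}{ccc|cccc|cccc}
X & Y & Z & X \oplus Y & Y \oplus Z & (X \oplus Y) \oplus Z & X \oplus (Y \oplus Z) & X \odot Y & Y \odot Z & (X \odot Y) \odot Z & X \odot (Y \odot Z) \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1
\end{array}$$

Generalized (multi-input) operations serve to reduce the number of levels in a circuit; e.g., a 3-input AND is a 1-level circuit for $XYZ$ equivalent to the 2-level circuit $(XY)Z$.

[Figure: 2-level circuit for (XY)Z built from two 2-input AND gates, beside the 1-level circuit for XYZ built from one 3-input AND gate]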
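The brute-force checks above can also be run mechanically. The following is a minimal Python sketch, not part of the original notes; the function names (`nand`, `nor`, `coinc`, `is_associative`, `generalized_demorgan_holds`) are illustrative choices. It tests associativity of each operation over all 8 input combinations and checks the generalized DeMorgan property for a chosen number of inputs.

```python
from itertools import product

def nand(x, y): return 1 - (x & y)
def nor(x, y): return 1 - (x | y)
def xor(x, y): return x ^ y
def coinc(x, y): return 1 - (x ^ y)   # coincidence (XNOR)

def is_associative(op):
    """True if op(x, op(y, z)) == op(op(x, y), z) for all 8 input combinations."""
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x, y, z in product((0, 1), repeat=3))

def generalized_demorgan_holds(n):
    """Check both generalized DeMorgan equations for every n-bit input.

    AND of several bits is their minimum; OR is their maximum.
    """
    for bits in product((0, 1), repeat=n):
        complements = [1 - b for b in bits]
        if 1 - min(bits) != max(complements):   # NOT(AND) == OR of complements
            return False
        if 1 - max(bits) != min(complements):   # NOT(OR) == AND of complements
            return False
    return True

for name, op in [("NAND", nand), ("NOR", nor), ("XOR", xor), ("COINC", coinc)]:
    print(name, "associative:", is_associative(op))
print("generalized DeMorgan holds for n = 4:", generalized_demorgan_holds(4))
```

Exhaustive checking is practical here only because the number of cases is $2^n$ for small $n$; it is a verification aid, not a substitute for the algebraic proofs.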

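Canonical forms: Any combinational circuit, regardless of the gates used, can be expressed in terms of combinations of AND, OR, and NOT. The most general form of this expression is called a canonical form. There are two types: the canonical sum of products and the canonical product of sums. Formulating these turns out to be quite easy if the truth table for the circuit is constructed. For example, consider a circuit $f(X,Y,Z)$ with specification:

$$\begin{array}{ccc|c|l}
X & Y & Z & f(X,Y,Z) & \text{minterm} \\ \hline
0 & 0 & 0 & 1 & \overline{X}\,\overline{Y}\,\overline{Z} \\
0 & 0 & 1 & 0 & \\
0 & 1 & 0 & 0 & \\
0 & 1 & 1 & 0 & \\
1 & 0 & 0 & 0 & \\
1 & 0 & 1 & 1 & X\,\overline{Y}\,Z \\
1 & 1 & 0 & 1 & X\,Y\,\overline{Z} \\
1 & 1 & 1 & 0 &
\end{array}$$

Note that

$$f(X,Y,Z) = \overline{X}\,\overline{Y}\,\overline{Z} + X\,\overline{Y}\,Z + X\,Y\,\overline{Z}$$

Each of these terms is obtained just by looking at the combinations for which $f(X,Y,Z)$ is 1. Each of these is called a minterm. There are 8 possible minterms for 3 variables (see below).

Analogously, for the combinations for which $f(X,Y,Z)$ is 0 we get

$$f(X,Y,Z) = (X+Y+\overline{Z})(X+\overline{Y}+Z)(X+\overline{Y}+\overline{Z})(\overline{X}+Y+Z)(\overline{X}+\overline{Y}+\overline{Z})$$

Each of these terms is obtained just by looking at the combinations for which $f(X,Y,Z)$ is 0. Each of these is called a maxterm. There are 8 possible maxterms for 3 variables (see below).

The minterms and maxterms are numbered from 0, corresponding to the binary combination they represent.

$$\begin{array}{c|ccc|l|l}
\text{No.} & X & Y & Z & \text{minterm} & \text{maxterm} \\ \hline
0 & 0 & 0 & 0 & \overline{X}\,\overline{Y}\,\overline{Z} & X+Y+Z \\
1 & 0 & 0 & 1 & \overline{X}\,\overline{Y}\,Z & X+Y+\overline{Z} \\
2 & 0 & 1 & 0 & \overline{X}\,Y\,\overline{Z} & X+\overline{Y}+Z \\
3 & 0 & 1 & 1 & \overline{X}\,Y\,Z & X+\overline{Y}+\overline{Z} \\
4 & 1 & 0 & 0 & X\,\overline{Y}\,\overline{Z} & \overline{X}+Y+Z \\
5 & 1 & 0 & 1 & X\,\overline{Y}\,Z & \overline{X}+Y+\overline{Z} \\
6 & 1 & 1 & 0 & X\,Y\,\overline{Z} & \overline{X}+\overline{Y}+Z \\
7 & 1 & 1 & 1 & X\,Y\,Z & \overline{X}+\overline{Y}+\overline{Z}
\end{array}$$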

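Note that the maxterms are just the complements of their corresponding minterms. Representing a function by using its minterms is called the canonical sum of products and by using its maxterms the canonical product of sums; i.e.,

$$f(X,Y,Z) = \overline{X}\,\overline{Y}\,\overline{Z} + X\,\overline{Y}\,Z + X\,Y\,\overline{Z}$$

is the canonical sum of products and

$$f(X,Y,Z) = (X+Y+\overline{Z})(X+\overline{Y}+Z)(X+\overline{Y}+\overline{Z})(\overline{X}+Y+Z)(\overline{X}+\overline{Y}+\overline{Z})$$

is the canonical product of sums for the function $f(X,Y,Z)$.

The short-hand notation ($\Sigma$-notation) $f(X,Y,Z) = \Sigma(0,5,6)$ is used for the canonical sum of products. Similarly, the short-hand notation ($\Pi$-notation) $f(X,Y,Z) = \Pi(1,2,3,4,7)$ is used for the canonical product of sums.

Canonical representations are considered to be 2-level representations, since for most circuits a signal and its opposite are both available as inputs.

A combinational circuit's behavior is specified by one of
1. a truth table listing the outputs for every possible combination of input values
2. a canonical representation of the outputs using $\Sigma$ or $\Pi$ notation
3. a circuit diagram using logic gates

Converting to NANDs or NORs: For a Boolean algebra, notice that the complement $\overline{X}$ is given by $X \uparrow X$. Since $XY$ is given by the complement of $X \uparrow Y$, we have

$$XY = (X \uparrow Y) \uparrow (X \uparrow Y)$$

By DeMorgan,

$$X + Y = \overline{\overline{X + Y}} = \overline{\overline{X} \cdot \overline{Y}} = \overline{X} \uparrow \overline{Y} = (X \uparrow X) \uparrow (Y \uparrow Y)$$

Hence, we can describe any equation that uses AND, OR, and complement solely in terms of NANDs using the above conversions. Similarly, for NOR we have the conversions

$$\overline{X} = X \downarrow X \qquad X + Y = (X \downarrow Y) \downarrow (X \downarrow Y) \qquad XY = \overline{\overline{XY}} = \overline{\overline{X} + \overline{Y}} = \overline{X} \downarrow \overline{Y} = (X \downarrow X) \downarrow (Y \downarrow Y) \text{ (by DeMorgan)}$$

By DeMorgan, a NAND gate is equivalent to an OR gate with complemented inputs ($\overline{X \cdot Y} = \overline{X} + \overline{Y}$) and a NOR gate is equivalent to an AND gate with complemented inputs ($\overline{X + Y} = \overline{X} \cdot \overline{Y}$).

[Figure: NAND gate beside its equivalent OR gate with inverted inputs; NOR gate beside its equivalent AND gate with inverted inputs]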
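The $\Sigma$ and $\Pi$ notations can be generated directly from a truth table. The sketch below is illustrative only: the helper names (`canonical_lists`, `minterm_text`, `maxterm_text`) are assumptions of this example, and a trailing apostrophe stands in for the overbar because plain text cannot draw one. It uses the function with minterms 0, 5, 6 as its test case.

```python
from itertools import product

def canonical_lists(f, n):
    """Return (minterm numbers, maxterm numbers) for an n-input function f.

    product() enumerates inputs in binary counting order, so the position
    of each combination is exactly its minterm/maxterm number.
    """
    minterms, maxterms = [], []
    for number, bits in enumerate(product((0, 1), repeat=n)):
        (minterms if f(*bits) else maxterms).append(number)
    return minterms, maxterms

def minterm_text(number, names):
    """Product term that is 1 only for the given combination."""
    bits = format(number, f"0{len(names)}b")
    return "".join(v if b == "1" else v + "'" for v, b in zip(names, bits))

def maxterm_text(number, names):
    """Sum term that is 0 only for the given combination."""
    bits = format(number, f"0{len(names)}b")
    return "(" + "+".join(v + "'" if b == "1" else v
                          for v, b in zip(names, bits)) + ")"

def f(x, y, z):
    # The example function: 1 for combinations 000, 101, 110.
    return int((x, y, z) in {(0, 0, 0), (1, 0, 1), (1, 1, 0)})

sigma, pi = canonical_lists(f, 3)
print("Sigma", sigma, "Pi", pi)
print(" + ".join(minterm_text(m, "XYZ") for m in sigma))
print("".join(maxterm_text(m, "XYZ") for m in pi))
```

Every combination lands in exactly one of the two lists, which is why the $\Sigma$ and $\Pi$ lists of a function always partition the numbers $0$ to $2^n - 1$.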

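Using these equivalences, an OR-AND (product of sums) combination can be converted to NOR-NOR.

[Figure: OR-AND circuit and its NOR-NOR equivalent]

Other equivalences to OR-AND that follow from this one are NAND-AND and AND-NOR (both of which operate on complemented inputs).

[Figure: NAND-AND and AND-NOR equivalents of the OR-AND circuit]

For the sum of products (AND-OR) we have the counterpart equivalences NAND-NAND, NOR-OR, and OR-NAND (the last two again operating on complemented inputs).

[Figure: AND-OR circuit and its NAND-NAND, NOR-OR, and OR-NAND equivalents]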
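Since the diagrams are not reproduced here, the two-level equivalences can at least be confirmed by exhaustive evaluation. The following Python sketch is an illustration under stated assumptions: it uses the generic two-term forms $(a+b)(c+d)$ and $ab + cd$, and it assumes that the NAND-AND and AND-NOR forms are fed complemented inputs.

```python
from itertools import product

def NOT(x): return 1 - x
def AND(*xs): return min(xs)
def OR(*xs): return max(xs)
def NAND(*xs): return NOT(AND(*xs))
def NOR(*xs): return NOT(OR(*xs))

def equivalent(f, g, n=4):
    """True if f and g agree on every n-bit input combination."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n))

# Product of sums (a + b)(c + d) and three equivalent two-level forms.
def or_and(a, b, c, d):   return AND(OR(a, b), OR(c, d))
def nor_nor(a, b, c, d):  return NOR(NOR(a, b), NOR(c, d))
def nand_and(a, b, c, d): return AND(NAND(NOT(a), NOT(b)), NAND(NOT(c), NOT(d)))
def and_nor(a, b, c, d):  return NOR(AND(NOT(a), NOT(b)), AND(NOT(c), NOT(d)))

# Sum of products ab + cd and its NAND-NAND form.
def and_or(a, b, c, d):    return OR(AND(a, b), AND(c, d))
def nand_nand(a, b, c, d): return NAND(NAND(a, b), NAND(c, d))

print(equivalent(or_and, nor_nor), equivalent(or_and, nand_and),
      equivalent(or_and, and_nor), equivalent(and_or, nand_nand))
```

Each equivalence is one application of the DeMorgan property at the output gate, at the input gates, or at both.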

At this point, if given a truth table, or a representation using Σ or Π notation, we can generate a 2-level circuit diagram as the canonical sum of products or product of sums. Similarly, given a circuit diagram, we can produce its truth table. This process is called circuit analysis. For example, recall that the circuit equation, f(a,b,c,d) = ((AB) C)u((A C) D) was earlier represented as a 3-level circuit diagram.

[Figure: 3-level circuit diagram with inputs A, B, C, D and output f(a,b,c,d)]

From the circuit equation we can obtain the truth table as follows, conforming to the value given earlier.

[Table: truth table with columns A, B, C, D, the intermediate gate outputs, and f(a,b,c,d); entries not recoverable]

From the truth table f(a,b,c,d) = Σ(,5,,5) = Π(,2,3,4,6,7,8,9,,2,3,4)

Note that the canonical representations are not as compact as the original circuit equation.

Circuit simplification: A circuit represented in a canonical form (usually by Σ or Π notation) can usually be simplified. There are 3 techniques commonly employed:
1. algebraic reduction
2. Karnaugh maps (K-maps)
3. Quine-McCluskey method

Algebraic reduction is limited by the extent to which one is able to observe potential combinations in examining the equation; e.g.,

$$\begin{aligned}
\overline{A}B\overline{C}D + \overline{A}BCD + ABCD &= \overline{A}B\overline{C}D + \overline{A}BCD + \overline{A}BCD + ABCD && \text{(idempotent)} \\
&= \overline{A}BD(\overline{C} + C) + (\overline{A} + A)BCD && \text{(distributive)} \\
&= \overline{A}BD \cdot 1 + 1 \cdot BCD && \text{(complement)} \\
&= \overline{A}BD + BCD && \text{(identity)}
\end{aligned}$$

This is a minimal 2-level representation for the circuit. The further algebraic reduction to $(\overline{A} + C)BD$ produces a 2-level circuit dependent only on 2-input gates.

The Quine-McCluskey method is an extraction from the K-map approach abstracted for computer implementation. It is not dependent on visual graphs and is effective no matter the number of inputs. Since it does not lend itself to hand implementation for more than a few variables, it will only be discussed later and in sketchy detail.

For circuits with no more than 4 or 5 input variables, K-maps provide a visual reduction technique for effectively reducing a combinational circuit to a minimal form. The idea for K-maps is to arrange minterms whose value is 1 (or maxterms whose value is 0) on a grid so as to locate patterns which will combine.

For a 1-variable map, with input variable $X$, the minterm locations are as follows:

        X = 0    X = 1
        X'       X          (X' denotes the complement of X)

While a 1-variable map is not useful, it is worth including to round out the discussion of maps using more variables.

A 2-variable map, with input variables $X$, $Y$, has minterm locations

               Y = 0    Y = 1
        X = 0  X'Y'     X'Y
        X = 1  XY'      XY

In general we only label the cells according to the binary number they correspond to in the truth table (the number used by the $\Sigma$ or $\Pi$ notations). The map structure is then:

               Y = 0    Y = 1
        X = 0    0        1
        X = 1    2        3

For example, if we have $f(X,Y) = \Sigma(1,3)$, we mark the minterms for 1 and 3 in the 2-variable map as follows:

               Y = 0    Y = 1
        X = 0    0        1
        X = 1    0        1

Now we can graphically see that a reduction is possible by delineating the adjacent pair of minterms (corresponding to $\overline{X}Y + XY$), which in fact reduces to $Y$. Notice that there are visual clues: the 1 over the column corresponds to $Y$, and looking down vertically, the 0 and 1 "cancel". 2-variable K-maps also are not particularly useful, but again are illustrative.

With 3 variables, the pattern is

                 YZ
                 00   01   11   10
        X = 0     0    1    3    2
        X = 1     4    5    7    6

The key thing to note is that the order across the top follows the Gray code pattern, so that there is exactly one 0-1 matchup between adjacent columns, including a match between the 1st and 4th columns.

For the function $f(X,Y,Z) = \Sigma(1,3,4,6)$, the K-map is

                 YZ
                 00   01   11   10
        X = 0     0    1    1    0
        X = 1     1    0    0    1

$$f(X,Y,Z) = \overline{X}Z + X\overline{Z}$$

The 1st term of the reduced form for $f(X,Y,Z)$ is in the $\overline{X}$ row (flagged by 0) and the 2nd is in the $X$ row (flagged by 1). In each case the $Y$ term cancels since it is the one with 0 matched to 1. Pay particular attention to the box that wraps around (the pair in columns 00 and 10 of the $X$ row).

For a more complex example, consider $f(X,Y,Z) = \Sigma(1,3,4,5)$:

                 YZ
                 00   01   11   10
        X = 0     0    1    1    0
        X = 1     1    1    0    0

Here $f(X,Y,Z)$ can be reduced to either of the following:

$$f(X,Y,Z) = \overline{X}Z + X\overline{Y} \qquad\qquad f(X,Y,Z) = \overline{X}Z + X\overline{Y} + \overline{Y}Z$$

Note that the term $\overline{Y}Z$ is "redundant" since its 1's are covered by the other two terms. The first expression is called a minimal sum of products expression for $f(X,Y,Z)$ since it cannot be reduced further. For combinational circuits, the redundant term can be omitted, but sometimes in the context of sequential circuits, where intermediate values matter, it must be left in.

With 4 variables, the K-map pattern is

                 CD
                 00   01   11   10
        AB = 00   0    1    3    2
        AB = 01   4    5    7    6
        AB = 11  12   13   15   14
        AB = 10   8    9   11   10

Now the Gray code pattern of the columns must also be present for the rows. More complex situations can also arise; for example,

                 CD
                 00   01   11   10
        AB = 00   1    0    0    1
        AB = 01   0    0    1    1
        AB = 11   0    1    1    0
        AB = 10   1    1    0    0

describes $f(A,B,C,D) = \Sigma(0,2,6,7,8,9,13,15)$. There are two patterns present that produce a minimal number of terms.

[Figure: the same K-map boxed two ways, once pairing the 1's within rows and once pairing them within columns]

Hence, either of the following produces a minimal sum of products expression:

$$f(A,B,C,D) = \overline{A}\,\overline{B}\,\overline{D} + \overline{A}BC + ABD + A\overline{B}\,\overline{C} \quad \text{(from the rows)}$$

$$f(A,B,C,D) = \overline{B}\,\overline{C}\,\overline{D} + \overline{A}C\overline{D} + BCD + A\overline{C}D \quad \text{(from the columns)}$$

In either case we know we have the function since all 1's are covered.

When working with maxterms, the 0's of the function are what is considered. For the function above, $f(A,B,C,D) = \Pi(1,3,4,5,10,11,12,14)$ and the K-map is the same map read for its 0's, leading to the following two minimal product of sums expressions:

$$f(A,B,C,D) = (A+B+\overline{D})(A+\overline{B}+C)(\overline{A}+\overline{B}+D)(\overline{A}+B+\overline{C}) \quad \text{(from the rows)}$$

$$f(A,B,C,D) = (\overline{B}+C+D)(A+C+\overline{D})(B+\overline{C}+\overline{D})(\overline{A}+\overline{C}+D) \quad \text{(from the columns)}$$

Be sure to observe that when working with maxterms, "barred" items correspond to 1's and unbarred items correspond to 0's, exactly the opposite of what is done when working with minterms.

Just as a 4-variable K-map is formed by combining two 3-variable maps, a 5-variable K-map can be formed by combining two 4-variable maps

(conceptually, one on top of the other, representing 0 and 1 for the 5th variable).

In general, blocks of size $2^n$ are the ones that can be reduced. Here are blocks of size 4 on a 4-variable K-map:

[Figure: four 4-variable K-maps (rows AB, columns CD), each with one block of four 1's, labeled f(A,B,C,D) = AB, f(A,B,C,D) = AD, $f(A,B,C,D) = \overline{B}\,\overline{D}$, and f(A,B,C,D) = BD]

In each case, the horizontal term with 0 against 1 is omitted and the vertical term with 0 against 1 is omitted. Be sure to pay particular attention to the pattern with a 1 in each corner, where $A$ is omitted vertically and $C$ is omitted horizontally.

Note that each block of 4 contains 4 blocks of 2, but these are not diagrammed since they are absorbed (in contrast, the Quine-McCluskey method, which we won't look at until later, does keep tabs on all such blocks!).

In general, an implicant (implicate for 0's) is a term that is a product of inputs (including complements) for which the function evaluates to 1 whenever the term evaluates to 1. These are represented by blocks of size $2^n$ on K-maps.

A prime implicant (implicate for 0's) is one not contained in any larger block of 1's.

An essential prime implicant is a prime implicant containing a 1 not covered by any other prime implicant.

A distinguished cell is a 1-cell covered by exactly 1 prime implicant.

A don't care cell is one that may be either 0 or 1 for a particular circuit. The value used in K-map analysis is the one which increases the amount of reduction. Don't care conditions occur because in circuits there are often combinations of inputs that cannot occur, so we don't care whether their values are 0 or 1.

General Procedure for Circuit Reduction Using K-maps

1. Map the circuit's function into a K-map, marking don't cares by using dashes.
2. Treating don't cares as if they were 1's (0's for implicates), box in all prime implicants (implicates), omitting any consisting solely of dashes.
3. Mark any distinguished cells with * (dashes don't count).
4. Include all essential prime implicants in the sum, change their 1's to dashes and remove their boxes. Exit if there aren't any more 1's at this point.
5. Remove any prime implicants whose 1's are contained in a box having more 1's (dominated case). If there is a case where the number of 1's is the same (codominant case), discard the smaller box. If there is a case where the number of 1's is the same and the box sizes are the same, discard either.
6. Go back to step 3 if there are any new distinguished cells.
7. Include the largest of the remaining prime implicants in the sum and go back to step 4 (this step is rarely needed). If there is no largest, choose any.
8. If step 7 was used, choose from among the possible sums the one with the fewest terms, then the one using the fewest variables.

Remark: if this procedure is employed with the following K-map, step 7 will be employed.

[Figure: 4-variable K-map (rows AB, columns CD); cell values not recoverable]
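A reduced expression read off a K-map can always be checked against the original $\Sigma$ list by brute force. The sketch below is illustrative and not part of the original notes; it assumes the four-variable function with minterms 0, 2, 6, 7, 8, 9, 13, 15 and tests two sum of products forms and one product of sums form for it. The helper names (`n`, `ones_of`, `sop_rows`, `sop_cols`, `pos_rows`) are hypothetical.

```python
from itertools import product

ONES = {0, 2, 6, 7, 8, 9, 13, 15}   # minterm numbers of the example function

def n(x):
    """Complement of a single bit."""
    return 1 - x

def sop_rows(a, b, c, d):
    # A'B'D' + A'BC + ABD + AB'C'
    return (n(a) & n(b) & n(d)) | (n(a) & b & c) | (a & b & d) | (a & n(b) & n(c))

def sop_cols(a, b, c, d):
    # B'C'D' + A'CD' + BCD + AC'D
    return (n(b) & n(c) & n(d)) | (n(a) & c & n(d)) | (b & c & d) | (a & n(c) & d)

def pos_rows(a, b, c, d):
    # (A+B+D')(A+B'+C)(A'+B'+D)(A'+B+C')
    return (a | b | n(d)) & (a | n(b) | c) & (n(a) | n(b) | d) & (n(a) | b | n(c))

def ones_of(f):
    """Set of input combinations (as minterm numbers) for which f is 1."""
    return {i for i, v in enumerate(product((0, 1), repeat=4)) if f(*v)}

for f in (sop_rows, sop_cols, pos_rows):
    print(f.__name__, ones_of(f) == ONES)
```

A check of this kind confirms that an expression covers exactly the right cells; it does not confirm minimality, which is what the K-map procedure is for.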

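Worked out example:

[Figure: 4-variable K-map (rows AB, columns CD) with prime implicants boxed and two distinguished cells marked *]

There are 2 essential prime implicants to put in the sum: A D + A D

Now change the 1's in these 2 boxes to don't cares and redraw the map:

[Figure: redrawn K-map with the covered 1's replaced by dashes]

The map has 2 sets of codominant implicants, so pick one of the codominant boxes from each and delete it; mark distinguished cells.

[Figure: K-map after deleting one box from each codominant pair, with the new distinguished cells marked *]

Adding in the new essential prime implicants covers all 1's, so f(a,b,c,d) = A D + A D + B D + A C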

We earlier considered the circuit analysis process, where given a circuit diagram, it can be converted into a circuit equation based on the gates employed, and from there converted into a truth table. The circuit design process proceeds as follows:

1. Formalize the problem statement into inputs and outputs, devising representations for inputs and outputs.
2. Translate the problem statement to a logic function.
3. Determine the outputs corresponding to inputs (some of which may be don't cares).
4. Convert to Σ or Π notation (truth table optional), including any don't cares. Example: if f(a,b,c) = 1 for ,3,4 and ,5 are don't cares, then the circuit is given by either of f(a,b,c) = Σ(,3,4) + d(,5) or f(a,b,c) = Π(2,6,7) + d(,5)
5. Create a K-map from the Σ or Π notation.
6. Use K-map reduction to obtain a minimal circuit equation.
7. Produce a circuit diagram from the circuit equation.

Employing XOR gates requires manipulation of the circuit equation. Employing NAND and NOR gates can be accomplished by adjusting the circuit diagram. [Recall that, using the DeMorgan equivalences for the NAND gate and the NOR gate, there are diagrammatic techniques for converting sum of products and product of sums expressions to ones using NAND and NOR.]

Example: (circuit design) Design a matching circuit for the following: There are 3 types of ball bearings in a bin (plastic, steel, and brass). An assembly machine needs ball bearings of each type at different points in the assembly process. Given the type of ball bearing it needs at present, it needs to look through the bin for a ball bearing matching the type; i.e., the circuit takes a needed type and an observed type as inputs and produces accept/reject as its output.

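Step 1: Formalize

        Type      Representation
        Plastic   01                 Accept = 1
        Steel     10                 Reject = 0
        Brass     11

Steps 2, 3, 4: Translate to logic function. Inputs A, B encode the needed type and inputs C, D encode the observed type; since the code 00 is not used, every combination containing it is a don't care (d).

        A B C D   f        A B C D   f
        0 0 0 0   d        1 0 0 0   d
        0 0 0 1   d        1 0 0 1   0
        0 0 1 0   d        1 0 1 0   1
        0 0 1 1   d        1 0 1 1   0
        0 1 0 0   d        1 1 0 0   d
        0 1 0 1   1        1 1 0 1   0
        0 1 1 0   0        1 1 1 0   0
        0 1 1 1   0        1 1 1 1   1

$$f(A,B,C,D) = \Sigma(5,10,15) + d(0,1,2,3,4,8,12) = \Pi(6,7,9,11,13,14) + d(0,1,2,3,4,8,12)$$

Step 5: K-map reduction

                 CD
                 00   01   11   10
        AB = 00   d    d    d    d
        AB = 01   d    1    0    0
        AB = 11   d    0    1    0
        AB = 10   d    0    0    1

Step 6: Circuit equation

$$f(A,B,C,D) = \overline{A}\,\overline{C} + ABCD + \overline{B}\,\overline{D}$$

or

$$f(A,B,C,D) = (\overline{A}+C)(B+\overline{D})(A+\overline{C})(\overline{B}+D)$$

Step 7: Circuit diagram (there are 2 obvious NORs)

$$f(A,B,C,D) = \overline{A + C} + ABCD + \overline{B + D}$$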
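The matching circuit can be checked against its problem statement by trying every valid pair of types. This Python sketch is illustrative only; it assumes the 2-bit encoding plastic = 01, steel = 10, brass = 11 (with 00 unused), and the names `CODES`, `accept_sop`, and `accept_pos` are hypothetical.

```python
from itertools import product

# Assumed encoding of the three types; 00 is unused, giving the don't cares.
CODES = {"plastic": (0, 1), "steel": (1, 0), "brass": (1, 1)}

def n(x):
    return 1 - x

def accept_sop(a, b, c, d):
    """Sum of products form: NOR(A, C) + ABCD + NOR(B, D)."""
    return (n(a) & n(c)) | (a & b & c & d) | (n(b) & n(d))

def accept_pos(a, b, c, d):
    """Product of sums form: (A'+C)(B+D')(A+C')(B'+D)."""
    return (n(a) | c) & (b | n(d)) & (a | n(c)) & (n(b) | d)

results = {}
for needed, observed in product(CODES, repeat=2):
    a, b = CODES[needed]
    c, d = CODES[observed]
    results[(needed, observed)] = (accept_sop(a, b, c, d), accept_pos(a, b, c, d))

for pair, outputs in results.items():
    print(pair, outputs)
```

The two forms agree on all nine valid input pairs, but they are free to differ on don't care combinations, since each K-map reduction chooses don't care values independently.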

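[Figure: circuit diagram with inputs A, B, C, D in which two NOR gates produce $\overline{A + C}$ and $\overline{B + D}$, a 4-input AND gate produces ABCD, and an OR gate combines the three into f(A,B,C,D)]

Example: (circuit design) Design a combinational circuit to convert 3-bit Gray code to 3-bit binary (this is called a Gray to binary decoder). The Gray code inputs are X, Y, Z and the binary outputs are A, B, C.

        X Y Z   A B C
        0 0 0   0 0 0
        0 0 1   0 0 1
        0 1 0   0 1 1
        0 1 1   0 1 0
        1 0 0   1 1 1
        1 0 1   1 1 0
        1 1 0   1 0 0
        1 1 1   1 0 1

$$A = \Sigma(4,5,6,7), \quad B = \Sigma(2,3,4,5), \quad C = \Sigma(1,2,4,7)$$

        K-map for A           K-map for B           K-map for C
        YZ  00 01 11 10       YZ  00 01 11 10       YZ  00 01 11 10
        X=0  0  0  0  0       X=0  0  0  1  1       X=0  0  1  0  1
        X=1  1  1  1  1       X=1  1  1  0  0       X=1  1  0  1  0

$$A = X$$

$$B = \overline{X}Y + X\overline{Y} = X \oplus Y$$

$$\begin{aligned}
C &= \overline{X}\,\overline{Y}Z + \overline{X}Y\overline{Z} + X\overline{Y}\,\overline{Z} + XYZ \\
&= (\overline{X}Z + X\overline{Z})\overline{Y} + (XZ + \overline{X}\,\overline{Z})Y \\
&= (X \oplus Z)\overline{Y} + \overline{(X \oplus Z)}\,Y \\
&= (X \oplus Z) \oplus Y = X \oplus Y \oplus Z
\end{aligned}$$

Pay particular attention to the patterns that produced the XORs!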
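The decoder equations can be exercised in a few lines of Python. This is a sketch rather than a definitive implementation: it assumes the standard reflected Gray code (generated here by the well-known `n ^ (n >> 1)` rule), and the function names are illustrative.

```python
def gray_to_binary3(x, y, z):
    """3-bit Gray to binary using A = X, B = X xor Y, C = X xor Y xor Z."""
    a = x
    b = x ^ y
    c = x ^ y ^ z
    return a, b, c

def binary_to_gray(value):
    """Reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)

decoded = []
for value in range(8):
    g = binary_to_gray(value)
    x, y, z = (g >> 2) & 1, (g >> 1) & 1, g & 1   # split into Gray bits
    a, b, c = gray_to_binary3(x, y, z)
    decoded.append((a << 2) | (b << 1) | c)        # reassemble binary value

print(decoded)
```

Each binary output bit is the XOR of all Gray bits at or above its position, which is why the chain of XOR gates extends naturally to more bits.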

[Figure: Gray to Binary Decoder circuit, with Gray inputs X, Y, Z and binary outputs A, B, C]

A Gray to binary decoder is an example of a circuit that could be packaged as a specialized circuit. As an example of a more complex decoder, consider the 7-segment display, whose segments are labeled as follows:

           a
        f     b
           g
        e     c
           d

These are used to produce representations of decimal digits and (to a lesser extent) the hex characters A-F.

[Figure: 7-segment patterns for the decimal digits and for A, B, C, D, E, F]

Pay particular attention to the difference between the representations for 6 and B (a common mistake is to interpret the B pattern as 6).

Note that a logic circuit to convert 4-bit (hexa)decimal data to 7-segment display format will require 7 outputs, one for each of segments a, b, c, d, e, f, g. If only a BCD conversion is needed, then the circuit is simplified (somewhat) because for the inputs for A, B, C, D, E, F, the outputs are don't cares. The construction of such a circuit can be achieved by the means already covered, albeit with some tedium due to the number of outputs.

The SN7447 chip is a BCD to 7-segment display decoder/driver (LED segments have to be protected from excess current, a capability built in to this chip so that it can directly drive LED segments without use of pull-up resistors). A worked out circuit diagram for this chip follows:

[Figure: SN7447 BCD to 7-segment Display Decoder/Driver, showing inputs A, B, C, D, BI/RBO (wired AND), LT, and RBI, the internal gate network, and outputs a, b, c, d, e, f, g]

Legend: BI = Blanking Input; RBO = Ripple Blanking Output; LT = Lamp Test; RBI = Ripple Blanking Input. The marked points in the diagram are taken HIGH by taking the blanking input line LOW (this forces all outputs HIGH).

BI (Blanking Input), RBI (Ripple Blanking Input), and LT (Lamp Test) have no effect if they are not connected or if their lines are held HIGH.

If the blanking input is taken LOW, a 1 is forced at each marked point, in effect blanking all LEDs by taking their lines HIGH.

Taking the lamp test input LOW forces the internal lines representing A, B, C to go LOW, which internally produces the same effect as an input of numeric 0 or 8, thus enabling LED lines a, b, c, d, e, and f. LED line g requires an additional enable via the internal lamp test line.

Taking the ripple blanking input line LOW enables the six-input NAND gate in the circuit to respond to the internal lines representing A, B, C, D, which will then cause the blanking of the LEDs if the numeric value of the input is 0. To suppress leading 0s in a sequence of digits, the blanking input line for each digit is used as an output (Ripple Blanking Output) connected to the ripple blanking input line of the digit of next lower order (note that as soon as a non-zero digit occurs in the sequence, it produces a HIGH signal on RBO, which will then cause ripple blanking to be disabled for all subsequent lower order digits).

Careful examination of the circuit shows that segment a is not lit for the number 6!

[Table: BCD to 7-segment display function table with columns D C B A and a b c d e f g; entries not recoverable. Non-BCD input combinations are all don't cares.]

Remark: the SN7447 display pattern for 6 omits segment a.

Standard K-map analysis results in the following equations (A is the low order input bit; each equation gives the combinations for which the output line is HIGH, i.e., the segment is not lit):

$$\begin{aligned}
a &= A\overline{B}\,\overline{C}\,\overline{D} + \overline{A}C + BD && \text{[BD added in from don't cares for blanking output purposes]} \\
b &= A\overline{B}C + \overline{A}BC + BD && \text{[BD added in from don't cares for blanking output purposes]} \\
c &= \overline{A}B\overline{C} + CD && \text{[CD added in from don't cares for blanking output purposes]} \\
d &= ABC + A\overline{B}\,\overline{C} + \overline{A}\,\overline{B}C \\
e &= A + \overline{B}C \\
f &= AB + B\overline{C} + A\overline{C}\,\overline{D} \\
g &= ABC + \overline{B}\,\overline{C}\,\overline{D}
\end{aligned}$$

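Arithmetic circuits:

Half adder: Addition of two bits is accomplished by XOR. A circuit for adding two bits that outputs both the sum ($S$) and carry ($C_{out}$) is called a half adder (a full adder also accounts for an input carry from a prior addition, $C_{in}$).

        X Y   S  Cout
        0 0   0   0
        0 1   1   0
        1 0   1   0
        1 1   0   1

[Figure: half adder (HA) built from an XOR gate producing S and an AND gate producing Cout]

Full adder: To accommodate an input carry we have

        X Y Cin   S  Cout
        0 0  0    0   0
        0 0  1    1   0
        0 1  0    1   0
        0 1  1    0   1
        1 0  0    1   0
        1 0  1    0   1
        1 1  0    0   1
        1 1  1    1   1

$S = X \oplus Y \oplus C_{in}$ by the same analysis used for the $C$ output variable of the Gray to binary decoder discussed earlier.

$$\begin{aligned}
C_{out} &= \overline{X}YC_{in} + X\overline{Y}C_{in} + XYC_{in} + XY\overline{C_{in}} \\
&= (\overline{X}Y + X\overline{Y})C_{in} + XY(C_{in} + \overline{C_{in}}) \\
&= (X \oplus Y)C_{in} + XY
\end{aligned}$$

Both $(X \oplus Y)C_{in}$ and $XY$ are produced by two half adders: the first takes $X$ and $Y$ and produces $X \oplus Y$ and $XY$; the second takes $X \oplus Y$ and $C_{in}$ and produces $S$ and $(X \oplus Y)C_{in}$.

[Figure: two half adders in series]

Hence to get a full adder (FA) we simply use two half adders with an OR gate applied to the two carries.

[Figure: full adder built from two half adders (HA) and an OR gate, with inputs X, Y, Cin and outputs S, Cout]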
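The two-half-adder construction translates directly into code. The following is a minimal Python sketch with illustrative function names; it models each gate with a bitwise operator and lets the full adder be checked against ordinary integer addition.

```python
from itertools import product

def half_adder(x, y):
    """Return (sum, carry) for two input bits: S = X xor Y, Cout = XY."""
    return x ^ y, x & y

def full_adder(x, y, cin):
    """Full adder built from two half adders and an OR gate."""
    partial_sum, carry1 = half_adder(x, y)         # X xor Y, and XY
    s, carry2 = half_adder(partial_sum, cin)       # S, and (X xor Y)Cin
    return s, carry1 | carry2                      # OR of the two carries

for x, y, cin in product((0, 1), repeat=3):
    s, cout = full_adder(x, y, cin)
    print(x, y, cin, "->", s, cout)
```

The two carries can never both be 1 for the same inputs, so the OR gate could equally be an XOR gate; OR is the conventional choice.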

4-bit parallel adder: Input is two 4-bit quantities $(X_3,X_2,X_1,X_0)$ and $(Y_3,Y_2,Y_1,Y_0)$. Input corresponding digits to each full adder circuit and propagate each carry out to the carry in of the next higher full adder.

[Figure: four full adders (FA) in a chain, with inputs X3 Y3, X2 Y2, X1 Y1, X0 Y0 and Cin, outputs S3, S2, S1, S0 and Cout]

It is evident that this technique can be extended for multiple bits. The major drawback to this circuit construction is the fact that the carry propagation must go through many circuit levels to reach the high order bit. For this reason, adders may employ carry anticipation; for example, for a 2-bit adder, the $C_{out}$ value can be determined combinationally by examining its specification or simply employing logic; i.e., $C_{out}$ is given by

        (X1 AND Y1) OR                               [carry out via X1 and Y1 alone]
        ((X1 OR Y1) AND X0 AND Y0) OR                [carry out via carry in from 1st FA]
        ((X1 OR Y1) AND Cin AND (X0 OR Y0))

Multiplier: Input is two 3-bit quantities $(X_2,X_1,X_0)$ and $(Y_2,Y_1,Y_0)$. Think in terms of the construction

                            X2     X1     X0
                          × Y2     Y1     Y0
        ------------------------------------
                            X2Y0   X1Y0   X0Y0
                     X2Y1   X1Y1   X0Y1
              X2Y2   X1Y2   X0Y2

where the columns are summed using $+_2$, the binary addition accomplished by a full adder. The number of gates for this kind of construction is the reason multiplication circuits may use sequential circuit techniques (to be covered later).

Subtraction: Full and half subtractors can be constructed analogously to full and half adders.

Half subtractor: Subtraction of two bits is also accomplished by XOR. A circuit for subtracting two bits that outputs both the difference ($D$) and borrow ($B_{out}$) is called a half subtractor (a full subtractor also accounts for an input borrow from a prior subtraction, $B_{in}$).

        X Y   D  Bout
        0 0   0   0
        0 1   1   1
        1 0   1   0
        1 1   0   0

[Figure: half subtractor (HS) producing $D = X \oplus Y$ and $B_{out} = \overline{X}Y$]
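Returning to the parallel adder and carry anticipation described above, both can be sketched in Python and compared. The code is illustrative only: bit lists are assumed to be ordered least significant bit first, and the helper names (`to_bits`, `from_bits`, `ripple_add`, `carry_out_2bit`) are hypothetical.

```python
def full_adder(x, y, cin):
    """S = X xor Y xor Cin, Cout = (X xor Y)Cin + XY."""
    return x ^ y ^ cin, ((x ^ y) & cin) | (x & y)

def to_bits(value, width):
    """Bits of value, least significant first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Integer value of a list of bits given least significant first."""
    return sum(b << i for i, b in enumerate(bits))

def ripple_add(xbits, ybits, cin=0):
    """Chain full adders, propagating each carry out to the next carry in."""
    sums, carry = [], cin
    for x, y in zip(xbits, ybits):
        s, carry = full_adder(x, y, carry)
        sums.append(s)
    return sums, carry

def carry_out_2bit(x1, x0, y1, y0, cin):
    """Carry anticipation for a 2-bit adder: Cout computed without rippling."""
    return (x1 & y1) | ((x1 | y1) & x0 & y0) | ((x1 | y1) & cin & (x0 | y0))

sums, cout = ripple_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(sums), cout)   # 11 + 6 = 17, i.e. sum bits 0001 with carry out 1
```

The anticipated carry uses only two gate levels regardless of where the carry originates, which is the speed advantage over letting the carry ripple through each stage.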

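Full subtractor: To accommodate an input borrow we have

        X Y Bin   D  Bout
        0 0  0    0   0
        0 0  1    1   1
        0 1  0    1   1
        0 1  1    0   1
        1 0  0    1   0
        1 0  1    0   0
        1 1  0    0   0
        1 1  1    1   1

$D = X \oplus Y \oplus B_{in}$ by the same analysis used for the $C$ output variable of the Gray to binary decoder discussed earlier.

$$\begin{aligned}
B_{out} &= \overline{X}\,\overline{Y}B_{in} + XYB_{in} + \overline{X}YB_{in} + \overline{X}Y\overline{B_{in}} \\
&= (\overline{X}\,\overline{Y} + XY)B_{in} + \overline{X}Y(B_{in} + \overline{B_{in}}) \\
&= \overline{(X \oplus Y)}\,B_{in} + \overline{X}Y
\end{aligned}$$

Both $\overline{(X \oplus Y)}\,B_{in}$ and $\overline{X}Y$ are produced by two half subtractors: the first takes $X$ and $Y$ and produces $X \oplus Y$ and $\overline{X}Y$; the second takes $X \oplus Y$ and $B_{in}$ and produces $D$ and $\overline{(X \oplus Y)}\,B_{in}$.

[Figure: two half subtractors in series]

Hence to get a full subtractor (FS) we simply use two half subtractors with an OR gate applied to the two borrows.

[Figure: full subtractor built from two half subtractors (HS) and an OR gate, with inputs X, Y, Bin and outputs D, Bout]

4-bit parallel subtractor: Input is two 4-bit quantities $(X_3,X_2,X_1,X_0)$ and $(Y_3,Y_2,Y_1,Y_0)$. Input corresponding digits to each full subtractor circuit and propagate each borrow out to the borrow in of the next higher full subtractor.

[Figure: four full subtractors (FS) in a chain, with inputs X3 Y3, X2 Y2, X1 Y1, X0 Y0 and Bin, outputs D3, D2, D1, D0 and Bout]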

Just as for the adder circuit, it is evident that this technique can be extended for multiple bits. Note that the difference between the adder and subtractor circuits is in how the propagated signal is dealt with (whether carry or borrow).

BCD adder: Recall that BCD addition required adding 6 if the sum exceeded 9. A BCD adder can then be formed by combining a 4-bit binary adder with circuitry to make the adjustment when the sum exceeds 9. Note that the test for 9 or greater is $R_3(R_2 + R_1 + R_0)$.

[Figure: BCD adder. A 4-bit binary adder takes X3 Y3, X2 Y2, X1 Y1, X0 Y0 and carry in, and produces R3, R2, R1, R0 and carry out; a "test for result > 9" circuit drives an HA, FA, HA combination (add 6) that produces the BCD sum S3, S2, S1, S0]

Note that when the exceeds 9 test is 0, the HA, FA, HA combination simply adds in 0, which has no effect on the sum; otherwise, 011 is added to $R_3R_2R_1$, in effect adding 6.

Other specialized circuits:

AOI gates: (AND-OR-Invert) Suppose you have an expression such as $(A + B + C)(A + B)$. Then double-inverting and applying the DeMorgan property, this becomes

$$(A + B + C)(A + B) = \overline{\overline{(A + B + C)(A + B)}} = \overline{(\overline{A}\,\overline{B}\,\overline{C}) + (\overline{A}\,\overline{B})}$$

which is an AND-OR-Invert expression. Hence AOI gates are employed to implement product of sums expressions. A 2-wide, 3-input AOI gate has the form:

[Figure: 2-wide, 3-input AOI gate, i.e., two 3-input AND gates feeding a NOR gate]
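In software terms, an AND-OR-Invert gate ANDs each group of inputs, ORs the group results, and inverts. The following Python sketch is an illustration rather than a model of any particular chip; the function name `aoi` and its calling convention (one tuple per AND group) are assumptions. It confirms the identity derived above for $(A + B + C)(A + B)$.

```python
from itertools import product

def aoi(*groups):
    """AND-OR-Invert: AND within each group, OR across groups, then invert."""
    return 1 - max(min(group) for group in groups)

def product_of_sums(a, b, c):
    """The example expression (A + B + C)(A + B)."""
    return (a | b | c) & (a | b)

agreement = []
for a, b, c in product((0, 1), repeat=3):
    na, nb, nc = 1 - a, 1 - b, 1 - c              # complemented inputs
    agreement.append(product_of_sums(a, b, c) == aoi((na, nb, nc), (na, nb)))

print(all(agreement))
```

The AOI gate is fed the complements of the original inputs, which is acceptable in practice because a signal and its opposite are usually both available.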

Decoders/demultiplexers: Both the Gray to binary decoder and the BCD to 7-segment display decoder/driver constructed earlier are cases of a class of circuits called decoders and demultiplexers. Basically, a decoder translates input data to a different output format. Of particular interest is a decoder that decodes an input address to activate exactly one of several outputs. In particular, a 1 of $2^n$ decoder is one for which exactly one of $2^n$ output lines goes HIGH in response to an $n$-input address. If there is a data input line also, and the selected output matches the data input, then the circuit is called a demultiplexer.

Example 1: 1 of 8 demultiplexer

[Figure: 1 of 8 demultiplexer with a data input, a 3-bit address input, and 8 addressed outputs]

In essence a demultiplexer routes the input data to the addressed output.

Example 2: Constructing a 1 of 16 decoder/demultiplexer from two 1 of 8 decoder/demultiplexers

Decoder/demultiplexers usually include a chip select or enable input to activate/deactivate the circuit. With an enable input a larger decoder/demultiplexer can be constructed from smaller ones; for example, a 1 of 16 decoder/demultiplexer can be constructed from two 1 of 8 decoder/demultiplexers as follows:

[Figure: two 1 of 8 decoder/demultiplexers sharing the data input and the low order address lines (1, 2, 4), with the high order address line driving the chip select (CS) inputs so that exactly one of the two is enabled]

This kind of construction is very useful for addressing memory.

A 1 of $n$ decoder can also be used to directly implement a logic function. For example, the specification $f(X,Y,Z) = \Sigma(2,5,6)$ can be implemented using a 1 of 8 decoder by OR'ing outputs 2, 5, and 6.

[Figure: 1 of 8 decoder with address inputs X, Y, Z (weights 4, 2, 1) whose outputs 2, 5, and 6 feed an OR gate producing f(X,Y,Z) = Σ(2,5,6)]

Internally, a decoder simply uses AND gates to produce the desired outputs; e.g., a 1 of 4 decoder has the construction

[Figure: 1 of 4 decoder with a 2-bit address input and four AND gates producing addressed outputs 0, 1, 2, 3]

So the circuit implementation for $f(X,Y,Z)$ as implemented above is just a sum of products (in fact, the canonical form, since it is just minterms OR'ed together).

Multiplexers: A multiplexer circuit is the inverse of a demultiplexer and is even more useful for implementing logic circuits because it does not require OR'ing of outputs. An 8-input multiplexer has the form

[Figure: 8-input multiplexer with data inputs 0 through 7, a 3-bit address input (weights 1, 2, 4), a chip select (CS) input, and both the output and its complement]

For a multiplexer, the address refers to the input lines. The output value is that of the addressed input. Normally, both a chip select line and the complement of the output are also provided.

A 4-input multiplexer (MUX) has the construction:

[Figure: 4-input multiplexer with data inputs 0, 1, 2, 3, a 2-bit address input, four AND gates feeding an OR gate, and both the output and its complement]

The basic addressing strategy is the same as for a decoder, but for a multiplexer the AND gates are also used to enable (or suppress) input values. Chip select is not implemented above, but can be accomplished by increasing the input capacity of each AND gate and attaching the chip select line to each AND. The OR gate that had to be supplied externally when using a decoder to implement a logic function is now incorporated into the construction.

Implementing a logic function using a multiplexer is best illustrated by an example. Suppose that the specification f(a,b,c,d) = Σ(,2,3,,4) is what is given. f(a,b,c,d) can be implemented using an 8-input multiplexer as follows:

[Figure: truth table for f(a,b,c,d) beside an 8-input multiplexer addressed by A, B, C, with each data input tied to one of 0, 1, D, or the complement of D; entries not recoverable]

Note that columns A, B, C select 0, 1, ..., 7 in pairs, each of which corresponds to one of 0, $D$, $\overline{D}$, 1 on the output side. This provides a mapping from the truth table to an 8-input MUX as indicated. The SN74151 chip is an 8-input MUX commonly used for this purpose.
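Both implementation styles can be sketched in Python. The code below is illustrative, not a model of a specific chip: the decoder example uses the three-variable function with minterms 2, 5, 6, while the four-variable minterm set `EXAMPLE` is a hypothetical function chosen for this sketch, not the one from the notes. A trailing apostrophe in `"D'"` stands for the complement of D.

```python
from itertools import product

def address_value(bits):
    """Integer value of address bits given most significant first."""
    return int("".join(map(str, bits)), 2)

def decoder(address_bits):
    """1 of 2^n decoder: exactly one output is 1, at the addressed position."""
    selected = address_value(address_bits)
    return [int(i == selected) for i in range(2 ** len(address_bits))]

def f_via_decoder(x, y, z):
    """Sigma(2,5,6) as a 1 of 8 decoder with outputs 2, 5, 6 OR'ed together."""
    out = decoder((x, y, z))
    return out[2] | out[5] | out[6]

def mux_inputs(minterms):
    """Data inputs of an 8-input MUX for a 4-variable function.

    Rows 2k (D = 0) and 2k + 1 (D = 1) share address k; comparing the two
    function values yields one of 0, 1, D, or D'.
    """
    table = {(0, 0): "0", (1, 1): "1", (0, 1): "D", (1, 0): "D'"}
    return [table[(int(2 * k in minterms), int(2 * k + 1 in minterms))]
            for k in range(8)]

def f_via_mux(minterms, a, b, c, d):
    """Evaluate the function by selecting the addressed data input."""
    symbol = mux_inputs(minterms)[address_value((a, b, c))]
    return {"0": 0, "1": 1, "D": d, "D'": 1 - d}[symbol]

EXAMPLE = {0, 3, 5, 6, 7, 12}   # hypothetical minterm list for illustration
print(mux_inputs(EXAMPLE))
```

Using D as a data input is what lets an 8-input multiplexer implement a 4-variable function: three variables form the address and the fourth is folded into the data lines.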

Comparators: A comparator takes two input values and reports them as <, =, or >. Starting from the most significant bit, the comparator cascades comparisons until corresponding bits are found that are different (the limiting case is all bits are equal). The first occurrence of corresponding bits that are different determines whether the output should be > or <. The circuit diagram for a 4-bit binary comparator to compare $(X_3,X_2,X_1,X_0)$ to $(Y_3,Y_2,Y_1,Y_0)$ is below:

[Figure: 4-bit comparator with bit-pair inputs X3 Y3, X2 Y2, X1 Y1, X0 Y0, cascade inputs <, =, >, and outputs <, =, >. Annotations on the < logic for each bit pair: "The top line is a < test. Each remaining line is an = test for a higher order bit pair. 1 is output if the < test is 1 and all higher order = tests are 1." Annotation on the < cascade input: "A 1 from a prior comparator (< is set in testing higher order pairs) forces the < output to be 1."]

The circuit allows for cascading of comparators, where input from a comparator testing higher order bits may have already determined the outcome. Tracing the circuit strategy as indicated in the annotation shows that it implements the approach sketched out above.

More specifically, the comparator as given is based on standard comparison logic; i.e.,

case:
1. the "<" input line is 1 (the outcome is already "<" based on higher order bits)
   then the "<" output line will be 1, the "=" output line will be 0, and the ">" output line will be 0
2. the ">" input line is 1 (the outcome is already ">" based on higher order bits)
   then the "<" output line will be 0, the "=" output line will be 0, and the ">" output line will be 1
3. the "=" input line is 1 (the higher order bits are all "=", so the comparison depends on lower order digits)
   then if A3 < B3
        OR A3 = B3 AND A2 < B2
        OR A3 = B3 AND A2 = B2 AND A1 < B1
        OR A3 = B3 AND A2 = B2 AND A1 = B1 AND A0 < B0
      then the "<" output line will be 1, the "=" output line will be 0, and the ">" output line will be 0
   else if A3 > B3
        OR A3 = B3 AND A2 > B2
        OR A3 = B3 AND A2 = B2 AND A1 > B1
        OR A3 = B3 AND A2 = B2 AND A1 = B1 AND A0 > B0
      then the "<" output line will be 0, the "=" output line will be 0, and the ">" output line will be 1
   else (the result must be "=")
      the "<" output line will be 0, the "=" output line will be 1, and the ">" output line will be 0

Particular attention should be given to how the logic has been implemented in the circuit diagram. Contrast this to an approach that seeks to work from a truth table specification to a minimal sum of products or product of sums solution.
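The three cases can be mirrored directly in a short behavioral sketch. The function names below are hypothetical, and the code models the comparison logic only, not the gate-level circuit.

```python
def compare4(x, y, lt_in=0, eq_in=1, gt_in=0):
    """Compare two 4-bit values given as tuples, most significant bit first.

    Returns (lt, eq, gt). The cascade inputs carry the verdict of a
    comparator that has already examined higher order bits.
    """
    if lt_in:                      # case 1: outcome already "<"
        return (1, 0, 0)
    if gt_in:                      # case 2: outcome already ">"
        return (0, 0, 1)
    # case 3: higher order bits are equal, so scan from the top bit down
    for xb, yb in zip(x, y):
        if xb != yb:               # first differing pair decides the result
            return (1, 0, 0) if xb < yb else (0, 0, 1)
    return (0, 1, 0)               # every pair matched

def to_bits(n, width):
    """Integer to tuple of bits, most significant first."""
    return tuple((n >> i) & 1 for i in reversed(range(width)))

def compare8(x, y):
    """Cascade two 4-bit comparators to compare 8-bit integers."""
    xb, yb = to_bits(x, 8), to_bits(y, 8)
    high = compare4(xb[:4], yb[:4])           # high nibble first
    return compare4(xb[4:], yb[4:], *high)    # its verdict feeds the low nibble

print(compare8(0x3A, 0x3C))   # (1, 0, 0): 0x3A < 0x3C
```

The cascade works because the low-order comparator only gets a say when the high-order one reports "=".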

Quine-McCluskey procedure: (optional non-graphical approach to reduction)

As the number of variables increases, the K-map graphical reduction technique becomes increasingly problematic. The Quine-McCluskey procedure is an algorithmic alternative best employed for computer implementation and is covered for completeness.

Step 1: Lay out the minterms in groups having the same number of 1s, with the groups ordered by increasing number of 1s. This is a listing of all blocks of 1.

Step 2: Compare each group to the one immediately below it to form all blocks of 2. Flag each block of 1 when it is used in forming a block of 2. Repeat this process on the blocks of 2 to form all possible blocks of 4, then blocks of 8, and so on. Flag each block when it is used to form a larger block. Any blocks not used in forming larger blocks are carried forward to step 3. Do not list any blocks formed redundantly (e.g., a block of 4 contains 4 blocks of 2 and so can be formed in 2 different ways).

[Illustration: truth table for f(A,B,C,D) followed by the numbered lists of blocks of 1, blocks of 2, and blocks of 4; blocks flagged as used are marked with *]

Step 3: Form the table of minterms and blocks from the first 2 steps. Mark each minterm participating in a block in the corresponding row and column as illustrated below. Any column with a single entry is essential. Continuing with the example we have:

[Table: blocks (rows) against minterms (columns), with * marking each minterm covered by a block]

Step 4: Remove the rows associated with essential entries along with any columns intersected by one or more of these rows. Put the terms representing the rows into the final sum. If 2 rows are identical, first eliminate based on dominance (number of 1s), next arbitrarily. Repeat Steps 3 and 4 until all rows are used. In the example, all rows get removed the 2nd time step 4 is used.

[Table: reduced table after the first pass, with two pairs of identical rows marked "remove arbitrarily"]

f(a,b,c,d) = AB + B C +B C + C D

Step 5: When an identical row is removed arbitrarily in Step 4 (no dominance), repeat the process for the alternate case. All combinations of duplicate row elimination should be explored and the minimal expression for each case generated. The user can then select from among these, which may provide additional possibilities for combinations. In the above example, B C is present in the result given. Alternatively, B D can replace C D and AC can replace AB.
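Steps 1 and 2 (forming and flagging blocks until no larger block can be built) are straightforward to sketch in code. The following is a minimal illustration, not a full implementation: it stops at the list of unflagged blocks and does not carry out the table reduction of Steps 3 to 5. The function names and the sample minterm set are hypothetical.

```python
from itertools import combinations

def combine(a, b):
    """Merge two blocks (strings over '0', '1', '-') that differ in exactly
    one position where both hold a fixed bit; otherwise return None."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def unflagged_blocks(minterms, nvars):
    """Repeat Step 2 until no merge is possible; return the blocks never flagged."""
    blocks = {format(m, f"0{nvars}b") for m in minterms}   # blocks of 1
    result = set()
    while blocks:
        used, merged = set(), set()     # a set drops redundantly formed blocks
        for a, b in combinations(sorted(blocks), 2):
            c = combine(a, b)
            if c is not None:
                used.update((a, b))     # flag both contributing blocks
                merged.add(c)
        result |= blocks - used         # unflagged blocks go forward to Step 3
        blocks = merged
    return result

# Hypothetical example: every minterm with B = 0 collapses to one block of 8.
print(unflagged_blocks({0, 1, 2, 3, 8, 9, 10, 11}, 4))   # {'-0--'}
```

For simplicity the sketch compares all pairs of blocks rather than only adjacent groups; pairs from non-adjacent groups can never differ in a single bit, so the result is the same.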

Logic level: Sequential Logic

Sequential logic addresses circuits that have current-state, next-state behavior; i.e., are of the form:

[Figure: sequential circuit. External inputs and the current state feed a combinational circuit, which produces the external outputs and the next state; storage elements capture the next state and feed it back as the current state (the feedback loop)]

The storage elements provide current state inputs, which together with external inputs are the inputs for a combinational circuit whose outputs provide the external outputs for the sequential circuit and the next state (to be captured in the storage elements to form a "feedback loop"). The circuit is clocked in the sense that the circuit only changes state when a clock signal is received; i.e., the next state output is captured in the storage elements (to become the current state) only on a clock pulse, typically on a transition of the clock between 0 and 1.

A state diagram is used to specify the current-state, next-state behavior of a circuit. If there are 2 inputs, then for each state, there are up to 4 possible next states that must be specified.

The fundamental circuit having current-state, next-state behavior is called a flip-flop. A flip-flop has 2 stable states (0 and 1); i.e., it is bistable. It stores a single bit of information and maintains its state as long as power is supplied to the circuit. State change occurs only in response to a change in input values. Types of flip-flops differ as to the number of inputs and how the inputs affect the state of the device.

The most basic type of flip-flop is called a latch. Latches can be used to store information, but are subject to race conditions: the latch has a setup time, during which there may be an output value that is wrong, which may "race" to another part of the circuit and cause a transition that should not occur. This is not an issue for combinational circuits so long as they are not being used in a sequential context.

Set-Reset latches:

The SR-latch formed from NOR gates is one of the fundamental latches that can be formed from basic logic gates. It has the construction:

[Figure: two cross-coupled NOR gates; R enters the gate producing Q and S enters the gate producing Q']

Each NOR gate's output is fed back to the other's input. SR stands for "Set-Reset". The behavior (characteristic table) can be tabulated by

S R Q | Q next  Q' next
0 0 0 |   0       1      no change
0 0 1 |   1       0      no change
0 1 0 |   0       1      reset to 0
0 1 1 |   0       1      reset to 0 (active transition occurs)
1 0 0 |   1       0      set to 1 (active transition occurs)
1 0 1 |   1       0      set to 1
1 1 0 |   0       0      unstable
1 1 1 |   0       0      unstable

[Figure: state diagram for the SR latch; valid inputs are 00, 01, 10]

If NAND gates are used instead of NOR, the result is called an S'R'-latch (a prime denotes complement).

[Figure: two cross-coupled NAND gates; S' enters the gate producing Q and R' enters the gate producing Q']

The reason for this becomes clear if the behavior is tabulated against S' and R' inputs rather than S and R. Note that the behavior duplicates that of the SR latch, except for the invalid case; i.e., the characteristic table is:

S' R' | Q next  Q' next
1  1  |   Q       Q'     no change
1  0  |   0       1      reset to 0
0  1  |   1       0      set to 1
0  0  |   1       1      unstable

It is instructive to examine timing considerations for the two cases where there are transitions as the latch sets up the new output values. Assume that it takes a discrete time interval Δt before the output of a gate registers, so our viewpoint for the NOR latch is that each gate output lags its inputs by Δt. Taking snapshots of Q and Q' at time intervals 0, Δt, 2Δt we get

                        S R | Q Q' | elapsed time
Active reset to 0       0 1 | 1 0  | 0
                        0 1 | 0 0  | Δt   (response to R = 1 sets Q to 0)
                        0 1 | 0 1  | 2Δt  (response to Q = 0 sets Q' to 1)
Active set to 1         1 0 | 0 1  | 0
                        1 0 | 0 0  | Δt   (response to S = 1 sets Q' to 0)
                        1 0 | 1 0  | 2Δt  (response to Q' = 0 sets Q to 1)

In both instances it takes 2Δt for the circuit to stabilize.

Flip-flops are usually handled synchronously, with inputs held at the "no change" state until a clock pulse occurs. A gate can be used for this purpose, for example, with an AND gate:

[Figure: AND gate with inputs "input" and "clock" producing "output"]

If clock = 0 then output = 0. If clock = 1, then output = input.
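The two-interval settling behavior of the NOR latch can be checked with a small unit-delay simulation. This is a sketch under the assumption that both gates update simultaneously from the previous snapshot, each with the same delay; the function names are hypothetical.

```python
def nor(a, b):
    """2-input NOR gate."""
    return 1 - (a | b)

def settle(s, r, q, qbar, max_steps=10):
    """Unit-delay model of the cross-coupled NOR latch.

    Returns the list of (Q, Q') snapshots, one per gate delay, ending
    when the outputs stop changing.
    """
    trace = [(q, qbar)]
    for _ in range(max_steps):
        # Both gates see the outputs from the previous snapshot.
        new_q, new_qbar = nor(r, qbar), nor(s, q)
        if (new_q, new_qbar) == (q, qbar):
            break
        q, qbar = new_q, new_qbar
        trace.append((q, qbar))
    return trace

print(settle(s=0, r=1, q=1, qbar=0))   # active reset: [(1, 0), (0, 0), (0, 1)]
print(settle(s=1, r=0, q=0, qbar=1))   # active set:   [(0, 1), (0, 0), (1, 0)]
```

In both traces the intermediate snapshot is (0, 0), which is the transient value that can be latched elsewhere if the outputs are sampled too early.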

The basic SR-latch has no provision for clock input and so is configured for asynchronous usage. Note that since Q next is a function of S,R,Q we can derive a next state equation as follows:

Q next = f(S,R,Q) = Σ(1,4,5) + d(6,7)

from the earlier tabulation and the K-map is

        RQ:  00  01  11  10
S = 0         0   1   0   0
S = 1         1   1   d   d

so Q next = S + R'Q

To add a control (or enable), the S'R'-latch turns out to be the most natural underlying latch because it responds to inverted inputs:

[Figure: controlled latch. S and C feed a NAND gate producing (CS)', R and C feed a NAND gate producing (CR)', and these drive an S'R'-latch with outputs Q and Q']

Separate preset and preclear lines can be added to allow flip-flop initialization without using the controlled inputs.

[Figure: the controlled latch above with added Preset and Preclear lines]

Clock signals typically are produced in the form of square waves or regularly spaced pulses.

Edge-triggered flip-flops:

There is a voltage setup time when the signal changes from 0 to 1. The edge of the pulse for the 0 to 1 transition is called the leading edge, and for the 1 to 0 transition the trailing edge.

[Figure: clock pulse showing the leading edge, the trailing edge, and the voltage setup interval]

An edge-triggered flip-flop changes state when the edge is reached. The value of the flip-flop remains constant until the next edge is reached. There are leading edge triggered and trailing edge triggered flip-flops. Normally all flip-flops in a circuit should trigger on the same edge. If types are mixed, the leading edge can be converted to trailing edge by inverting the control input (and vice-versa). Flip-flops are designated by the symbols

[Figure: two flip-flop symbols, the first for leading edge triggered and the second for trailing edge triggered; a marker identifies the control input, and the output for Q' is marked as well]

For example, the flip-flops on the SN7473 are trailing edge triggered and those on the SN7474 are leading edge triggered.

Master-Slave flip-flops:

A master-slave flip-flop combines two flip-flops (with controls) where the "master" flip-flop triggers on the leading edge. The "slave" flip-flop then triggers on the trailing edge in response to the values of the master flip-flop.

[Figure: master-slave SR flip-flop with inputs S, C, R. Master ff: leading edge triggered. Slave ff: trailing edge triggered.]

There are two virtues to this construction:
1. the overall output does not change while the control input is high, since the overall output comes from the slave flip-flop, which sets up only when the control input goes low
2. the slave flip-flop is isolated from the rest of the circuit, responding only to the master flip-flop's value (without this kind of protection in a circuit with multiple interconnected flip-flops, a race condition may occur, where an intermediate value gets latched rather than a final value).

From the external view point, the master-slave flip-flop triggers on the trailing edge.

A note on latches: Although basic latches should be avoided when a circuit requires multiple flip-flops, basic latches still have uses.

Example: debouncing a switch. The mechanical nature of a physical switch precludes a smooth transition between 0 and 1 when the switch is opened or closed. This phenomenon is called "bounce", because the switch value may haphazardly alternate between open and closed as the switch contacts separate on opening or connect on closing. It is a simple application to debounce a single pole, double throw switch using a basic latch; e.g.,

[Figure: SPDT switch whose common terminal is tied to GND and whose two throws connect to the S' and R' inputs of an S'R'-latch; each latch input is also tied to Vcc through a resistor]

The two resistors are needed to prevent a short circuit between Vcc and GND for the input connected through the switch (they are called "pull-up" resistors because when connected between Vcc and GND, they pull the voltage on the Vcc side of the resistor up to logic 1). When the switch as shown above is thrown to its opposite position, the flip-flop will set to 1 the first time 0 is detected on S', and will hold that value because if a bounce takes S' back to 1, the effect is applying 1,1 on R',S', which is the no-change state of the flip-flop (i.e., the flip-flop can't revert to its prior value).

Generally, the term "latch" is only used in reference to flip-flops whose outputs are not protected from intermediate values while setting up. Unless qualified by the term "latch", the use of the term "flip-flop" normally refers to a leading or trailing edge triggered flip-flop that is protected. The master-slave construction is one approach used for producing flip-flops. The SN7473 and SN7476 are in this category.

An aside about electricity:

The example of debouncing a switch may arouse curiosity regarding use and selection of resistors with TTL integrated circuits such as the SN7400 (quad 2-input NAND chip). Selection requires the application of a small amount of knowledge about voltage, resistance, and electric current.

Ohm's Law: Ohm's Law is the relationship between electromotive force E (measured in volts, symbolized by V), current I (measured in Amperes, symbolized by A), and resistance or impedance R (measured in Ohms, symbolized by Ω); namely,

E = IR

This is closely related to Joule's Law of power (measured in Watts, symbolized by W); namely,

P = EI

Current is the rate of flow of electric charge in a circuit and is measured in electron charges. By international standard, 1 Ampere of current is defined as the flow of approximately 6.24 × 10^18 electron charges (called a Coulomb) per second. Its rather bizarre value is derived from the number of atoms in a gram of Carbon.

Note that the relationship between current and resistance is I = E/R, so current is inversely proportional to resistance at constant voltage. When plotted, the curve is

[Figure: plot of I against R, a 1/R curve]

The area under the curve is given by multiplying current by resistance; i.e., it represents voltage. It is also given by the natural logarithm, as discussed in calculus classes.

Standard resistor values: If the curve is scaled by the inverse of the natural logarithm of 10 (1/ln(10) = 0.4343), the area is given by the base 10 logarithm and consequently the area between 1 and 10 is 1V. Manufacturers have chosen to use impedance values that equally divide this area into 6, 12, or 24 equal subareas (the E6, E12, and E24 series). 1/6 = 0.166667 = p and the impedance values are then 10^0 = 1, 10^p = 1.468, 10^2p = 2.154, 10^3p = 3.162, 10^4p = 4.642, 10^5p = 6.813.
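The equal-subarea calculation above is easy to reproduce. The sketch below is illustrative only; the function name is hypothetical, and it computes the ideal logarithmically spaced values, not the rounded values that manufacturers print on parts.

```python
def e_series_ideal(n):
    """Ideal values for an E-n series: n logarithmically equal steps per decade."""
    return [10 ** (k / n) for k in range(n)]

ideal_e6 = [round(v, 3) for v in e_series_ideal(6)]
print(ideal_e6)   # [1.0, 1.468, 2.154, 3.162, 4.642, 6.813]
```

Calling e_series_ideal(12) or e_series_ideal(24) gives the corresponding ideal values for the finer series.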
The values adopted for the E6 resistor series are 1.0, 1.5, 2.2, 3.3, 4.7, 6.8, which approximate the above calculations. Resistors are chosen whose E-series value is a close match for the value needed. For example, if a 50KΩ resistor is needed, then 47KΩ is used from the E6 series or 51KΩ from the E24 series.

If a 47KΩ resistor is used in the debouncing circuit above, and Vcc is at +5V, the current flow is I = 5/47000, which is 0.000106 Amps or 0.106 mA, where mA designates milliamps. TTL draws no more than 0.04 mA for 1 to be detected at an input, so 47KΩ resistors are commonly used as pull-up resistors when working with TTL chips. Using a higher resistor value reduces the current draw (and thus, power consumption) but the circuit may fail to work if the power at the input is inadequate.

Batteries: Batteries have an internal impedance which varies according to battery size and type. As a battery is used, its impedance grows, reducing power output.

Alkaline batteries: A fully charged 1.5V alkaline cell will have an impedance of about 0.32Ω, which means that the limiting current between terminals is 4.7A.

NiCad batteries: NiCad batteries in contrast have about half the capacity (stored energy) of alkalines, but hold their voltage relatively constant during discharge (alkalines lose voltage linearly). The basic NiCad cell is 1.2V and when fully charged has an impedance of about 0.12Ω, yielding a maximum current of about 10A; i.e., NiCad batteries can supply power at about twice the rate of alkalines and so are used in more power hungry applications.

Following Joule's Law, Amps × Volts × time = Watt hours is used as a measure of energy consumption. It follows that an alkaline cell can provide up to 7 Watts and a NiCad cell up to 12 Watts of power. Battery capacity is usually measured in Amp hours rather than Watt hours.

Batteries in series: Putting batteries in series increases electrical potential additively; i.e., two alkaline cells in series produce a 3V battery. Impedance also doubles, so there is no change in maximum discharge characteristics.

Batteries in parallel: If batteries are placed in parallel, then the voltage is unaffected and the impedance is changed according to R^2/2R = R/2. For 2 alkaline cells this is 0.16Ω, increasing the discharge maximum to 9.4A or doubling its current capacity. This assumes that the batteries are matched. Note that in parallel, a weak battery will tend to discharge its companions, since Mother Nature seeks balance.

Alternating current: Batteries produce direct current (DC) with current flow in one direction (source to ground). A current for which the current flow reverses direction cyclically is called alternating current (AC) and is produced by rotating a wire coil through a magnetic field. Magnets have poles (+ and -), so if the coil is first oriented + -, after a 180° rotation it will be oriented - + and the induced voltage will reverse. If the rotation is constant then the voltage will follow a sinusoidal pattern. In the US, the AC standard for house wiring is 60 cycles per second alternating between -120V and +120V. AC is used because it is relatively efficient to transform it to high voltage for transmission (which requires less current flow to move the same amount of power). Of course it has to be transformed back to safer levels for use in the home.

Devices called rectifiers are used to convert AC power to DC. House current can be converted by using both a transformer and a rectifier to produce a DC output that can be used in place of a battery (just be sure that the voltage is correct for the use intended).

A 60 Watt 120V light bulb requires 60/120 = 0.5A. Circuit capacity is limited by the amount of current the transmission wire can handle before its natural resistance causes overheating (and failure). Increasing wire diameter, or braiding together multiple wires, reduces resistance and increases capacity. To protect the transmission wire, a fuse is used to keep from overloading the circuit. A 20 Amp 120V circuit can handle a load of 2400 Watts (i.e., two 1500 Watt hair dryers will blow the fuse).

D-latches and D flip-flops:

A D-latch (D for "delay") has the form:

[Figure: D-latch with data input D, control input C, and outputs Q and Q']

It is the (clocked) SR-latch with the complement of S fed to the R input. Hence, it triggers on the leading edge. Obviously, the same minor modification applied to the master-slave SR flip-flop covered earlier will produce a master-slave D flip-flop. The value of a D flip-flop is just the input, but one cycle behind (hence the term "delay"). It should be noted that a D flip-flop has only one input.

An alternative construction of a D-latch: A tri-state buffer is a gate whose output can be in one of three states: 0, 1, or null (same as no contact). It has the form

[Figure: tri-state buffer with Input, Output, and a Ctrl line]

When Ctrl = 1, Output = Input; when Ctrl = 0, Output = null.

Tri-state buffers can be used to construct a D-latch as follows:

[Figure: D-latch built from tri-state buffers, with inputs Clock and D and outputs Q and Q']

When the clock value goes high, output Q = input D; i.e., the latch is leading edge triggered. Either this construction or the NAND construction produces a viable D-latch. Two D flip-flop constructions based on D-latches are as follows:

[Figure: Master-Slave D flip-flop. A D-latch (master) feeds a D-latch (slave); the slave is clocked by the complement of ck]

The master-slave construction works with either version of the D-latch since both trigger on the leading edge. The overall construction is a trailing edge triggered flip-flop.

The next construction uses 3 S'R' latches cleverly to produce a leading edge triggered D flip-flop. By inverting the clock input, the master-slave version can be converted to leading edge triggered, but it requires more logic gates.

[Figure: Leading Edge Triggered D Flip-flop built from three S'R' latches. An upper and a lower latch produce the internal signals x and y, which drive the right-most latch with outputs Q and Q'. Annotations: x, y = 1 when ck = 0; x = D when ck = 1; y = D' when ck = 1]

When ck = 0, both x and y are held at 1, the no change state for the right-most latch. At the same time the upper latch outputs D' and feeds it to the lower latch to produce D internally. When the clock rises to 1, D' is latched at y and D at x, to be latched by the right-most flip-flop as Q and Q'. If D is changed while ck = 1 and x has latched 0, there is no effect. If x = 1, then y = 0 blocks any change in D from affecting x (the purpose of the feedback from the lower to the upper latch) and also prevents the feedback from the upper latch from affecting the lower latch. Hence, the flip-flop latches the value on the leading edge.

In effect, the flip-flops in the circuit set up based on the values from the prior clock cycle, and so all inputs are stable each time the triggering edge is reached.

Other flip-flops:

While the SR-latch has uses in practice, the SR flip-flop does not because it does not make use of a 1,1 input. As we have seen, a D flip-flop uses a single input (other than ck). A T flip-flop also uses a single input and simply toggles the state when the input is 1. A JK flip-flop combines the SR and T flip-flops: it behaves as an SR flip-flop, and toggles when the inputs are 1,1.

T flip-flop: When T = 0, the flip-flop values are unchanged. When T = 1, the next state is the opposite of the current state. Hence, the characteristic table for the flip-flop is given by:

T Q | Q next
0 0 |   0
0 1 |   1
1 0 |   1      toggle when T = 1
1 1 |   0

Q next = T'Q + TQ' = T ⊕ Q

JK flip-flop: This flip-flop just combines the functions of the SR and T flip-flops and so is widely used. Its characteristic table is given by

J K Q | Q next
0 0 0 |   0     no change
0 0 1 |   1     no change
0 1 0 |   0     reset to 0
0 1 1 |   0     reset to 0
1 0 0 |   1     set to 1
1 0 1 |   1     set to 1
1 1 0 |   1     toggle
1 1 1 |   0     toggle

K-map analysis shows that Q next = JQ' + K'Q.

Excitation controls:

In a circuit, flip-flop inputs have to be set to produce desired next-state behavior. This is trivial for the D flip-flop. For the JK flip-flop, excitations (J,K values as a function of Q and Q next) are given by

Q present  Q next | J K
    0        0    | 0 d
    0        1    | 1 d
    1        0    | d 1
    1        1    | d 0

Any flip-flop has present-state, next-state capabilities, so any flip-flop type can be produced from any other flip-flop type.

Example: A T flip-flop from a JK flip-flop

[Figure: JK flip-flop with the T input wired to both J and K; outputs Q and Q', clock ck]
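The excitation table need not be memorized, since it follows mechanically from the characteristic equation Q next = JQ' + K'Q. The sketch below derives it by enumeration; the function names are hypothetical.

```python
def jk_next(j, k, q):
    """JK characteristic equation: Q_next = J.Q' + K'.Q"""
    return (j & (1 - q)) | ((1 - k) & q)

def jk_excitation(q, q_next):
    """Return (J, K) needed for the transition q -> q_next, with 'd' for don't care."""
    ok = [(j, k) for j in (0, 1) for k in (0, 1) if jk_next(j, k, q) == q_next]
    js = {j for j, _ in ok}
    ks = {k for _, k in ok}
    # An input is a don't care when both of its values appear among the solutions.
    return (js.pop() if len(js) == 1 else "d",
            ks.pop() if len(ks) == 1 else "d")

for q in (0, 1):
    for q_next in (0, 1):
        print(q, q_next, jk_excitation(q, q_next))
```

The same enumeration confirms the T flip-flop construction: tying J and K together gives jk_next(t, t, q) equal to t XOR q for every t and q.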

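Example: A JK flip-flop from a D flip-flop

The key to the construction is to set it up as follows:

[Figure: external inputs J and K, together with the current state Q, feed a combinational circuit that uses both external and current state values to determine the controls that produce the spec'd next state; its output drives the D input of the flip-flop being used (type D), and the whole is the flip-flop being created (type JK), with outputs Q and Q' and clock ck]

This guides the table to construct as follows:

     JK spec       | D controls producing
J K Q | Q next     | spec'd next state: D
0 0 0 |   0        |   0
0 0 1 |   1        |   1
0 1 0 |   0        |   0
0 1 1 |   0        |   0
1 0 0 |   1        |   1
1 0 1 |   1        |   1
1 1 0 |   1        |   1
1 1 1 |   0        |   0

K-map analysis gives D = JQ' + K'Q, so our diagram becomes

[Figure: D flip-flop whose D input is driven by JQ' + K'Q, with external inputs J and K, outputs Q and Q', and clock ck]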
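The construction pattern (combinational logic in front of an existing flip-flop) can be checked by simulation. In this sketch the JK behavior is written as a lookup on the input pair so that it serves as an independent specification; all names are hypothetical.

```python
def jk_spec(j, k, q):
    """JK behavior stated as a specification: hold, reset, set, toggle."""
    return {(0, 0): q, (0, 1): 0, (1, 0): 1, (1, 1): 1 - q}[(j, k)]

class DFlipFlop:
    """Behavioral D flip-flop: on each clock tick the state becomes D."""
    def __init__(self, q=0):
        self.q = q
    def tick(self, d):
        self.q = d
        return self.q

class JKFromD:
    """JK flip-flop built from a D flip-flop plus the logic D = J.Q' + K'.Q"""
    def __init__(self, q=0):
        self.ff = DFlipFlop(q)
    def tick(self, j, k):
        q = self.ff.q
        d = (j & (1 - q)) | ((1 - k) & q)   # combinational front end
        return self.ff.tick(d)

ff = JKFromD()
print([ff.tick(j, k) for j, k in [(1, 0), (0, 0), (1, 1), (1, 1), (0, 1)]])
# [1, 1, 0, 1, 0]: set, hold, toggle, toggle, reset
```

The same wrapper pattern applies to any target type: only the expression for the control input changes.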

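Example: Make up your own flip-flop and construct it from JK flip-flops

Specify the characteristic table and the JK excitations that will produce the same next state behavior.

[Table: characteristic table for inputs U, N, F and state Q, giving Q next and the J, K excitations for each row]

characteristic equation of the flip-flop: Q next = U N F Q + U N F Q

[K-maps for J and K over UN and FQ]

J = UNF    K = U+N+F

[Figure: JK flip-flop with J and K driven by gates on U, N, F; outputs Q and Q', clock ck]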

Example: Race Condition

[Figure: a controlled SR latch (inputs S, R, control line C) whose two outputs feed the D inputs of two D flip-flops driven by a common clock]

Assume that leading edge-triggered D flip-flops are being used (say of the type described earlier). Then for a 0 to 1 transition on the latch enabled by the control line C, any of (0,1), (0,0), (1,0) may be latched depending on when the clock rises. Note that even if the control line is controlled by the clock, it could rise Δt ahead of the clock signal at the D flip-flops, the point at which the latch outputs are (0,0) when an active transition is in progress.

Registers:

A row of associated flip-flops in series or in parallel is called a register. The combinations are:
   serial in, serial out (slow devices)
   serial in, parallel out (slow in, fast out)
   parallel in, serial out (fast in, slow out)
   parallel in, parallel out (fast in, fast out)

A shift register uses serial in, serial out.

[Figure: four D flip-flops in series sharing a clock; "input" feeds the left-most flip-flop and "output" is taken from the right-most]

Every clock pulse the flip-flop values shift one to the right. The left-most flip-flop obtains its new value from the input line and the value of the right-most flip-flop is the output at each clock pulse. It should be noted that this requires all leading edge or all trailing edge flip-flops to work properly. If the output is fed back to the input, the shift is called a circular shift.

Three-state logic is needed to construct a shift register that can shift in either direction.

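[Figure: bidirectional shift register of four D flip-flops with "shift right" and "shift left" controls, inputs "input left" and "input right", and outputs "output right" and "output left"]

In contrast, parallel input has the appearance

[Figure: four D flip-flops with separate inputs i3, i2, i1, i0 and a common clock ck]

Counters:

Counters are often needed to control tasks such as count by 8 to shift in 8 bits (1 byte) serially. T flip-flops provide a natural means for constructing a mod 2^n ripple counter (counts cyclically 0 to 2^n - 1). It can be initialized to 0 via the "clear" input provided on most flip-flops.

[Figure: three JK flip-flops with J and K of each tied to "enable"; ck clocks the first stage, Q0 clocks the second, and Q1 clocks the third; outputs Q0, Q1, Q2]

If trailing edge flip-flops are used, then when enabled, the counter operates according to Q0 changing with the clock falling, Q1 with Q0 falling, and Q2 with Q1 falling as given by:

[Figure: timing diagram showing the count and the clock, Q0, Q1, Q2 waveforms, with each stage toggling when the previous stage falls]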
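The ripple behavior (each stage clocked by the falling output of the stage before it) can be sketched behaviorally. This is a simplified model that ignores propagation delay between stages; the class and function names are hypothetical.

```python
class TrailingEdgeToggle:
    """T flip-flop with T held at 1: toggles on a 1 -> 0 transition of its clock."""
    def __init__(self):
        self.q = 0
        self._prev = 0

    def clock(self, level):
        if self._prev == 1 and level == 0:   # trailing edge detected
            self.q ^= 1
        self._prev = level

def ripple_count(n_stages, pulses):
    """Apply clock pulses to an n-stage ripple counter; return the count after each."""
    stages = [TrailingEdgeToggle() for _ in range(n_stages)]
    counts = []
    for _ in range(pulses):
        for level in (1, 0):                 # one pulse: rise, then fall
            signal = level
            for ff in stages:
                ff.clock(signal)
                signal = ff.q                # this stage's output clocks the next
        counts.append(sum(ff.q << i for i, ff in enumerate(stages)))
    return counts

print(ripple_count(3, 9))   # [1, 2, 3, 4, 5, 6, 7, 0, 1]
```

With 3 stages the counter is mod 8, so the ninth pulse brings the count back to 1.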

Sequential circuit design:

Sequential circuits make transitions from state to state in response to inputs. Sequential circuits are physical realizations of a kind of theoretical machine called a finite state automaton (FSA). An FSA can be described by use of a graphical representation called a state diagram.

An FSA is given by specifying:
1. An input alphabet I
2. An output alphabet O (possibly NULL)
3. A finite set of states S
4. A start state, s0 ∈ S
5. A transition function f: S × I → S (this is the next state function, where f(current-state, current-input) = next-state)
6. Moore circuit: output is on the state (may be NULL); an output function g: S → O is given
7. Mealy circuit: output is on the transition (may be NULL); an output function h: S × I → O is given

Examples:
1. Serial parity checker: input is data (having a parity bit) and output is the current parity bit (odd parity)
   Input alphabet is {0,1}
   Output alphabet is {0,1}
   States are {S0, S1}
   S0 is the start state
   The transition function is given by the state diagram (Moore circuit) with states S0/1 and S1/0, or equivalently by the table

   S  I | next S      S  | output
   S0 0 | S0          S0 | 1
   S0 1 | S1          S1 | 0
   S1 0 | S1
   S1 1 | S0

The parity bit is an added data bit used to check for occurrence of an error in data. It is commonly employed with memory circuits, where any error indicates a serious problem (usually a failed memory chip). The parity bit is usually appended to the data bits. For odd parity, the added parity bit is selected so that the total number of 1s is odd. For even parity, it is selected so that the total number of 1s is even. For example, with 8 data bits and odd parity, a ninth bit (the parity bit) is appended so that the 9 bits together contain an odd number of 1s.

For the parity-checking FSA, data is input serially and the current state outputs the bit needed for odd parity. Note the boundary condition: when no data has been input (empty input), the parity bit is 1. If the data bits of a 9-bit example such as the above are sent through the parity checker and the output of the final state does not agree with the parity bit, a parity error has occurred.
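The parity checker is small enough to write out as a table-driven Moore machine. The sketch below follows the FSA components listed above; the dictionary and function names are hypothetical, and the sample data is made up for illustration.

```python
# Transition function f: (state, input) -> next state
TRANSITIONS = {
    ("S0", 0): "S0", ("S0", 1): "S1",
    ("S1", 0): "S1", ("S1", 1): "S0",
}
# Moore output function g: state -> output (the bit needed for odd parity)
OUTPUT = {"S0": 1, "S1": 0}

def odd_parity_bit(bits, start="S0"):
    """Run the FSA over the data bits and return the output of the final state."""
    state = start
    for b in bits:
        state = TRANSITIONS[(state, b)]
    return OUTPUT[state]

def parity_ok(data_bits, parity_bit):
    """A parity error has occurred when the FSA output disagrees with the parity bit."""
    return odd_parity_bit(data_bits) == parity_bit

print(odd_parity_bit([]))              # 1 (boundary case: empty input)
print(odd_parity_bit([1, 0, 1, 1]))    # 0 (three 1s is already odd)
```

Because the output depends only on the state, the same dictionary-driven loop works for any Moore machine once TRANSITIONS and OUTPUT are replaced.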

2. Sequential binary adder: input is pairs of binary digits and output is their sum; carry-in, carry-out information is tracked by the current state.
   Input alphabet {00, 01, 10, 11}
   Output alphabet {0,1}
   States are {S0, S1, S2, S3} as follows:
      S0 outputs 0, no carry
      S1 outputs 0, carry
      S2 outputs 1, no carry
      S3 outputs 1, carry
   Transitions are given by the state diagram

[Figure: state diagram with S0/0 and S2/1 as the states with no carry and S1/0 and S3/1 as the states with carry]

[Trace: an addition of two 5-bit numbers, with the five input pairs processed low order first. Starting in S0, the machine moves to S1 (output 0, to carry state), stays in S1 (output 0, remain in carry state), moves to S2 (output 1, to no carry state), to S1 (output 0, to carry state), and finally to S2 (output 1, to no carry state)]

so the result is as expected.

Generally, the structure of the FSA can be determined from the state diagram, so usually only the state diagram is specified in the design process. The next step is to detail how the FSA is converted to a circuit.
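The adder FSA can be simulated by representing each state as an (output, carry) pair. In this sketch the operands 11 and 9 are hypothetical values chosen for illustration; they happen to drive the machine through the state sequence S1, S1, S2, S1, S2.

```python
# Each state is (output bit, carry); the names follow the state list above.
STATE_NAME = {(0, 0): "S0", (0, 1): "S1", (1, 0): "S2", (1, 1): "S3"}

def step(state, a, b):
    """Next state for the input pair (a, b): add the bits plus the stored carry."""
    _, carry = state
    total = a + b + carry
    return (total & 1, total >> 1)     # (sum bit, carry out)

def serial_add(x_bits, y_bits):
    """Add two equal-length bit lists given low-order bit first.

    Returns the output bits (low order first) and the states visited.
    """
    state = (0, 0)                     # start state S0
    outputs, visited = [], []
    for a, b in zip(x_bits, y_bits):
        state = step(state, a, b)
        outputs.append(state[0])       # Moore output of the new state
        visited.append(STATE_NAME[state])
    return outputs, visited

# Hypothetical operands: 11 = 01011 and 9 = 01001, written low-order bit first.
out, states = serial_add([1, 1, 0, 1, 0], [1, 0, 0, 1, 0])
print(out, states)   # [0, 0, 1, 0, 1] ['S1', 'S1', 'S2', 'S1', 'S2']
```

Reading the output bits from last to first gives 10100, which is 20, the expected sum.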

The sequential circuit design process is conducted as follows:
1. Problem statement
2. State diagram
3. Elimination of inaccessible states (if any); these are states that cannot be reached from the Start State
4. Assignment of states to flip-flop combinations:

   # of states    # of ff's needed
   1 or 2         1
   3 or 4         2
   5,6,7 or 8     3
   and so forth

5. Transition/output table; control values producing the needed next state behavior are determined from flip-flop excitation tables. The table columns are: current states, inputs, next states, controls, outputs
6. K-map analysis to produce control equations and output equations
7. Circuit diagram

Example: Parity checker using JK flip-flops.

Steps 1 and 2 were done earlier (the state diagram has states S0/1 and S1/0). There are no inaccessible states.

Step 4: Assignment of states to flip-flop combinations. Since there are only 2 states, 1 flip-flop (Q) can represent both.

   State  Q
   S0     0
   S1     1

Step 5: Transition table (recall the JK flip-flop excitation table: 0→0 needs J K = 0 d, 0→1 needs 1 d, 1→0 needs d 1, 1→1 needs d 0)

   Q I | next Q | J K | Z
   0 0 |   0    | 0 d | 1
   0 1 |   1    | 1 d | 1
   1 0 |   1    | d 0 | 0
   1 1 |   0    | d 1 | 0

Step 6: K-map analysis for J, K and Z gives

   J = I, K = I
   Z = Q'

Step 7: Circuit for parity checker

[Figure: a single JK flip-flop with input I wired to both J and K, a clock input, and output Z taken from Q']

Example: Binary adder using JK flip-flops

Steps 1 and 2 were done earlier (the state diagram has states S0/0, S1/0, S2/1, S3/1).

Step 3: There are no inaccessible states.

Step 4: Assignment of states to flip-flop combinations. Since there are 4 states, 2 flip-flops (Q1,Q0) will be needed.

   State  Q1 Q0
   S0     0  0
   S1     0  1
   S2     1  0
   S3     1  1

Step 5: Transition/output table (recall the JK flip-flop excitation table: 0→0 needs J K = 0 d, 0→1 needs 1 d, 1→0 needs d 1, 1→1 needs d 0)

[Table: 16-row transition/output table with columns Q1, Q0, I1, I0, next Q1, next Q0, J1, K1, J0, K0, Z, with each row labeled by its current and next state]

J1, K1, J0, K0 can be resolved via K-maps. Note that J1 and K1 observe an XOR pattern.

[K-maps for J1, K1, J0, K0 over I1 I0 and Q1 Q0]

J1 = Q0'(I1 ⊕ I0) + Q0(I1 ⊕ I0)' = Q0 ⊕ I1 ⊕ I0
K1 = Q0'(I1 ⊕ I0)' + Q0(I1 ⊕ I0) = (Q0 ⊕ I1 ⊕ I0)'
J0 = I1 I0
K0 = I1' I0' = (I1 + I0)'
By observation, Z = Q1

The circuit construction is then given by:

[Figure: two JK flip-flops (Q1 and Q0) with inputs I1 and I0 driving the J and K logic above; output Z is taken from Q1]

Counter design:

Counters can have particularly simple design. For example, a BCD counter has the state diagram:

[Figure: state diagram for the BCD counter, with states 0 through 9 in a cycle]

Transitions are made with the clock. External inputs are not required. States are named using flip-flop values. The transition/output table is then

[Table: transition table with columns Q3, Q2, Q1, Q0, next Q3, next Q2, next Q1, next Q0, J3, K3, J2, K2, J1, K1, J0, K0 for states 0 through 9; the rest are don't cares]

[K-maps for each J,K pair over Q3 Q2 and Q1 Q0]

J3 = Q2 Q1 Q0    K3 = Q0
J2 = Q1 Q0       K2 = Q1 Q0
J1 = Q3' Q0      K1 = Q0
J0 = 1           K0 = 1

The counter operates synchronously with the clock. Note that Q0 is common to each of J3, K3, J2, K2, J1, K1. Hence if we assign CK3 = Q0, J3 = Q2 Q1, and K3 = 1, we have the same effect as the original assignment when the clock is high. Likewise assign CK2 = Q0, J2 = Q1, K2 = Q1 and CK1 = Q0, J1 = Q3', K1 = 1. The counter now operates asynchronously with the clock attached to CK0. Observe that the Q0 flip-flop is operating as a T flip-flop (not a surprise since the 1's position of the counter toggles with each increment).

Moore and Mealy circuits:

For a Moore circuit, the outputs are strictly a function of the states. For a Mealy circuit, the outputs are a function of the inputs as well as the states. For example,

[Figure: timing diagrams (input, clock, output) for a Moore circuit and for a Mealy circuit]

Circuit Analysis: reverse the design process
1. Produce control and output equations from the circuit
2. Generate the transition/output table from the equations
3. Determine the next state columns in the transition/output table
4. Draw the state diagram

Example: starting from the following circuit diagram, assume that the start state is (Q1,Q0) = (0,0)

[Figure: circuit with inputs I1 and I0, two JK flip-flops, and output Z]

Circuit equations: J = I, K = I I J = I +Q +I, K = I Q Z = I Q Q + I

Transition/output table:

[Table: 16-row transition/output table with columns Q1, Q0, I1, I0, next Q1, next Q0, J1, K1, J0, K0, Z, with each row labeled by its current and next state]

State diagram: (Mealy circuit)

[Figure: Mealy state diagram on states S0, S1, S2, S3, with each transition labeled input/output]

Remark: the semantics of the circuit can only be inferred from the state diagram; also, don't care conditions used in the original design are unknown since they are accounted for in the circuit.

Example: Given the control and output equations

J = X Y J = Y Q + Q Z=Q K = X Q + Q K = Q

the transition/output table is given by

Transition/output table:

[Table: 16-row transition/output table with columns Q1, Q0, X, Y, next Q1, next Q0, J1, K1, J0, K0, Z, with each row labeled by its current and next state]

State diagram: assume (0,0) is the start state (Moore circuit)

[Figure: Moore state diagram on states S0, S1, S2, S3, with each state labeled state/output]

The isolated state is an artifact of the circuit implementation.

Other Counters:

The first counter considered was a mod 2^n ripple counter, a natural counter formed by hooking T flip-flops up in series. It required no additional gate logic and was easily devised without resorting to sequential design techniques. In contrast, the BCD counter exemplifies designing a counter by working from a state diagram. In the BCD counter as given, no attention was paid to the 6 states present in the circuit but not used in the counting process. In particular, if the circuit initiated in one of these 6 states, its behavior would be unspecified. Hence, the user must initialize the flip-flops to 0 to assure that the counter gets to the BCD counting sequence.

A self-starting counter is one which transitions to its counting sequence regardless of the state in which the circuit is initiated.

A counter that employs n flip-flops is called an n-stage counter. Using a state diagram in designing a counter automatically minimizes the number of stages, but there are useful counters that employ more than the minimum.

A shift-register counter counts by using a circular shift to move a bit pattern through the register. For example, to count 4, the pattern might be 1000, 0100, 0010, 0001. The register layout is
[Figure: four D flip-flops connected as a circular shift register, with common initialize and clock lines]
There are reasons to use this kind of counter (e.g., to produce a sequence of polling signals, where each flip-flop enables the device being polled).

There are 12 other bit patterns:
1100, 0110, 0011, 1001 and 1110, 0111, 1011, 1101
1010, 0101
0000 and 1111
These are grouped according to how they would count (the first group has two patterns that count 4, the second group has a pattern that counts 2, and the last group has two patterns that count 1). It's obvious that initialization is important if this kind of counter is to be employed.

The counter can be constructed to force it to move to the desired counting sequence by adjusting the D input (currently Q3) for those cases that are not in the right sequence.

Q0 Q1 Q2 Q3   D
0  0  0  0    1   (force change from Q3)
0  0  0  1    1   (OK)
0  0  1  0    0   (OK)
0  0  1  1    0   (force change from Q3)
0  1  0  0    0   (OK)
0  1  0  1    0   (force change from Q3)
0  1  1  0    0   (self-correct on later cycle)
0  1  1  1    0   (force change from Q3)
1  0  0  0    0   (OK)
1  0  0  1    0   (force change from Q3)
1  0  1  0    0   (self-correct on later cycle)
1  0  1  1    0   (force change from Q3)
1  1  0  0    0   (self-correct on later cycle)
1  1  0  1    0   (force change from Q3)
1  1  1  0    0   (self-correct on later cycle)
1  1  1  1    0   (force change from Q3)

From this it can be seen that D3 = Q0' Q1' Q2' (where ' denotes complement) rather than D3 = Q3 will cause the counter to fall into the 1000, 0100, 0010, 0001 pattern within 3 clock cycles. The counter thus becomes self-starting. The initialization can be retained, but the above minor change enables the counter to return to its expected behavior in the event an anomalous event knocks the counter out of sequence at some point after initialization.

To make a counter self-starting, any unused states simply need to be accounted for in the design. For example, for a counter counting 6 (which requires a minimum of 3 flip-flops), the state diagram
[Figure: state diagram of the counting sequence, with two extra states (Ex) that each lead into the sequence]
accounts for the 2 extra states that will occur when a circuit is implemented using 3 flip-flops and ensures that the counter will be in its counting sequence within 1 clock cycle.

Johnson counter: an n-stage count-by-2n counter based on shift-register counting. Johnson counters cycle the complement of the final flip-flop in the sequence to double the counting period. For example, a count-by-8 Johnson counter has the form:
[Figure: four D flip-flops connected as a shift register, with the complement of the last output fed back to the first D input, and common initialize and clock lines]
with a counting sequence 0000, 1000, 1100, 1110, 1111, 0111, 0011, 0001.
Note that 8 states are unused, so either the counter has to be forced to its counting sequence, or it has to be initialized.
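The behavior of these shift-register counters is easy to check by simulation. The following is a minimal sketch, not part of the original notes: the function names are hypothetical, states are written as (Q0, Q1, Q2, Q3), and the shift is assumed to run from Q0 toward Q3 with the feedback bit entering at Q0.

```python
def ring_step(q):
    """One clock of the 4-stage self-correcting shift-register counter.

    q is (Q0, Q1, Q2, Q3). The feedback bit is Q0' Q1' Q2' (a NOR),
    used in place of the plain circular feedback from Q3.
    """
    feedback = int(not (q[0] or q[1] or q[2]))
    return (feedback, q[0], q[1], q[2])


def johnson_step(q):
    """One clock of a 4-stage Johnson counter: feed back the complement of Q3."""
    return (1 - q[3], q[0], q[1], q[2])


def cycles_to_sequence(start, step, sequence, limit=16):
    """Count the clocks needed to reach any state in the counting sequence."""
    state, n = start, 0
    while state not in sequence and n < limit:
        state = step(state)
        n += 1
    return n if state in sequence else None


ONE_HOT = {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)}
ALL_STATES = [tuple((i >> (3 - b)) & 1 for b in range(4)) for i in range(16)]

# Worst-case number of clocks before the counter is in its counting sequence.
worst_case = max(cycles_to_sequence(s, ring_step, ONE_HOT) for s in ALL_STATES)

# Johnson counting sequence starting from the initialized state 0000.
johnson_sequence = [(0, 0, 0, 0)]
for _ in range(7):
    johnson_sequence.append(johnson_step(johnson_sequence[-1]))

print(worst_case)
print(johnson_sequence)
```

Every one of the 16 starting states reaches the one-hot sequence, which is what makes the modified counter self-starting; the Johnson sequence visits 8 distinct states before repeating.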

Barrel Shifter:
Recall that a shift register shifts 1 bit at a time. If the shift amount is more than 1, then the process has to be repeated until the specified amount of shifting has been accomplished. A barrel shifter uses multiplexers to determine the shift so that it can be accomplished in one cycle. For a 4-bit register, a barrel shifter accomplishing a circular shift right of 0, 1, 2, or 3 (specified via (s1,s0)) is structured as follows:
[Figure: four D flip-flops, each with its D input driven by a 4 to 1 MUX; all four multiplexers share the address lines s1, s0]
Note that each flip-flop is controlled by a multiplexer, which is used to select the input sent to the flip-flop. The multiplexer's function is to route the value selected according to its address lines to the flip-flop's input. To set up the circuit as a shift register, the 4 multiplexer input data lines are simply hooked up to the flip-flop outputs so that each address matches a shift value, with address 0 matching a shift of 0, address 1 matching a shift of 1, and so forth. Thus, the amount of the shift is entered via the address lines (s1,s0). The circuit can be reconfigured for different shift patterns by simply hooking up the multiplexer input data lines to the flip-flop outputs (or other data values) in different ways.
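The wiring rule described above can be sketched in software. This is an illustrative model only: the names mux4 and barrel_shift_right are hypothetical, the register is written as [Q0, Q1, Q2, Q3], and "right" is assumed to mean movement from Q0 toward Q3.

```python
def mux4(inputs, s1, s0):
    """4 to 1 multiplexer: route the input selected by address (s1 s0)."""
    return inputs[(s1 << 1) | s0]


def barrel_shift_right(q, s1, s0):
    """Circular shift right of a 4-bit register by (s1 s0) in a single step.

    For flip-flop i, multiplexer input k is wired to Q[(i - k) mod 4],
    so address k produces a shift of k positions.
    """
    n = len(q)
    next_state = []
    for i in range(n):
        mux_inputs = [q[(i - k) % n] for k in range(4)]
        next_state.append(mux4(mux_inputs, s1, s0))
    return next_state


register = [1, 0, 0, 0]
for amount in range(4):
    s1, s0 = amount >> 1, amount & 1
    print(amount, barrel_shift_right(register, s1, s0))
```

Changing the list comprehension that builds mux_inputs corresponds to rewiring the multiplexer data lines, which is how the circuit would be reconfigured for a different shift pattern.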

Glitches and hazards
Physically, there is a time lag in a combinational circuit from the point in time that input signals are applied until their effect propagates through the various components of the circuit and the outputs react to the inputs. This is called the propagational delay. A manufacturer may include the expected propagational delay as a part of circuit specifications. Propagational delay is a physical reality with consequences that may affect circuit behavior, particularly that of a sequential circuit.

To illustrate this point, consider the circuit given by
f = AB + A'C
(where ' denotes complement). Assume that f is implemented by
[Figure: gate implementation of f with component delays Δt1, Δt2, Δt3]
where Δt1, Δt2, Δt3 give the propagational delay associated with the delineated components. Assume also that Δt1 > Δt2.

For purposes of illustration, suppose that the inputs A, A', B, B', C, C' change (synchronously) according to the timing pattern
[Figure: timing diagram of inputs A, B, C switching between logic 0 and logic 1]

If we extend the timing diagram to track the circuit components as they react to the inputs, we obtain the following:
[Figure: timing diagram tracking the inputs, the product terms AB and A'C, and f = AB + A'C, with delays Δt1, Δt2, Δt3 marked and a glitch visible in f]
Delays Δt1 and Δt2, coupled with the changing values of A, B, C, produce a signal variance in the expected value of f that would not happen in the absence of propagational delay. This variance, called a circuit glitch, appears in the form of a brief pulse, which could possibly trigger a state change elsewhere in the circuit. The component organization which causes it is called a hazard.

An examination of the K-map for f is instructive in determining the source of the hazard.
[Figure: K-map for f with rows A and columns BC]
As can easily be determined, the formulation for f we started with is in fact a minimal sum-of-products expression, using two of the three prime implicants of f. These two prime implicants (AB and A'C) cover the third prime implicant, BC, which is usually considered unnecessary, since logically
f = AB + A'C = AB + A'C + BC

Assuming some appropriate propagational delay for BC (Δt4), consider what happens to the timing diagram when using the formulation
f = AB + A'C = AB + A'C + BC
[Figure: timing diagram tracking the inputs, the product terms AB, A'C, and BC, the sum AB + A'C, and f = AB + A'C + BC]
It is evident that adding the logically redundant term back into the expression has eliminated the glitch!

There are some subtle points to consider. The assumption that inputs A, A', B, B', C, C' change synchronously is critical. For example, consider the following (admittedly nonsensical) construction using separate NOT and AND gates (with propagational delays as indicated):
[Figure: input A fed to an AND gate both directly and through a NOT gate, forming A A', with delays Δt1 and Δt2]

The timing diagram for this circuit is as follows:
[Figure: timing diagram of A, A' (after delay Δt1), and A A' (after delay Δt2), showing a brief pulse in A A']
Even for a simple construction such as this (or a similarly constructed prime implicant) a glitch is experienced. In our first example, such problems were avoided by synchronizing all inputs (including complements) to the prime implicants.

In practice, glitches are not a great concern for combinational circuits (especially since the outputs are typically used to drive devices slow to react, such as lights). Although it is possible that a glitch's duration may be too short for a component such as a flip-flop to react, their presence is an obvious cause for concern in sequential circuits, which may change state unexpectedly (and hence perform incorrectly) on a misplaced signal pulse.

In general, when inputs to prime implicants (or implicates) are synchronized in a combinational circuit, circuit hazards occur where two prime implicants (or implicates) that are non-overlapping have adjacent cells. Adding back the logically redundant (non-essential) prime implicants (or implicates) serves to eliminate the hazards causing such glitches. It should be noted that under this scenario, there may be a dramatic difference in the choice between using the sum-of-products form or the product-of-sums form. For example, consider the K-map
[Figure: K-map with rows A and columns BC]
There are 6 prime implicants and 2 prime implicates, yielding the following two hazard free formulations:
A'B + B'C + A'C + AB' + BC' + AC'
(A + B + C)(A' + B' + C')
It is evident that the product-of-sums expression is simpler. This occurs because the removal of circuit hazards from the sum-of-products

form requires adding in prime implicants that are logically nonessential or redundant.

There are other alternatives. If a particular input combination triggering a glitch does not occur in actual implementation, then the associated hazard does not need to be addressed. Another strategy is to employ a flip-flop (perhaps a D flip-flop on the trailing edge of the input synchronization) to latch the value of f at a point after all glitches have occurred. Under this scenario, the circuit performance is slowed until the flip-flop outputs are set.

A third alternative is to use an added synchronizing signal to hold the output at a (known) fixed value until the danger of glitches is past. In general this strategy takes the form:
[Figure: inputs feed a glitch-prone combinational circuit whose outputs are gated by a synch signal to produce synchronous outputs]
In this case, there is the added complication of having to provide careful timing for the added "synch" signal. By setting the synch signal to 0 at the beginning of each cycle of input synchronization, all outputs of the glitch-prone circuit can be held at 0 through the setup period when glitches are likely to occur, regardless of the presence of hazards. When the chance for glitches is past, the synchronizing signal is then changed to allow each output to cleanly switch to its logical value for the current set of inputs. This strategy does not have the longer implicit delay that is present in our second alternative, but does require close coordination with the system signal that is being used to synchronize the circuit inputs.

At this point it should be noted that every strategy for dealing with glitches (even the one of removing hazards) has an element of synchronization associated with it. This is the primary reason that asynchronous sequential circuits have limited utility.
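The rule that hazards occur where two adjacent cells of the K-map are covered only by different prime implicants can be checked mechanically. The sketch below is an illustration under stated assumptions rather than the notes' own method: it takes the example function to be f = AB + A'C, represents each product term as a dictionary of required variable values, and uses hypothetical helper names.

```python
from itertools import product

# Product terms as dictionaries: variable -> required value.
SOP = [{"A": 1, "B": 1}, {"A": 0, "C": 1}]      # AB + A'C
SOP_WITH_BC = SOP + [{"B": 1, "C": 1}]          # AB + A'C + BC


def covers(term, point):
    """True when the product term is 1 at the given input point."""
    return all(point[var] == val for var, val in term.items())


def evaluate(terms, point):
    """Value of the sum-of-products expression at the given input point."""
    return int(any(covers(t, point) for t in terms))


def static_1_hazards(terms, names=("A", "B", "C")):
    """List adjacent pairs of 1-cells that no single product term covers."""
    hazards = []
    for values in product((0, 1), repeat=len(names)):
        p = dict(zip(names, values))
        for var in names:
            if p[var] == 0:                 # visit each adjacent pair once
                q = dict(p)
                q[var] = 1
                both_one = evaluate(terms, p) and evaluate(terms, q)
                shared = any(covers(t, p) and covers(t, q) for t in terms)
                if both_one and not shared:
                    hazards.append((values, var))
    return hazards


print(static_1_hazards(SOP))           # one hazard: B = C = 1 while A changes
print(static_1_hazards(SOP_WITH_BC))   # no hazards once BC is added back
```

The two expressions agree on every input, so the extra term changes nothing logically; it only ensures that some product term stays at 1 across the A transition.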

Constructing memory:
Generally a memory block is organized to have
- address lines to determine which bits in the block to access
- bidirectional data lines to send data to an addressed location in memory (write operation) or retrieve data from the addressed location (read operation)
- a R/W line to specify a read operation or a write operation
- an enable line to activate the memory block for read or write access

A single bit is a 1 × 1 block of memory and can be represented by a flip-flop (no address line is needed).
[Figure: 1 × 1 memory cell built from a D flip-flop, with enable (CS), R/W, and a bi-directional data line]

A 2 × 1 block of memory can now be constructed from two 1 × 1 cells. A 1 of 2 decoder is needed to address the cell wanted:
[Figure: 2 × 1 block built from two 1 × 1 cells, with the address line driving a 1 of 2 decoder that selects a cell via CS; R/W and the data line d are shared]

A 4 × 1 block can be constructed from two 2 × 1 blocks using a 1 of 2 decoder, or from four 1 × 1 blocks using a 1 of 4 decoder. These two equivalent constructions appear as:

[Figure: two equivalent constructions of a 4 × 1 block: two 2 × 1 blocks selected by a 1 of 2 decoder, and four 1 × 1 cells selected by a 1 of 4 decoder; R/W and the data line d are shared in both]
Note that for the construction using two 2 × 1 cells, a1 selects a 2 × 1 cell and a0 selects a bit from within the cell. In effect, when larger memory blocks are constructed from smaller memory blocks, the higher order bits of the address are used to select one of the smaller blocks and the lower order bits are used to select the data item from within the selected smaller block.

The memory modules in the 4 × 1 block can be arranged to construct a 2 × 2 block with 2 data lines, instead of 1:
[Figure: 2 × 2 block with 1 address line, 2 data lines, CS, and R/W]
A memory chip has a fixed capacity in bits, which can be organized either in favor of addressability (4 × 1 requires more address lines than 2 × 2) or in favor of data groups (2 × 2 provides 2 bits per data group vs. 1 bit for 4 × 1).
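The split of an address into a block-select part and a within-block part can be modeled directly. The following is a minimal sketch with hypothetical names (BlockMemory, select); it assumes the block size is a power of 2 and stores one bit per location.

```python
class BlockMemory:
    """A memory built from equal-sized smaller blocks (sizes are powers of 2)."""

    def __init__(self, n_blocks, block_size):
        # Number of low order address bits used inside each block.
        self.offset_bits = block_size.bit_length() - 1
        self.blocks = [[0] * block_size for _ in range(n_blocks)]

    def select(self, addr):
        """High order bits pick the block (the decoder's job);
        low order bits pick the location within that block."""
        block = addr >> self.offset_bits
        offset = addr & ((1 << self.offset_bits) - 1)
        return block, offset

    def write(self, addr, bit):
        block, offset = self.select(addr)
        self.blocks[block][offset] = bit

    def read(self, addr):
        block, offset = self.select(addr)
        return self.blocks[block][offset]


# A 4 x 1 memory built from two 2 x 1 blocks: a1 selects the block, a0 the bit.
memory = BlockMemory(n_blocks=2, block_size=2)
memory.write(3, 1)
print(memory.select(3), memory.read(3))
```

Using BlockMemory(n_blocks=4, block_size=1) models the other construction, in which both address bits go to the decoder and none are needed inside a cell.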

Note that in general, accessing a location in memory requires a large decoder. In practice, a 1 of 16 decoder requires a 24 pin package (4 address lines, 16 data lines, Vcc, GND, and CS), which indicates building a large decoder as a single chip is impractical. However, it is easy to build larger decoders from smaller ones; for example, a 1 of 64 decoder can be constructed from five 1 of 16 decoders as follows:
[Figure: address lines a5 through a0; one 1 of 16 decoder (with some outputs unused) drives the CS inputs of four 1 of 16 decoders, which together supply the 64 output lines]
This structure can obviously be extended to provide a decoder for any address requirement (albeit by using a lot of chips; for this reason, address decoding is normally a built-in feature of a memory chip). Hence, arbitrarily large memory blocks can be constructed.

Memory is generally classified as
- RAM memory - Random Access Memory (so-called since any randomly generated address can be accessed directly, which contrasts with a serial memory such as a magnetic tape). RAM memory can also be both read from and written to.
- ROM memory - Read Only Memory (non-volatile memory with a fixed content that can be read from, but not written to).

There are multiple varieties of ROM, some of which can be rewritten and some not. For example, EPROM (erasable programmable ROM) is ROM which can be erased (by ultra-violet exposure) and is written by special circuitry operating at a higher voltage; PLAs

(programmable logic arrays) start as a rectangular array of fusible links which are selectively blown to create (permanent) bit patterns that then form a ROM; FPGAs (field programmable gate arrays) are another variation, and can be rewritten with special circuitry; CD-ROMs are yet another and may be either rewriteable or not.

Memory is organized in a 2^i × 2^j array of bits with i address lines and 2^j data lines. The number of data lines is called the word size of the memory. If j=3, then the word size is 8. Since 8 bits is a byte, the memory would then have a capacity of 2^i bytes. If j=5, then the word size is 32, or 4 bytes. A 256 × 8 memory has 8 address lines (since 2^8 = 256) and 8 data lines.

For RAM memory, data lines are bi-directional and the memory includes both R/W and enable control lines. The overall memory configuration has the appearance:
[Figure: memory block with address lines, bi-directional data lines, R/W, and enable]
As already noted, larger memory units can be constructed from smaller ones by arranging the blocks in a grid, tying all R/W lines together, and using a decoder to select rows.

Example: Construction of a 256 byte memory with word size of 4 bytes using 16 byte memory modules (16 × 1 byte).
The specification calls for 256/4 = 64 words. Each word has 4 bytes, so there are 32 data lines. To get 64 words using 16 byte modules, there need to be 64/16 = 4 rows, each having 4 modules. 256 bytes organized as 64 words requires 6 address lines (2^6 = 64). Hence, the memory should appear as a 4 × 4 grid with 6 address lines and 32 bi-directional data lines.
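The arithmetic in this example generalizes to any module size. The helper below is a small sketch with a hypothetical name and parameter list; it assumes all sizes are powers of 2 and that each module is one byte wide unless stated otherwise.

```python
def grid_layout(total_bytes, word_bytes, module_words, module_width_bytes=1):
    """Work out the module grid for a memory built from smaller modules."""
    words = total_bytes // word_bytes
    rows = words // module_words                   # modules stacked by address
    columns = word_bytes // module_width_bytes     # modules side by side per word
    return {
        "words": words,
        "rows": rows,
        "columns": columns,
        "data_lines": 8 * word_bytes,
        "address_lines": (words - 1).bit_length(),
        "decoder_address_bits": (rows - 1).bit_length(),
        "module_address_bits": (module_words - 1).bit_length(),
    }


layout = grid_layout(total_bytes=256, word_bytes=4, module_words=16)
for name, value in layout.items():
    print(name, value)
```

For the 256 byte example this gives a 4 × 4 grid in which 2 address bits go to the row decoder and the remaining 4 go to every module.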

[Figure: 4 × 4 grid of 16 byte memory modules. Address lines a5 and a4 drive a 1 of 4 decoder whose outputs enable (CS) one row of modules; address lines a3 through a0 go to every module. The four columns supply data lines d0...d7, d8...d15, d16...d23, d24...d31.]
Note that the high order bits of the address are tied to the 1 of 4 decoder and the 4 lower order bits address the memory modules across each row. The decoder activates a row and the lower order bits select a word within that row. The R/W lines are omitted because they are all tied together.

The addressing requirement can be reduced by using memory blocks that require 2 select (enable) inputs (S1,S2).
[Figure: memory block with select inputs S1 and S2, R/W, and data lines]
Arranging these in a rectangular grid sharply reduces the decoder requirement; e.g., a 256 = 2^8 byte memory module requires a 1 of 256

decoder. If blocks using 2 select inputs are employed, and the memory is arranged in a 16 × 16 grid, with a 1 of 16 decoder accessing the S1 lines and another 1 of 16 decoder accessing the S2 lines, then all blocks are accessed and only 32 decoder lines have been used instead of 256! Building decoding into a memory module obviously reduces the need for large external decoding circuits.

Memory sizes are given by employing standardized prefixes as follows:

International Unit (base 10) Prefixes
yotta 10^24      deci  10^-1
zetta 10^21      centi 10^-2
exa   10^18      milli 10^-3
peta  10^15      micro 10^-6
tera  10^12      nano  10^-9
giga  10^9       pico  10^-12
mega  10^6       femto 10^-15
kilo  10^3       atto  10^-18
hecto 10^2       zepto 10^-21
deca  10^1       yocto 10^-24

These are used directly with base 10 measures; e.g.,
picosecond (1 trillionth of a second = 10^-12 second)
millimeter (1 thousandth of a meter = 10^-3 meter)
megaflop (1 million floating point operations per second = 10^6)
They are also used with measures based on K = 2^10 = 1024 ≈ 10^3; e.g., gigabyte (approximately 1 billion bytes).

Sequential circuit clock speed is measured in Hertz, where 1 Hertz = 1 Hz = 1 cycle per second. This is a measure CPU manufacturers often cite with respect to processor speed (e.g., a 2.5GHz processor has speed measured in GigaHertz).

Example: 100 MegaHertz = 100 MHz = 100 × 10^6 Hz = 10^8 cycles per second. 10^8 cycles per second is 10/10^9 seconds per cycle, or 10 nanoseconds per cycle.

Memory generally operates at slower speeds than the processor, which means it is accessed asynchronously (on a different clock timing). A delay of 30 nanoseconds is 30/10^9 seconds. If signals need to occur at no more than 1/3rd this rate, then clock pulses are limited to one every 90/10^9 seconds, implying a clock speed of no more than 10^9/90 Hz, i.e., no faster than 11 MHz.
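The conversions between clock speed and cycle time are simple enough to script. This is a small sketch with hypothetical function names; the 100 MHz and 30 nanosecond figures are used as illustrative values.

```python
def period_ns(frequency_hz):
    """Clock period in nanoseconds for a given clock speed in Hz."""
    return 1e9 / frequency_hz


def max_clock_mhz(min_period_ns):
    """Fastest clock (in MHz) whose period is at least min_period_ns."""
    return 1e3 / min_period_ns


# 100 MHz is 10^8 cycles per second, i.e., 10 nanoseconds per cycle.
print(period_ns(100e6))

# A 30 ns delay taken three times over limits the clock period to 90 ns.
limit_mhz = max_clock_mhz(3 * 30)
print(round(limit_mhz, 2))

# The binary K used for memory sizes differs slightly from the base 10 prefix.
print(2 ** 10, 10 ** 3, 2 ** 30, 10 ** 9)
```

The last line shows why a gigabyte counted with K = 1024 is a little more than a billion bytes.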

Implementing Circuits Using ROMs:
We have already observed that combinational circuits can be implemented by discrete logic gates or by using higher order circuits such as decoders and multiplexers. They can also be implemented by using ROMs.

Combinational circuits:
The information in the truth table specification for a combinational circuit can be viewed as specifying the contents for a ROM implementation of the circuit; e.g., the circuit specification for the function f below can be implemented by an 8 × 1 ROM whose contents are given by the specification:

Specification for f        8 × 1 ROM
X Y Z   f                  Address   Contents
0 0 0   1                  000 :     1
0 0 1   0                  001 :     0
0 1 0   0                  010 :     0
0 1 1   0                  011 :     0
1 0 0   1                  100 :     1
1 0 1   1                  101 :     1
1 1 0   0                  110 :     0
1 1 1   1                  111 :     1
Inputs = Address (X = A2, Y = A1, Z = A0); Data = Output f

For contrast, recall the alternative approaches for the same specification as illustrated below:

K-map analysis and logic gate implementation:
[Figure: K-map for f with rows X and columns YZ, and the resulting gate circuit]
f = Y'Z' + XZ = (Y + Z)' + XZ
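That a ROM lookup and the gate expression describe the same function can be confirmed by exhaustive comparison. The sketch below assumes f = Y'Z' + XZ, that is, f is 1 exactly for inputs 000, 100, 101, and 111; the function names are hypothetical.

```python
from itertools import product

# ROM contents at addresses 0..7, where the address is formed as X Y Z.
ROM = [1, 0, 0, 0, 1, 1, 0, 1]


def f_rom(x, y, z):
    """ROM implementation: the inputs form the address, the data is f."""
    return ROM[(x << 2) | (y << 1) | z]


def f_gates(x, y, z):
    """Gate implementation from the K-map: f = Y'Z' + XZ."""
    return int((not y and not z) or (x and z))


def f_mux(x, y, z):
    """4 to 1 multiplexer addressed by X Y, with data inputs Z', 0, 1, Z."""
    return [1 - z, 0, 1, z][(x << 1) | y]


for x, y, z in product((0, 1), repeat=3):
    print(x, y, z, f_rom(x, y, z), f_gates(x, y, z), f_mux(x, y, z))
```

The ROM version needs no analysis at all, which mirrors the tradeoff between analysis effort and component complexity.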

Multiplexer implementation:
X Y   f
0 0   Z'
0 1   0
1 0   1
1 1   Z
[Figure: 4 to 1 MUX with address lines X, Y and data inputs 0 through 3 wired to Z', 0, 1, Z, producing f]

Decoder implementation:
[Figure: 1 of 8 decoder with inputs X, Y, Z; outputs (0), (4), (5), (7) are combined to produce f]

The K-map logic gate approach requires the most analysis but uses the simplest components. The contrasting ROM implementation requires the least analysis, but this advantage is offset by having to burn the desired contents into each ROM memory cell.

Sequential Circuits:
In a similar fashion, in the transition/output table for a sequential circuit, the current state and input columns can be viewed as providing ROM addresses that point to memory locations where the next state information is stored. Circuit outputs can likewise be stored at ROM addresses pointed to by the current state and input columns. To illustrate this, consider the following state diagram for a sequential circuit:

[Figure: state diagram with states S, T1, T2, T3, T4, and H; transitions are labeled with the inputs A, B, C, D and the transition output]
Suppose that the states and inputs are encoded as follows and that state outputs are given by variables O1, O2, O3, O4 as indicated:
[Tables: encoding of the states S, T1, T2, T3, T4, H as Q2 Q1 Q0; encoding of the inputs A, B, C, D as X Y; state outputs O1 O2 O3 O4 for each state]
The transition/output table corresponding to the state diagram and based on this encoding is as follows:

[Table: transition/output table with the current state Q2 Q1 Q0 and inputs X Y, the next state Q2n Q1n Q0n (ROM data M2 M1 M0), and the transition output Z, for states S, T1, T2, T3, T4, H]
[Figure: ROM implementation. A 32 × 4 ROM is addressed by A4 = Q2, A3 = Q1, A2 = Q0, A1 = X, A0 = Y and holds M2, M1, M0, Z; M2, M1, M0 drive the D inputs of the three D flip-flops holding Q2, Q1, Q0. An 8 × 4 ROM addressed by B2 = Q2, B1 = Q1, B0 = Q0 holds the state outputs O1, O2, O3, O4.]
Z is the output on transitions. Note that the current state information is maintained in the 3 D flip-flops given by Q2, Q1, Q0. The next state is given by the memory data lines labeled M2, M1, M0. The M2, M1, M0 output values are applied to the inputs of the D flip-flops, ready to be latched on the transition to the next state. Since the output associated with each state does not rely on the transition inputs X, Y, a smaller memory unit is sufficient for representing this requirement of the circuit specification.

The implementation is almost a direct transliteration of the truth table specification for the circuit, which requires considerably

less analysis than implementing the circuit using gate logic. The downside again is the need to program ROMs for the circuit specification. If FPGAs or similar ROMs are available along with the means to program them, then this approach becomes a good choice for implementation, especially in light of the fact that it requires relatively few connections.

Hamming code:
Adding a parity bit to a sequence of data bits provides an encoding of the data that enables detection of the presence or absence of error in the data, so long as at most 1 bit is at fault. If there is an erroneous bit, the approach does not identify it, however. The idea of parity bits can be easily extended to provide means for not only detecting the presence of an erroneous bit, but also the means for locating and correcting it. This kind of encoding is called an error correcting code. There are error correcting code techniques that will detect and correct multiple bit errors. Hamming code provides an introduction to the idea behind these coding techniques. We will only consider Hamming's single error detection/correction code.

First view data as occurring at positions 1,2,3,... To show the concept, we first limit ourselves to 15 data positions. Consider the position numbers listed in binary and observe the column patterns of 1's:

position  dcba
 1        0001
 2        0010
 3        0011
 4        0100
 5        0101
 6        0110
 7        0111
 8        1000
 9        1001
10        1010
11        1011
12        1100
13        1101
14        1110
15        1111

a: 1's at 1,3,5,7,9,11,13,15
b: 1's at 2,3,6,7,10,11,14,15
c: 1's at 4,5,6,7,12,13,14,15
d: 1's at 8,9,10,11,12,13,14,15

Any position is identified by the columns it has 1's in (i.e., 13 occurs only in columns a,c,d and 2 only occurs in column b). To elaborate, if there is a bit error in position 13, then a parity check of the positions given by d will identify the problem bit as being in one of 8,9,...,15. A parity check of the positions given by c reduces this list to one of 12,13,14,15. A parity check of b doesn't include 13 and so doesn't find any error, which eliminates 14,15 and reduces the list to one of 12,13.

A parity check of the positions given by a identifies 13 as the culprit.

To summarize, if parity checks are conducted on the bit positions identified by 1's in each of columns a,b,c,d then an error in a bit position will result in a parity error for 1 or more of these checks. The combination of the parity errors precisely locates the bit position causing the error. There are 15 data positions, and we need 4 parity bits, so that leaves up to 11 bits available for user data. With 5 parity checks, there would be 31-5=26 bits available for user data.

If we assume the data is in bytes (i.e., we have 8 user bits), then adding on 4 bits for parity checking results in a 12 bit encoding of the data. If the parity bits are simply appended to the user bits, then some difficulty will occur in setting them. This can be avoided if the parity bits are placed at the positions which occur in only 1 column (those with a single 1: positions 1,2,4,8). The user data then occupies the remaining positions 3,5,6,7,9,10,11,12, and positions 1,2,4,8 receive the corresponding parity check. For even parity, each parity bit is set so that the number of 1's among the positions covered by its check (column a for position 1, b for position 2, c for position 4, d for position 8) is even.

If the encoded data is transmitted and the received data produces the parity check results
a) OK
b) error
c) error
d) OK
then dcba = 0110, which identifies position 0110 = 6 as the one in error.
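The encoding and error location steps can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the data byte is an arbitrary example rather than the one used in the notes, even parity is assumed, and the code word is held in a list indexed by position with index 0 unused.

```python
PARITY_POSITIONS = [1, 2, 4, 8]
DATA_POSITIONS = [3, 5, 6, 7, 9, 10, 11, 12]


def encode(data_bits):
    """Place 8 data bits and 4 even-parity bits into positions 1..12."""
    code = [0] * 13                       # index 0 is not used
    for position, bit in zip(DATA_POSITIONS, data_bits):
        code[position] = bit
    for p in PARITY_POSITIONS:
        # The check for p covers every position whose binary form has bit p set.
        covered = [code[i] for i in range(1, 13) if i & p and i != p]
        code[p] = sum(covered) % 2
    return code


def locate_error(code):
    """Run the four parity checks; the failing checks spell out the position."""
    position = 0
    for p in PARITY_POSITIONS:
        if sum(code[i] for i in range(1, 13) if i & p) % 2:
            position |= p
    return position                       # 0 means every check passed


word = encode([1, 0, 1, 1, 0, 0, 1, 0])   # arbitrary example byte
received = list(word)
received[6] ^= 1                          # corrupt position 6 in transit
print(locate_error(word), locate_error(received))
```

Flipping the bit at the reported position restores the original code word, which is the correction step.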

Note that setting the parity bit can be accomplished simply by using XOR; e.g., if the code word is notated by C[ ], then the parity bits are obtained by
C[1] = C[3] ⊕ C[5] ⊕ C[7] ⊕ C[9] ⊕ C[11]
C[2] = C[3] ⊕ C[6] ⊕ C[7] ⊕ C[10] ⊕ C[11]
C[4] = C[5] ⊕ C[6] ⊕ C[7] ⊕ C[12]
C[8] = C[9] ⊕ C[10] ⊕ C[11] ⊕ C[12]

If an overall parity check is included at position 0, then the Hamming code word extended by this bit becomes a single error correcting, double error detecting code. The following 4 cases cover all possibilities for 2 or fewer errors:
1. no parity error, no Hamming error: no error detected
2. no parity error, Hamming error: double error detected
3. parity error, no Hamming error: parity bit in error
4. parity error, Hamming error: correctable error detected

This is easy to see:
If there are no errors, there are no parity errors for any of the checks and no error correction is needed. This is the no parity error, no Hamming error case.
If 2 bits are in error in the overall code word, then the overall parity will be unaffected; i.e., the overall parity check will find no error. On the other hand, since at least one of the errant bits is in the Hamming code word, the Hamming parity checks will flag an error. This is the no parity error, Hamming error case, and flags occurrence of a double error. In this case error correction no longer applies, since there is no way to determine which 2 bits are in error, even if one of them happens to be the parity bit, but the double error has been detected.
If a single bit is in error then an overall parity error will be flagged. If the bit is the parity bit, then the Hamming code word generates no errors. This is the parity error, no Hamming error case, and the parity error can be corrected by changing the parity bit (so single error correction remains in effect).
If a single bit is in error and it is in the Hamming code word, then the Hamming parity checks locate the position of the bit. This is the parity error, Hamming error case, and the error can be corrected using the Hamming decoding technique.

This covers all possibilities of 0, 1, or 2 errors being present. If more than two errors are present, one of these cases will occur, but the result will be erroneous.
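The four cases can be exercised with a short simulation. The sketch below makes the same assumptions as before (even parity, a 12 bit Hamming code word in positions 1 through 12, overall parity at position 0); the function names and the example byte are hypothetical.

```python
def make_codeword(data_bits):
    """Build a 13-bit word: overall parity at 0, Hamming code word at 1..12."""
    code = [0] * 13
    for position, bit in zip([3, 5, 6, 7, 9, 10, 11, 12], data_bits):
        code[position] = bit
    for p in (1, 2, 4, 8):
        code[p] = sum(code[i] for i in range(1, 13) if i & p and i != p) % 2
    code[0] = sum(code[1:]) % 2           # makes the overall parity even
    return code


def classify(code):
    """Apply the overall parity check and the Hamming checks to a received word."""
    parity_error = sum(code) % 2 == 1
    syndrome = 0
    for p in (1, 2, 4, 8):
        if sum(code[i] for i in range(1, 13) if i & p) % 2:
            syndrome |= p
    if not parity_error and syndrome == 0:
        return "no error detected"
    if not parity_error:
        return "double error detected"
    if syndrome == 0:
        return "parity bit in error"
    return f"correctable error at position {syndrome}"


def flip(code, *positions):
    """Return a copy of the word with the given bit positions inverted."""
    damaged = list(code)
    for position in positions:
        damaged[position] ^= 1
    return damaged


word = make_codeword([0, 1, 1, 0, 1, 0, 0, 1])   # arbitrary example byte
print(classify(word))
print(classify(flip(word, 0)))
print(classify(flip(word, 6)))
print(classify(flip(word, 3, 9)))
print(classify(flip(word, 0, 5)))
```

The last call shows the subtle case from the text: one of the two errant bits is the overall parity bit, and the word is still reported as a double error rather than being miscorrected.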

Computer Systems Level

Representing numeric fractions:
Earlier we examined data representation formats for integers, Boolean values, and characters. A full processing environment also needs to include a representation format for fractions. The systems circuitry that implements these kinds of data manipulations is called the arithmetic and logic unit (ALU).

One of the things that has to be considered in designing a system is whether a feature should be implemented in hardware or software. For example, floating point numbers can be implemented either in circuitry or by software. If implemented in software, the specification for the representation format can be easily changed. If implemented in hardware, then it is advantageous to use a representation standard, since changes at the hardware level carry more severe penalties than changes at the software level.

The term floating point numbers is used because the representation employed is based on scientific notation, where the value is approximated by floating the decimal point until only one digit is to the left of the decimal point, marking the magnitude by keeping track of the power of 10 necessary to restore the decimal point's location. Hence, the basic format has the form:
<±> <d.ddd...d> × 10^<exponent>

An arithmetic operation for numbers in this format utilizes the arithmetic operations for integers, but requires special handling for exponents and normalization. Normalization is the process of manipulating a result by adjusting the exponent, floating the decimal point until there is only one digit to its left.

Normalization examples:
A value with three digits to the left of the decimal point is normalized by adding 2 to the exponent (the decimal point floats two positions to the left, and the exponent is increased by 2 to compensate).
A value whose first nonzero digit is six places to the right of the decimal point is normalized by subtracting 6 from the exponent (the decimal point floats six positions to the right, and the exponent is decreased by 6 to compensate).
Multiplication and division are straightforward.
Multiplication: set the sign, multiply the mantissas, add the exponents, normalize and round.
Division: set the sign, divide the mantissas, subtract the exponents, normalize and round.

Addition and subtraction require exponent manipulation since the digits have to be lined up according to position.
Addition/subtraction: adjust the number with the smaller magnitude to match the exponent of the one with larger magnitude, then add/subtract the mantissas, normalize and round.
Another way to look at this is that addition and subtraction require moving the decimal point for the smaller number until the magnitudes of the two numbers match.

In the computer context, base 10 is not the natural base to employ. In particular, on IBM mainframes (360 series), floating point numbers are hexadecimal based, using a 64-bit format developed by IBM for their systems. On these systems, IBM also employed its own character encoding format (EBCDIC). For obvious reasons, it is not advisable for a single manufacturer to dictate standard formats, so neutral groups, in which representatives from many manufacturers participate, develop and promulgate standards for general adoption by industry. Industry recognizes that lack of standard representation formats complicates the portability of data among systems. Systems that do not conform to standards eventually lose market appeal as more and more competing companies adopt recognized standards. As discussed earlier, the ASCII character encoding format has been widely adopted and integers are now almost always represented in 2's complement, rather than the 1's complement format.

The most widely adopted floating point standard is the IEEE 754 standard. It employs the biased exponent concept used in IBM's format, but in contrast employs a base 2 format rather than hexadecimal. Note that it is the binary point that floats, rather than the decimal point.

In contrast to 2's complement, there is no natural underlying finite algebra for floating point numbers. Hence, a sign-magnitude representation, with its implicit complications for managing arithmetic, is employed.
For this reason, in early computational machines, floating point computations were almost always handled via software to hold down the size of the computational circuits. Floating point circuitry is now integrated into most processors and for almost all of them is compliant with the IEEE standard.

IEEE 754 Floating Point Standard
The IEEE 754 floating point standard provides a standard way of representing fractional quantities based on standard scientific notation (in base 2). The basic components for representing a number x are organized:
(-1)^<sgn> × 2^(<exponent> - <bias>) × 1.<mantissa>

32 bit single precision: sign (1 bit), exponent (8 bits), mantissa (23 bits)
base 2 exponent biased by 127 (range -126 to 127) [true exponent is (<exponent> - 127)]
64 bit double precision: sign (1 bit), exponent (11 bits), mantissa (52 bits)
base 2 exponent biased by 1023 (range -1022 to 1023) [true exponent is (<exponent> - 1023)]
80 bit extended precision: sign (1 bit), exponent (15 bits), mantissa (64 bits)
base 2 exponent biased by 16383 (range -16382 to 16383) [true exponent is (<exponent> - 16383)]

An exponent of all 1's is used to show an exception: with a mantissa of 0 it represents ±∞, depending on the sign; otherwise the mantissa provides the designation for an illegal operation.
For an exponent not all 0's (and not all 1's), the number is in normalized form, meaning the exponent and mantissa have been adjusted to produce a mantissa of the form 1.xxx...xxx. In the representation, the leading 1 is an implied leading 1 (providing an extra bit of precision). This is the usual way numbers are represented in floating point.
For an exponent all 0's, the number is too small to be normalized and so is represented unnormalized. 0 is given by a mantissa of 0 and the minimum exponent (all 0's).

Example: for a value whose normalized binary form is 1.xxx...x × 2^7, the biased exponent is 127 + 7 = 134 = 10000110 (base 2), which is the value stored in the exponent field of the IEEE 32 bit format.

Guard bits, rounding:
Guard bits are extra bits maintained during intermediate steps to minimize loss of precision due to use of routine arithmetic operations and rounding. The implied 1 under the IEEE format limits precision loss under multiplication, since the result of multiplying mantissas will always be greater than or equal to 1. However, even a simple multiplication of binary floating point values illustrates that a right shift may be needed to normalize the result (the product of two mantissas can be 2 or greater) and that the number of significant bits may double. A right shift may result in loss of precision since a significant bit may get shifted off of the end. By carrying an extra bit during intermediate steps, this effect can be countered. If an extra bit is carried for rounding, then an additional guard bit is needed to prevent an intermediate right shift from shifting away the rounding bit.

Rounding strategies:
The rounding technique is important, because you don't want loss of precision to cascade to a significant error when multiple calculations are being performed; hence, to be viable, the rounding strategy must balance, rounding up half the time and the other half rounding down.
1. Truncation
As a strategy, truncation is not viable since it always rounds down (up if the number is negative).
2. Von Neumann rounding
The strategy is to always set the least significant bit to 1; e.g., internally the IEEE mantissa (implied leading 1) is carried with two extra bits and has the form
1.dddd...dlee
where l is the least significant bit and ee are the guard bits (rounding and shift protect). Values whose least significant bit is 0 round up to 1.ddd...d1, and values whose least significant bit is 1 round down to 1.ddd...d1; i.e., half the time rounding is up and the other half it is down.
3. True rounding is the opposite of Von Neumann rounding
1.dddd...dlee becomes
1.dddd...d1 if ee = 10 or 11 (≥ 1/2)
1.dddd...d0 if ee = 00 or 01 (< 1/2)
Note that this simply requires assigning the first guard bit at end of computation to be the least significant bit.
Other considerations: Addition/subtraction can add to precision loss; for example, subtracting two nearly equal operands with 4 significant figures each can leave a result such as 1.0 × 2^-3, which has only 1 significant figure. If significant figures disappeared via earlier computations in obtaining the operands, then data which could be present in the final result has been lost. This suggests that the results of extended calculations in floating point should be carried in the highest precision format available, which is a function of programming rather than hardware.

Rules for processing floating point numbers:

Multiplication: The format for floating point numbers

   (-1)^<sgn> × 2^(<exponent> - <bias>) × 1.<mantissa>

is implicitly multiplicative, so determining the result requires the following steps:
   XOR the sign bits.
   Add the exponents. Since
      (<exponent1> - <bias>) + (<exponent2> - <bias>) = (<exponent1> + <exponent2>) - 2<bias>
   the hardware approach is to add the biased exponents obtained from the IEEE representation and subtract the bias.
   Multiply the mantissas, including the implied 1, and round the result; if the product of the mantissas is 2 or greater, shift it right by 1 and increment the exponent by 1 (this corresponds to floating the binary point left by 1).

Division:

   [(-1)^<sgn1> × 2^(<exponent1> - <bias>) × 1.<mantissa1>] / [(-1)^<sgn2> × 2^(<exponent2> - <bias>) × 1.<mantissa2>]
      = (-1)^(<sgn1> XOR <sgn2>) × 2^(<exponent1> - <exponent2>) × (1.<mantissa1>)/(1.<mantissa2>)

so the procedure is:
   XOR the sign bits.
   Subtract the biased exponents obtained from the IEEE representation and add the bias.
   Divide the mantissas, including the implied 1, and round the result; if the dividend mantissa is less than the divisor mantissa, shift the result left by 1 and decrement the exponent by 1 (this corresponds to floating the binary point right by 1).

Note: if the dividend mantissa is less than the divisor mantissa, the result is less than 1 and a normalization step is needed; however, the worst case scenario is a dividend mantissa of 1.0 and a divisor mantissa of 1.11...1, which gives a result greater than 1/2, and so normalization still only moves the binary point by 1 position.

Addition/Subtraction: these are easily implemented for integers, but require a good bit more attention for floating point.
   Increment the smaller exponent to match the larger one and shift its mantissa (including the implied 1) to the right by the increment amount. Note that this is the opposite of normalizing.
   Process addition/subtraction according to the signs of the two values, round, and normalize the result.
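The multiplication rule on this page can be sketched in Python by working directly on the three IEEE fields. This is a minimal sketch under stated assumptions, not a full implementation: the names `unpack32`, `pack32`, and `fp_multiply` are hypothetical, both operands are assumed to be normalized 32 bit values, the exceptional encodings (zero, infinity, unnormalized numbers) are ignored, and the extra product bits are truncated rather than rounded.

```python
import struct

BIAS = 127  # single precision exponent bias

def unpack32(x):
    """Return (sign, biased exponent, 23 bit mantissa) of a 32 bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def pack32(sign, exponent, mantissa):
    """Assemble the three fields back into a Python float."""
    bits = (sign << 31) | (exponent << 23) | mantissa
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x

def fp_multiply(a, b):
    s1, e1, m1 = unpack32(a)
    s2, e2, m2 = unpack32(b)
    sign = s1 ^ s2                       # XOR the sign bits
    exponent = e1 + e2 - BIAS            # add biased exponents, subtract the bias
    # Multiply the 24 bit significands (implied 1 restored).
    # The 48 bit product is scaled by 2^46 and lies in [1, 4).
    product = ((1 << 23) | m1) * ((1 << 23) | m2)
    if product >> 47:                    # product >= 2: normalize
        product >>= 1                    # right shift by 1 ...
        exponent += 1                    # ... and increment the exponent
    mantissa = (product >> 23) & 0x7FFFFF  # drop implied 1, truncate low bits
    return pack32(sign, exponent, mantissa)

print(fp_multiply(1.5, 1.5))   # the product 2.25 needs the normalizing shift
```

Because the low product bits are simply truncated, the sketch agrees with hardware only when the exact product fits in 24 significant bits.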

103 Register transfer logic: Page A computing device generally transforms data from one form to another over a series of steps. This is a characteristic of finite state automata, so in its concept a computing device is a (large) finite state machine. It is impractical to concoct a monolithic finite state automaton to describe a computer, so its architecture is instead described in terms of components and their interfaces. We have now seen how to construct sequential circuits that are large memory modules. We also have seen how specialized memory elements called registers can be used to provide the operands for data manipulation techniques such as arithmetic operations, comparison operations, shift operations, and the like. The results of such an operation, if not done directly on the register (such as happens with shift), can be captured in a target register. Conceptually, it appears wise to view memory and manipulation of data in different contexts, one for the storage and retrieval of data, and the other for performing operations on data. Registers are used to hold data retrieved from memory (or data ready to be stored in memory), where it can be accessed for data manipulation needs. Data can be easily moved from register to register; for example, to load two registers providing the operands for an adder circuit, or moved from a target register to a register designed to hold data ready to be stored in memory (ie., a register whose outputs are connected to memory data lines). Register transfer logic organizes registers in a manner which provides means of moving data among registers for purposes of applying various data manipulations to the data contained within them. 
A register transfer architecture provides an abstract realization of register transfer logic, conceptualizing data transfer and control in the following manner:

[Figure: memory, I/O, and data registers connected along a bus; a clocked control module (which has its own internal working registers) with control input, status signals, control signals, and control output; a feedback path.]

There may be more than one control module deployed. Control signals may need to be generated from outside the control module or passed on to other modules. The clock synchronizes control and data elements and may be suspended by a control signal (e.g., to allow asynchronous transfer of data to or from memory).

A register transfer language (RTL) provides a means for instantiating control modules and register elements for accomplishing a task. Transfer/control statements are executed sequentially with the clock. There is no standard register transfer language, but basic elements can be represented using the following notation:

1. data manipulation
   transfer (assignment operator): A ← B
      copy (transfer non-destructively) the contents of register B to register A
   access: A[i]
      access bit i of register A
   operators: +, =, and others
      applied bit-wise across either selected bits or the whole register

2. control
   conditional execution: (<condition>) <statement>
      example: (C) A ← B
      if condition C=1, the transfer occurs; if C=0, it does not
   branch: [<condition-1> → <step-1>, ..., <condition-n> → <step-n>]
      the next step is changed to the first one in the branch statement having a true condition; if no conditions are true, don't branch (proceed to the next step)

Each step can have both data manipulation and control parts. Transfers expressed on one line are assumed to be parallel.

Example: assuming register bits are numbered left to right starting from 0, then
   A[0] ← 0, A[1] ← A[0], A[2] ← A[1], A[3] ← A[2]
is a right shift by 1 of the bits referenced in register A (i.e., each bit is copied to its neighbor before it is reset).

Example: a fragment of a sequence of RTL steps
   Step1: A ← B (transfer B to A)
          [A[0] → Step3] (if A[0]=1, branch to Step3)
   Step2: A ← Ā (complement A)
   Step3: C ← A (transfer A to C)
C receives either A or Ā, depending on the value of A[0].

An RTL program simply describes next state behavior, and so is a more abstract way to describe a circuit than can be accomplished using state diagrams.
Operations (such as floating point arithmetic) which can be done in circuitry using sequential logic are one example of the kinds of circuitry that may be best described in RTL.
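The rule that transfers written on one line happen in parallel can be illustrated with a small Python sketch of the right shift example (A[0] ← 0, A[1] ← A[0], A[2] ← A[1], A[3] ← A[2]). The function names are hypothetical, and a register is modeled here as a Python list with bit 0 leftmost, which is an assumption made only for illustration.

```python
def right_shift_parallel(a):
    """All four transfers read the register contents held before the clock edge."""
    old = list(a)                       # snapshot of the current state
    return [0, old[0], old[1], old[2]]  # each bit receives its left neighbor

def right_shift_sequential(a):
    """The same transfers executed one after another (NOT what RTL means)."""
    a = list(a)
    a[0] = 0
    a[1] = a[0]   # already sees the new A[0], so the old bit is lost
    a[2] = a[1]
    a[3] = a[2]
    return a

print(right_shift_parallel([1, 0, 1, 1]))    # [0, 1, 0, 1]
print(right_shift_sequential([1, 0, 1, 1]))  # [0, 0, 0, 0]
```

The difference between the two results is why each bit can be "copied to its neighbor before it is reset": in hardware every flip-flop samples its input on the same clock edge.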

105 Page 3 Example: Consider the RTL sequence C : A B C : A[] A[], A[] A[2], A[2] A[3], A[3] A[] (left circular shift) [SEQ C ] C 2 : A[] A[]+ A[], A[2] A[2]+ A[3] C 3 : A[] A[] A[], A[3] A[2] A[3] [C ] The steps in the control sequence correspond to states, and so can be represented by using 2 flip-flops. The overall circuit then has the appearance: input data (B) control input control signals D O A[] SEQ control combina tional logic (TBD) D D Q Q of 4 2 C C C 2 C 3 transfer combinational logic (TBD) D D O O 2 A[] A[2] o u t p u t d a t a ck D O 3 A[3] feedback Since 6 flip-flops are needed to describe the control logic and provide the 4-bit data register A, if a state diagram approach was employed, the circuit would require 2 6 =64 states! The remaining work is to fill in the two combinational circuits noted as TBD. The program sequence occurs as follows: current control state next control state SEQ= SEQ= C C C C C 2 C C 2 C 3 C 3 C 3 C C

106 Page 4 Control combinational logic: From the control state transitions we can determine the control combinational logic using sequential circuit design, starting from a state diagram as follows:, C C C 2, Q Q Q Q SEQ Q n n Q 2 C C C 2 C 3 The circuit is then C 3, Q Q SEQ Q SEQ Q D = Q Q + Q Q S EQ D = Q Q SEQ D Q D

107 Page 5 Transfer combinational logic: There are 4 transfer statements, each of which requires its own combinational logic and each of which must be activated when its control signal (C,C,C 2, C 3 ) is raised. This is handled by using an AND gate with each control signal to activate/deactivate the appropriate transfer. input data (B) C C D O A[] A +A C 2 C C A A C C C 3 D O A[] o u t p u t C 2 C 3 A 2 +A 3 C C C 2 D O 2 A[2] d a t a C C D O 3 A[3] A 2 A 3 C 3 ck feedback Transfer combinational circuit Register Transfers Required C : A B C : A[] A[], A[] A[2], A[2] A[3], A[3] A[] (left circular shift) C 2 : A[] A[]+ A[], A[2] A[2]+ A[3] C 3 : A[] A[] A[], A[3] A[2] A[3]

108 Page 6 UNF RTL: A Register-Transfer Language Simulator High-level programming languages are usually portable across multiple environments, because they are designed to be used at a level of abstraction above physical implementation. They also tend to have a large user base. In contrast, RTL implementations (even more so than machine and assembly languages) tend to be tailored for a specific manufacturer s needs; ie., there is no standard RTL. Elsewhere defined RTL circuit modules can also be employed (in the manner of subprograms) if there is a language context in which they are described. UNF RTL is an implementation of an RTL for a simulated machine environment. It has its own syntax and semantics, and can be used to verify register-transfer functionality for microcode-level algorithms. It does not incorporate any timing capabilities, which would normally be desirable in an implementation to be used for actual computer circuit construction. We will illustrate its functionality via a series of programs describing sequential circuits (including ones for specialized arithmetic). I. UNF RTL: Basic Structure An RTL program consists of the following three sections:. DEFREG - define registers. 2. DEFBUS - define buses. 3. Control section bracketed by BEGIN and END. For example, DEFREG: REG(6) ** REG is a 6 bit register REGISTER2(8) ** REGISTER2 is an 8 bit register ACC(32) ** ACC is a 32 bit register DEFBUS: MAINBUS(32) ** MAINBUS is a 32 bit bus LASTBUS(8) ** LASTBUS is an 8 bit bus BEGIN:... ** Register transfer and manipulation statements.... END: It is assumed that a transfer from one register to another does not require explicit representation of a bus structure. Defined buses are assumed to have a bus sense register to maintain any value transferred onto the bus. The purpose of having buses is to support communication among separately defined modules by explicitly representing the data path. II. 
UNF RTL: Naming the Registers and Buses DEFREG, DEFBUS, BEGIN, END, and the operator names (see next section) are reserved words. Register and bus names must start with an upper-case letter and may have up to twenty upper-case alphabetic

109 Page 7 and numeric characters. The number enclosed in parentheses indicates the number of bits in the register being declared or the path width of the bus being defined (number of bits). For example, REG23XYZ(32) defines a register with the name REG23XYZ having 32 bits (bits,,2,...,3). Bits in a register or bus are indexed from left to right beginning with. III. UNF RTL: Labels, Conditional Execution, Conditional Branch, Merge Statements in the control section (between the BEGIN and END brackets) may optionally start with a label and/or a condition. <Label>: (<condition>) <...Register transfer statement...> Examples: label condition RTL statement L: (X[5 6] LEQ ) X[ TO 7] SETREG X[ TO 7] SUB Y M23: REG SETREG REG2 A[3 4] SETREG B[2 2] Labels follow the same formation rules as those used for naming registers and buses. A label is terminated with a colon. A condition is an expression involving current contents of registers and buses and should evaluate to either or. The statement following the condition is executed if the condition evaluates to, otherwise it is ignored. Statements without a "pre-condition" are executed when encountered. In addition to the conditional execution discussed above, there is a conditional branching capability. The syntax is as follows: <Label>:(<c >) BRANCH (<c >;<L >)(<c 2 >;<L 2 >)(<c 3 >;<L 3 >)... (<c N >;<L N >) The execution of the BRANCH statement is conditioned on <c > if present. If the BRANCH statement is executed, the (<condition>;<label>) pairs are considered from left to right and the first condition to evaluate to causes a BRANCH to the corresponding label. If none of the conditions evaluate to, then execution proceeds to the next sequential line. An unconditional branch is provided to simulate a merging of control signals. The syntax is: <Label>:(<c >) MERGEAT <Lbl> Examples: BRANCH (SC ANEQ ; L) (X[];L2) MERGEAT TOP IV. 
UNF RTL: Assignment Statements, Register Transfer, Expressions Assignment statements simulate transfer of bit strings between registers and buses. <Busname> SETBUS <expression> <Regname> SETREG <expression> The expression on the right of the SETREG or SETBUS command indicates processing of current contents of registers and/or buses,

110 Page 8 the result of which is transferred to the register or bus named on the left hand side of the SETREG or SETBUS command. For example, LASTBUS SETBUS REG34 indicates that the contents of REG34 are to be sent to LASTBUS; REG8 SETREG LASTBUS means that the current set of signals on the LASTBUS is to be copied to REG8; REG9[7 8] SETREG R2[4 9] OR BUS[2 3] specifies that the sub-register REG9[7 8] (bits 7 and 8 of REG9) is to receive the result of bit-wise OR'ing the contents of the subregister R2[4 9] and sub-bus BUS[2 3]. An expression may be formed by applying the following rules:. A binary vector is a term; e.g., 2. A register name or a bus name is a term. 3. A sub-register or a sub-bus is a term; e.g., R[ 4 6 7] or BUS9[ ] 4. Concatenation of terms is a term (binary vectors must be enclosed in parentheses when involved in concatenation); concatenation is indicated by using a comma "," between terms; e.g., R[4 5 6],( ),BUS27[6 7] 5. A term (as defined in through 4) is an expression. 6. An expression enclosed in parentheses is a term. 7. <term> <binary-operator> <expression> is an expression. 8. <unary-operator> <expression> is an expression. Two reserved bus names (INBUS, OUTBUS) are used for simulated I/O. Expressions using these bus names provide simulated input (with prompt - from keyboard) and output (to screen - with optional MESG text, if desired), their syntax is 9. INBUS '<input-prompt-message>' Either of the statements REG SETREG INBUS 'enter an 8 bit integer' REG5[ TO 7] SETREG INBUS 'enter 8 bits' first sends the prompt message to the display, then accepts user input from the keyboard.. OUTBUS SETBUS <register> MESG '<message>' where MESG is a reserved word, optionally included along with its '<message>' to specify the addition of the <message> to the <register> display; e.g., OUTBUS SETBUS REG3 MESG 'this is reg3' appends the message text to the display of the contents of REG3. NOTE: INBUS is read only. OUTBUS is write only.

111 V. UNF RTL: Operators Page 9 A list of dyadic (requiring two operands) and monadic (requiring one operand) operators follows: Dyadic Operators: Standard Boolean Logic Operations OR, AND, NAND, NOR, XOR, COINC For example, NOR results in Logical and Arithmetic Shifts (Left and Right), Rotate (Circular Shift, Left and Right) LLSHIFT, RLSHIFT, LASHIFT, RASHIFT, LROTATE, RROTATE For example, 3 RLSHIFT results in 3 RASHIFT results in 3 RROTATE results in Two's Complement Arithmetic ADD, SUB, MUL, DIV For example, MUL results in Logical (Unsigned) Compare and Arithmetic (Signed) Compare LGT, LLT, LGE, LLE, LEQ, LNEQ AGT, ALT, AGE, ALE, AEQ, ANEQ For example, LLE results in ALE results in String Manipulation FIRST, LAST For example, 4 FIRST results in 4 LAST results in Reformat of User Input under INBUS dectotwo, hextotwo For example, 8 dectotwo -5 results in 8 hextotwo A9 results in Monadic Operators: Standard Boolean Logic Operations NOT For example, NOT results in Increment by, Decrement by INCREMENT, DECREMENT For example, INCREMENT results in DECODE, ENCODE, twoscmpl, ZERO, twotodec, twotohex DECODE performs the function of a of 2 n decoder so DECODE gives (activating bit number 6 of the 8 bits) (bit 6) ENCODE is the inverse of decode so ENCODE results in

112 VI. Page twoscmpl simply forms the 2's complement of its argument so twoscmpl results in ZERO returns a string of 's of the given length so ZERO 5 results in twotodec converts 2's complement to a decimal value for an address or output; e.g., twotodec returns -3 twotohex converts a binary string into hexadecimal notation; eg., twotohex returns D Evaluation of Conditions A condition is either an expression (as defined in section IV) or two expressions connected by one of the comparison operators. A condition may appear as a "pre-condition" (in front of any statement) or as the first component in a (<condition>;<label>) pair. If a condition takes the form of an expression without a comparison operator, it should evaluate to a or. If a logical comparison operator is used, the resulting bit strings on both sides of the comparison operator are treated as unsigned integers in making the comparison. Arithmetic comparisons treat the operands under the assumption they are in 2's complement representation.
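The distinction between logical (unsigned) and arithmetic (2's complement) comparison can be sketched in Python on bit strings. The functions `llt` and `alt` below are hypothetical stand-ins named after the LLT and ALT operators; they are not part of the UNF RTL simulator, and the operand values are chosen here for illustration.

```python
def unsigned_value(bits):
    """Interpret a bit string as an unsigned integer."""
    return int(bits, 2)

def signed_value(bits):
    """Interpret a bit string as a 2's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":              # leading 1 means the value is negative
        value -= 1 << len(bits)
    return value

def llt(a, b):
    """Logical less-than: compare as unsigned integers."""
    return unsigned_value(a) < unsigned_value(b)

def alt(a, b):
    """Arithmetic less-than: compare as 2's complement integers."""
    return signed_value(a) < signed_value(b)

# 1111 is 15 unsigned but -1 in 2's complement
print(llt("1111", "0001"), alt("1111", "0001"))  # False True
```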

113 Page UNFRTL Examples Generic RTL example of a simple register transfer sequence C : A B C : A[] A[], A[] A[2], A[2] A[3], A[3] A[] (left circular shift) [SEQ C ] C 2 : A[] A[]+ A[], A[2] A[2]+ A[3] C 3 : A[] A[] A[], A[3] A[2] A[3] [C ] UNFRTL program providing an implementation of the sequence [] RtlSIMPLX [] DEFREG: [2] SEQ() [3] A(4) [4] B(4) [5] DEFBUS: [6] BEGIN: smultip [7] C:B SETREG INBUS 'Enter 4 bit B input' [8] SEQ SETREG INBUS 'Enter SEQ value' [9] A SETREG B [] C:A SETREG LROTATE A [] OUTBUS SETBUS A MESG 'Register A left rotated by - ' [2] BRANCH(SEQ;C) [3] C2:A[ 2] SETREG ((A[] ADD A[]), (A[2] ADD A[3])) [4] OUTBUS SETBUS A MESG 'Register A with A[ 2] added - ' [5] C3:A[ 3] SETREG ((A[] XOR A[]), (A[2] XOR A[3])) [6] OUTBUS SETBUS A MESG 'Register A with A[ 3] XORed - ' [7] MERGEAT C [8] END: Lines 9,, 3, 5 are the statements providing the actual register transfer specified by C, C, C 2, C 3

Signed multiply:

Architecture:
   3 n-bit registers X, A, Y
   1-bit register SGN
   (A,Y) can be treated as a single 2n bit register for shifting

[Figure: SGN (±), the X register, the shift counter, and an ADD unit feeding the A register, with the A and Y registers side by side.]

X = multiplicand, Y = multiplier, A = accumulator

The sign of the product is first determined by (sgn(X) XOR sgn(Y)) and stored in SGN. X and Y are changed to their absolute values so that the arithmetic only has to deal with positive integers.

Basic procedure for multiplying (positive) integers X and Y, where Y0 denotes the least significant bit of Y:

   Clear A
   X ← <multiplicand>
   Y ← <multiplier>
   REPEAT
      IF Y0 = 1
         A ← A + X
      ENDIF
      Shift (A,Y) right by 1 bit
   UNTIL there have been n shifts

When done, the product will be in (A,Y). The 2's complement form is then produced according to the sign value found in SGN.
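The shift-and-add procedure above can be sketched in Python. This is an illustrative model, not the UNFRTL program: the name `signed_multiply` is hypothetical, the registers are modeled as Python integers, and the carry out of A + X is assumed to be shifted back in, which Python's unbounded integers provide for free.

```python
def signed_multiply(x, y, n=8):
    """Multiply two n-bit 2's complement integers by shift-and-add."""
    mask = (1 << n) - 1
    sgn = (x < 0) ^ (y < 0)          # sign of the product (SGN)
    x_reg = abs(x) & mask            # work with magnitudes only
    y_reg = abs(y) & mask
    a_reg = 0                        # clear the accumulator A
    for _ in range(n):               # n shifts in total
        if y_reg & 1:                # least significant bit of Y is 1
            a_reg += x_reg           # A <- A + X
        ay = (a_reg << n) | y_reg    # treat (A,Y) as one 2n bit register
        ay >>= 1                     # shift (A,Y) right by 1 bit
        a_reg, y_reg = ay >> n, ay & mask
    product = (a_reg << n) | y_reg   # product is left in (A,Y)
    return -product if sgn else product

print(signed_multiply(-11, 13))  # -143
```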

115 Page 3 UNFRTL program for implementing the procedure for multiplication with extensions for accomodating the sign; X and Y are assumed to be 2's complement sign + 7 integers. [] RtlSMULTIPLY [] DEFREG: [2] AY(6) [3] X(8) [4] SC(8) [5] SGN() [6] DEFBUS: [7] BEGIN: [8] AY[ TO 7] SETREG ZERO 8 ** Clear accumulator A [9] SC SETREG ** Shift counter initially 8 [] X SETREG 8 INBUS 'Enter multiplicand (8 bit 2''s complement)' [] AY[8 TO 5] SETREG 8 INBUS 'Multiplier (8 bit 2''s)' [2] SGN SETREG X[] XOR AY[8] ** Set sign bit for the product [3] BRANCH(NOT X[]; CKY) [4] X SETREG twoscmpl X ** change sign of X if X < [5] CKY:BRANCH(NOT AY[8]; L) ** and do likewise for Y [6] AY[8 TO 5]SETREG twoscmpl AY[8 TO 5] [7]** Accumulate in A if rightmost bit of Y= (recall: AY[5] Y) [8] L:(AY[5]) AY[ TO 7] SETREG AY[ TO 7] ADD X [9] OUTBUS SETBUS AY MESG 'REG AY ' [2] AY SETREG RLSHIFT AY ** Shift AY right [2] OUTBUS SETBUS AY MESG 'shf AY ' [22] SC SETREG DECREMENT SC ** Decrement seq counter [23] BRANCH(SC ANEQ ; L) ** Repeat if shift counter =/ [24] AY SETREG RLSHIFT AY ** Shift to clear the sign bit [25] BRANCH(NOT SGN;D) [26] AY[ TO 5] SETREG twoscmpl AY[ TO 5] [27] D:OUTBUS SETBUS AY[ TO 5] MESG 'PRODUCT ' [28] OUTBUS SETBUS (twotodec AY[ TO 5]) MESG '(base )' [29] END: Execution trace: RtlSMULTIPLY (input data is - and ) Enter multiplicand (8 bit 2's complement): Multiplier (8 bit 2's): REG AY ( ) shf AY ( ) REG AY ( ) shf AY ( ) REG AY ( ) shf AY ( ) REG AY ( ) shf AY ( ) REG AY ( ) shf AY ( ) REG AY ( ) shf AY ( ) REG AY ( ) shf AY ( ) REG AY ( ) shf AY ( ) PRODUCT ( ) (base ) ( - )

Booth's method for multiplying 2's complement integers:

Architecture:
   3 n-bit registers X, A, Y
   a 1-bit register P
   (A,Y,P) can be treated as a single 2n+1 bit register for shifting

[Figure: the X register, the shift counter, and an ADD/SUB unit feeding the A register, with the A register, the Y register, and P side by side.]

X = multiplicand, Y = multiplier, A = accumulator, P = prior bit from multiplier

In contrast to the "Signed Multiply" procedure, Booth's method requires no independent consideration of the sign of the multiplicand and multiplier. The basic procedure for multiplying 2's complement integers X and Y is as follows, where Y0 denotes the least significant bit of Y:

   Clear A
   X ← <multiplicand>
   Y ← <multiplier>
   P ← 0
   REPEAT
      CASE
         (Y0,P) = 10: A ← A - X
         (Y0,P) = 01: A ← A + X
      ENDCASE
      Shift (A,Y,P) right arithmetically by 1 bit
   UNTIL there have been n shifts

When done, the product will be in (A,Y).

Remark: The first time a 1 appears in position Y0, X will be subtracted. If the next value to appear in Y0 is 0, X will then be added. Because of the shift, the effect is equivalent to having added 2X at the preceding step, which then has the combined effect over the two steps of adding 2X - X = X. If the next value to appear in Y0 had been 1, and then 0 following that, then two shifts would take place before adding X, yielding a combined effect of 4X - X = 3X over the three steps (note that multiplying by 11 in base 2 is the same as multiplying by 3, so adding 3X is exactly what is desired). Thus, the procedure produces the desired outcome for patterns in the multiplier of 010, 0110, 01110, ..., allowing us to conclude that it will work in general.

Note that the procedure works regardless of sign. If the multiplier is negative, its lead bits are 1's, and so the procedure simply winds out with a series of

117 Page 5 shifts once it gets into the leading 's of the multiplier. Similarly, if the multiplier is positive, its lead bits are 's and the procedure likewise winds out with a series of shifts once it gets into the leading 's of the multiplier. Trace of Booth's method: 8 bit registers, (-) x (9) = -29 X = A Y P -X = (-X: ) subtract (A A - X) A Y P and then : shift right (arithmetic) : shift right (arithmetic) (+X: ) : add (A A + X) A Y P and then : shift right (arithmetic) : shift right (arithmetic) (-X: ) : subtract (X X - ) A Y P and then : shift right (arithmetic) (+X: ) : add (X X + ) A Y P and then : shift right (arithmetic) : shift right (arithmetic) shift right (arithmetic) Product ( 2 = -29 )
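Booth's procedure can be sketched in Python with the registers modeled as n-bit integers. The name `booth_multiply` is hypothetical, and the operand values in the usage line are chosen here for illustration. One assumption is made explicit: with an n-bit accumulator the multiplicand must not be the most negative value (-128 for n = 8), since A - X would then overflow.

```python
def booth_multiply(x, y, n=8):
    """Multiply n-bit 2's complement integers using Booth's method."""
    mask = (1 << n) - 1
    x_reg = x & mask                 # multiplicand as an n-bit pattern
    y_reg = y & mask                 # multiplier as an n-bit pattern
    a, p = 0, 0                      # clear A, P <- 0
    for _ in range(n):
        pair = (y_reg & 1, p)        # (Y0, P)
        if pair == (1, 0):
            a = (a - x_reg) & mask   # A <- A - X
        elif pair == (0, 1):
            a = (a + x_reg) & mask   # A <- A + X
        # Arithmetic right shift of (A,Y,P) by 1 bit.
        p = y_reg & 1                              # Y0 moves into P
        y_reg = (y_reg >> 1) | ((a & 1) << (n - 1))  # low bit of A enters Y
        a = (a >> 1) | (a & (1 << (n - 1)))          # sign bit of A is kept
    product = (a << n) | y_reg       # 2n-bit product in (A,Y)
    if product >> (2 * n - 1):       # reinterpret as 2's complement
        product -= 1 << (2 * n)
    return product

print(booth_multiply(-11, 19))  # -209
```

No separate sign handling appears anywhere in the loop, which is the point of the method.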

118 Page 6 UNFRTL program for implementing Booth's procedure for multiplication; X and Y are assumed to be 2's complement sign + 7 integers. [] RtlBOOTHMULT [] DEFREG: [2] AYP(7) [3] X(8) [4] SC(8) [5] DEFBUS: [6] BEGIN: [7] AYP[ TO 7]SETREG ZERO 8 ** Clear accumulator A [8] SC SETREG ** Shift counter initially 8 [9] X SETREG 8 INBUS 'Enter Multiplicand (8 bit 2''s complement)' [] AYP[8 TO 5]SETREG 8 INBUS 'Multiplier (8 bit 2''s)' [] AYP[6]SETREG ** initialize P to [2] OUTBUS SETBUS(twoTOdec X)MESG '(base Multiplicand)' [3] OUTBUS SETBUS(twoTOdec AYP[8 TO 5])MESG '(base Multiplier)' [4]** Cases: (recall: AY[5] Y and AY[6] P) [5] L:(AYP[5 6] LEQ )AYP[ TO 7]SETREG AYP[ TO 7]SUB X [6] (AYP[5 6] LEQ )AYP[ TO 7]SETREG AYP[ TO 7]ADD X [7] OUTBUS SETBUS AYP MESG 'REG AYP ' [8] AYP SETREG RASHIFT AYP ** right arithmetic shift [9] OUTBUS SETBUS AYP MESG 'shf AYP ' [2] SC SETREG DECREMENT SC ** Decrement shift counter [2] BRANCH(SC ANEQ ;L) ** Repeat if shift counter =/ [22] D:OUTBUS SETBUS AYP[ TO 5]MESG 'PRODUCT ' [23] OUTBUS SETBUS(twoTOdec AYP[ TO 5])MESG '(base )' [24] END: Execution trace: RtlBOOTHMULT (input data is - and 9) Enter 8 bit Multiplicand: Enter 8 bit Multiplier : (base Multiplicand) ( - ) (base Multiplier) ( 9 ) REG AYP ( ) shf AYP ( ) REG AYP ( ) shf AYP ( ) REG AYP ( ) shf AYP ( ) REG AYP ( ) shf AYP ( ) REG AYP ( ) shf AYP ( ) REG AYP ( ) shf AYP ( ) REG AYP ( ) shf AYP ( ) REG AYP ( ) shf AYP ( ) PRODUCT ( ) (base ) ( -29 )

Restoring and non-restoring division:

Architecture:
   3 n-bit registers: A, X, Y
   1-bit sign registers SGNQ and SGNR
   (A,X) can be treated as a single 2n-bit register for shifting

[Figure: the Y register, the shift counter, and an ADD/SUB unit feeding the A register, with the A and X registers side by side and the sign registers SGNQ (±) and SGNR (±).]

Sign rules:
   <dividend> = <quotient>*<divisor> + <remainder>
   sign(<quotient>) = sign(<dividend>) XOR sign(<divisor>)
   sign(<remainder>) = sign(<quotient>) XOR sign(<divisor>)
   (for instance, 6/-5 = -1 r 1; -6/-5 = 1 r -1)

Using these rules, the sign of the quotient is stored in SGNQ and that of the remainder in SGNR.

The basic procedure for restoring division, positive integers X and Y, is as follows, where X0 denotes the least significant bit of X:

   Clear A
   X ← <dividend>
   Y ← <divisor>
   REPEAT
      Shift (A,X) left by 1 bit
      A ← A - Y
      IF A < 0
         A ← A + Y    /* "restore" A */
         X0 ← 0       /* set least significant bit of X */
      ELSE
         X0 ← 1
      ENDIF
   UNTIL the register has been shifted n times

When the algorithm terminates,
   register A has <remainder>
   register X has <quotient>
   (register Y, the <divisor>, is unchanged)

At this point, the values in SGNQ and SGNR are used to establish the correct 2's complement form for the quotient and the remainder.
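The restoring procedure and the sign rules above can be sketched together in Python. The name `restoring_divide` is hypothetical; the registers are modeled as Python integers, so a negative A is detected with an ordinary comparison rather than by inspecting a sign bit.

```python
def restoring_divide(dividend, divisor, n=8):
    """Return (quotient, remainder) using restoring division on magnitudes."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    sgnq = (dividend < 0) ^ (divisor < 0)   # sign of the quotient (SGNQ)
    sgnr = sgnq ^ (divisor < 0)             # sign of the remainder (SGNR)
    mask = (1 << n) - 1
    a, x, y = 0, abs(dividend) & mask, abs(divisor)
    for _ in range(n):
        # Shift (A,X) left by 1 bit: the top bit of X moves into A.
        a = (a << 1) | (x >> (n - 1))
        x = (x << 1) & mask                 # vacated bit X0 is 0
        a -= y                              # A <- A - Y
        if a < 0:
            a += y                          # "restore" A; X0 stays 0
        else:
            x |= 1                          # X0 <- 1
    quotient = -x if sgnq else x
    remainder = -a if sgnr else a
    return quotient, remainder

print(restoring_divide(74, 25))   # (2, 24)
print(restoring_divide(6, -5))    # (-1, 1)
```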

120 Page 8 Trace of restoring division: 8 bit registers, (74) /(25) = 2 r 24 A X Y = -Y = ShiftL A X? Sub Y A X? A < ; set vacated A X bit to : (and restore A) ShiftL A X (: marks quotient so far) :? Sub Y A X :? A < ; set vacated A X bit to : (and restore A) ShiftL A XGr :? Sub Y A X :? A < ; set vacated A X bit to : (and restore A) ShiftL A X :? Sub Y A X :? A < ; set vacated A X bit to : (and restore A) ShiftL A X :? Sub Y A X :? A < ; set vacated A X bit to : (and restore A) ShiftL A X :? Sub Y A X :? A < ; set vacated A X bit to : (and restore A) ShiftL A X :? Sub Y A X :? A > ; set vacated A X bit to : ShiftL A X :? Sub Y A X :? A < ; set vacated A X bit to : (and restore A) <remainder> <quotient>

121 Page 9 UNFRTL program for implementing the procedure for restoring division with extensions for accomodating the sign; X and Y are assumed to be 2's complement sign + 7 integers. [] RtlRESTORING [] DEFREG: [2] AX(6) [3] Y(8) [4] SC(8) [5] SGNQ() [6] SGNR() [7] DEFBUS: [8] BEGIN: [9]** Initialize first half of AX register to zeroes [] AX[ TO 7] SETREG ZERO 8 []** Initialize shift counter to 8 [2] SC SETREG [3] AX[8 TO 5] SETREG 8 INBUS 'Enter 8 bit dividend (2''s comp)' [4] Y SETREG 8 INBUS 'Enter 8 bit divisor (2''s comp)' [5] SGNQ SETREG AX[8] XOR Y[] [6] SGNR SETREG SGNQ XOR Y[] [7] BRANCH(NOT AX[8]; CKY) [8] AX[8 TO 5] SETREG twoscmpl AX[8 TO 5] [9] CKY:BRANCH(NOT Y[]; L) [2] Y SETREG twoscmpl Y [2] L:AX SETREG LLSHIFT AX [22] SC SETREG SC SUB [23] OUTBUS SETBUS AX MESG 'shf AX ' [24] AX[ TO 7] SETREG AX[ TO 7] SUB Y [25] OUTBUS SETBUS AX MESG 'SUB ' [26] BRANCH(AX[ TO 7] ALT ; RESTORE) [27] AX[5] SETREG [28] OUTBUS SETBUS AX MESG 'set ' [29] MERGEAT TST [3] RESTORE:AX[ TO 7] SETREG AX[ TO 7] ADD Y [3] OUTBUS SETBUS AX MESG 'restore ' [32] TST:BRANCH(SC AGT ZERO 8; L) [33] BRANCH(NOT SGNR; CKQ) [34] AX[ TO 7] SETREG twoscmpl AX[ TO 7] [35] CKQ:BRANCH(NOT SGNQ; D) [36] AX[8 TO 5] SETREG twoscmpl AX[8 TO 5] [37] D:OUTBUS SETBUS AX[8 TO 5] MESG 'QUOTIENT ' [38] OUTBUS SETBUS (twotodec AX[8 TO 5]) MESG '(base )' [39] OUTBUS SETBUS AX[ TO 7] MESG 'REMAINDER' [4] OUTBUS SETBUS (twotodec AX[ TO 7]) MESG '(base )' [4] END:

122 Page 2 Execution trace: RtlRESTORING (input data is 74 and 25) Enter 8 bit dividend (2's comp): Enter 8 bit divisor (2's comp): shf AX ( ) SUB ( ) restore ( ) shf AX ( ) SUB ( ) restore ( ) shf AX ( ) SUB ( ) restore ( ) shf AX ( ) SUB ( ) restore ( ) shf AX ( ) SUB ( ) restore ( ) shf AX ( ) SUB ( ) restore ( ) shf AX ( ) SUB ( ) set ( ) shf AX ( ) SUB ( ) restore ( ) QUOTIENT ( ) (base ) ( 2 ) REMAINDER ( ) (base ) ( 24 )

The basic procedure for non-restoring division, positive integers X and Y, is as follows, where X0 denotes the least significant bit of X:

   Clear A
   X ← <dividend>
   Y ← <divisor>
   Shift (A,X) left by 1 bit
   A ← A - Y
   REPEAT
      IF A < 0
         X0 ← 0       /* set least significant bit of X */
         Shift (A,X) left by 1 bit   /* Remark: below */
         A ← A + Y
      ELSE
         X0 ← 1
         Shift (A,X) left by 1 bit
         A ← A - Y
      ENDIF
   UNTIL the register has been shifted n times (including the initial shift)
   IF A < 0
      X0 ← 0
      A ← A + Y       /* the only time A is "restored" */
   ELSE
      X0 ← 1
   ENDIF

When the algorithm terminates,
   register A has <remainder>
   register X has <quotient>
   (register Y, the <divisor>, is unchanged)

Remark: Shifting and then adding Y as done above is equivalent to adding Y (to restore A), then shifting, and then subtracting Y as done in the restoring algorithm. This is true because a left shift has the effect of multiplying by 2; i.e.,
   1. in the non-restoring algorithm when A < 0: Y was subtracted initially; a shift has the effect that 2Y is now subtracted; adding Y leaves the effect of a single subtraction of Y for the next pass (without having to restore!).
   2. in the restoring algorithm when A < 0: Y was subtracted initially; Y is added back to restore; a shift now has no effect on the Y arithmetic; Y must now be explicitly subtracted for the next pass.
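The non-restoring procedure above can be sketched in Python for positive operands. The name `nonrestoring_divide` and its inner helper are hypothetical; A is modeled as a signed Python integer, so the left shift of a negative A behaves as 2A plus the incoming bit, matching a 2's complement register.

```python
def nonrestoring_divide(dividend, divisor, n=8):
    """Return (quotient, remainder) for positive n-bit integers."""
    mask = (1 << n) - 1
    a, x, y = 0, dividend & mask, divisor

    def shift_left(a, x):
        """Shift (A,X) left by 1 bit; the top bit of X moves into A."""
        return (a << 1) | (x >> (n - 1)), (x << 1) & mask

    a, x = shift_left(a, x)          # initial shift
    a -= y                           # A <- A - Y
    for _ in range(n - 1):           # n shifts including the initial one
        if a < 0:
            a, x = shift_left(a, x)  # X0 <- 0 (vacated bit is already 0)
            a += y                   # add instead of restoring
        else:
            x |= 1                   # X0 <- 1
            a, x = shift_left(a, x)
            a -= y
    if a < 0:
        a += y                       # the only time A is "restored"
    else:
        x |= 1
    return x, a

print(nonrestoring_divide(74, 25))  # (2, 24)
```

For 74/25 the loop performs six additions before A first becomes non-negative, then one subtraction, and finally one restoring addition.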

124 Page 22 Trace of non-restoring division: 8 bit registers, (74) / (25) A X Y = -Y = Shift A X? Sub Y A X? A < ; set vacated A X bit to : ShiftL A X (: marks quotient so far) :? Add Y A X :? A < ; set vacated A X bit to : ShiftL A X :? Add Y A X :? A < ; set vacated A X bit to : ShiftL A X :? Add Y A X :? A < ; set vacated A X bit to : ShiftL A X :? Add Y A X :? A < ; set vacated A X bit to : ShiftL A X :? Add Y A X :? A < ; set vacated A X bit to : ShiftL A X :? Add Y A X :? A > ; set vacated A X bit to : ShiftL A X :? Sub Y A X :? A < ; set vacated A X bit to : and restore A <remainder> <quotient>

125 UNFRTL program for implementing the procedure for non-restoring division with extensions for accomodating the sign; X and Y are assumed to be 2's complement sign + 7 integers. [] RtlNRESTORE [] DEFREG: [2] AX(6) [3] Y(8) [4] SC(8) [5] SGNQ() [6] SGNR() [7] DEFBUS: [8] BEGIN: [9]** Initialize first half of AX register to zeroes [] AX[ TO 7] SETREG ZERO 8 []** Initialize shift counter to 8 [2] SC SETREG [3] AX[8 TO 5] SETREG 8 INBUS 'Enter 8 bit Dividend' [4] Y SETREG 8 INBUS 'Enter 8 bit Divisor ' [5] SGNQ SETREG AX[8] XOR Y[] [6] SGNR SETREG SGNQ XOR Y[] [7] (AX[8]) AX[8 TO 5] SETREG twoscmpl AX[8 TO 5] [8] (Y[]) Y SETREG twoscmpl Y [9] AX SETREG LLSHIFT AX [2] OUTBUS SETBUS AX MESG 'shf AX ' [2] AX[ TO 7] SETREG AX[ TO 7] SUB Y [22] OUTBUS SETBUS AX MESG 'SUB ' [23] L:SC SETREG SC SUB [24] BRANCH(SC ALE ZERO 8; CHK) [25] (AX[]) MERGEAT ADDY [26] AX[5] SETREG [27] OUTBUS SETBUS AX MESG 'set ' [28] AX SETREG LLSHIFT AX [29] OUTBUS SETBUS AX MESG 'shf AX ' [3] AX[ TO 7] SETREG AX[ TO 7] SUB Y [3] OUTBUS SETBUS AX MESG 'SUB ' [32] MERGEAT L [33] ADDY: AX SETREG LLSHIFT AX [34] OUTBUS SETBUS AX MESG 'shf AX ' [35] AX[ TO 7] SETREG AX[ TO 7] ADD Y [36] OUTBUS SETBUS AX MESG 'ADD ' [37] MERGEAT L [38] CHK:(AX[]) MERGEAT REST [39] AX[5] SETREG [4] OUTBUS SETBUS AX MESG 'set ' [4] MERGEAT CKQ [42] REST:AX[ TO 7] SETREG AX[ TO 7] ADD Y [43] OUTBUS SETBUS AX MESG 'ADD ' [44] CKQ:(SGNR) AX[ TO 7] SETREG twoscmpl AX[ TO 7] [45] (SGNQ) AX[8 TO 5] SETREG twoscmpl AX[8 TO 5] [46] D:OUTBUS SETBUS AX[8 TO 5] MESG 'QUOTIENT ' [47] OUTBUS SETBUS (twotodec AX[8 TO 5]) MESG '(base )' [48] OUTBUS SETBUS AX[ TO 7] MESG 'REMAINDER' [49] OUTBUS SETBUS (twotodec AX[ TO 7]) MESG '(base )' [5] END: Page 23

126 Page 24 Execution trace: RtlNRESTORE (input data is 74 and 25) Enter 8 bit Dividend: Enter 8 bit Divisor : shf AX ( ) SUB ( ) shf AX ( ) ADD ( ) shf AX ( ) ADD ( ) shf AX ( ) ADD ( ) shf AX ( ) ADD ( ) shf AX ( ) ADD ( ) shf AX ( ) ADD ( ) set ( ) shf AX ( ) SUB ( ) ADD ( ) QUOTIENT ( ) (base ) ( 2 ) REMAINDER ( ) (base ) ( 24 )

Floating point operations:

Floating point operations can likewise be implemented in RTL. We will only illustrate this for floating point add (IEEE 32 bit format). Assume that both numbers are positive (if one is positive and one is negative, then the add operation becomes subtract; if both are negative, then the operation is add with the sign set to negative).

Architecture:
   two 32-bit floating point registers F1, F2
   two 2-bit registers GB1, GB2 for guard bits
   two 2-bit registers IB1, IB2 for manipulating the implied 1
   Each of (IB1,F1[9 31],GB1) and (IB2,F2[9 31],GB2) can be treated as a single 27-bit register for addition and shifting.

The basic procedure for floating point addition/subtraction, numbers X and Y, is as follows:

   Clear GB1, Clear GB2, Clear IB1, Clear IB2
   IB1[1] ← 1, IB2[1] ← 1     /* Set the implied 1s */
   F1 ← <addend1>
   F2 ← <addend2>
   /* Make F1 the larger of the two numbers in magnitude */
   IF F1[1 8] < F2[1 8]
      Swap(F1,F2)
   ENDIF
   IF F1[1 8] = F2[1 8]
      IF F1[9 31] < F2[9 31]
         Swap(F1,F2)
      ENDIF
   ENDIF
   /* shift F2's mantissa to line up the two exponents */
   Shift (IB2,F2[9 31],GB2) right by (F1[1 8] - F2[1 8]) bits
   IF F1[0] = F2[0]
      /* add mantissas including the extra bits */
      (IB1,F1[9 31],GB1) ← (IB1,F1[9 31],GB1) + (IB2,F2[9 31],GB2)
   ELSE
      /* subtract */
      (IB1,F1[9 31],GB1) ← (IB1,F1[9 31],GB1) - (IB2,F2[9 31],GB2)
      IF (IB1,F1[9 31],GB1) is all zeroes
         Clear F1     /* special case when result is 0 */
         Exit
      ENDIF
   ENDIF
   /* normalize by shifting until IB1 has the implied 1 */
   IF IB1[0] = 1
      Shift (IB1,F1[9 31],GB1) right by 1
      F1[1 8] ← F1[1 8] + 1    /* increment the exponent */
   ENDIF
   WHILE IB1[1] = 0
      Shift (IB1,F1[9 31],GB1) left by 1
      F1[1 8] ← F1[1 8] - 1    /* decrement the exponent */
   ENDWHILE
   F1[31] ← GB1[0]             /* round the result */

Remark: special cases (e.g., exponent all 1s) are not considered.
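The add/subtract procedure above can be sketched in Python using a 27 bit significand (two leading bits for overflow and the implied 1, 23 mantissa bits, two guard bits). This is a sketch under stated assumptions: the names `unpack32`, `pack32`, and `fp_add` are hypothetical, operands are assumed to be normalized, special cases are ignored as in the notes, and the guard bits are simply truncated at the end instead of applying the rounding step.

```python
import struct

def unpack32(x):
    """Return (sign, biased exponent, 23 bit mantissa) of a 32 bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def pack32(sign, exponent, mantissa):
    """Assemble the three fields back into a Python float."""
    bits = (sign << 31) | (exponent << 23) | mantissa
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x

def fp_add(a, b):
    s1, e1, m1 = unpack32(a)
    s2, e2, m2 = unpack32(b)
    if (e1, m1) < (e2, m2):              # make operand 1 the larger magnitude
        s1, e1, m1, s2, e2, m2 = s2, e2, m2, s1, e1, m1
    sig1 = ((1 << 23) | m1) << 2         # implied 1 at bit 25, guard bits 1..0
    sig2 = ((1 << 23) | m2) << 2
    sig2 >>= e1 - e2                     # line up the exponents
    if s1 == s2:
        sig1 += sig2                     # same signs: add
    else:
        sig1 -= sig2                     # different signs: subtract
        if sig1 == 0:
            return 0.0                   # special case when the result is 0
    if sig1 >> 26:                       # overflow bit set: shift right
        sig1 >>= 1
        e1 += 1
    while not (sig1 >> 25) & 1:          # shift left until the implied 1 returns
        sig1 <<= 1
        e1 -= 1
    mantissa = (sig1 >> 2) & 0x7FFFFF    # drop implied 1 and guard bits
    return pack32(s1, e1, mantissa)      # the larger operand supplies the sign

print(fp_add(5.0, -3.0))  # 2.0
```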

128 Page 26 UNFRTL program for implementing the procedure for implementing 32-bit IEEE add/subtract (decided by the signs of the numbers) [] RtlADDIEEE [] * Addition or subtraction determined by signs [2] DEFREG: [3] * Mantissa has 2 for overflow, for implied, 2 as guard bits [4] MANT(28) [5] MANT2(28) [6] * Exponent has 2 for overflow; special cases are not handled [7] EXP() [8] EXP2() [9] SGN() [] SGN2() [] DEFBUS: [2] BEGIN: [3] * Clear extra bits and set implied ones (assumes normal case) [4] MANT[ ]SETREG [5] MANT2[ ]SETREG [6] EXP[ ]SETREG [7] EXP2[ ]SETREG [8] SGN SETREG INBUS 'st numner - enter bit sign ' [9] EXP[2 TO 9]SETREG 8 INBUS 'Enter 8 bit exponent ' [2] MANT[3 TO 25]SETREG 23 INBUS 'Enter 23 bit mantissa ' [2] SGN2 SETREG INBUS '2nd number - enter bit sign 2' [22] EXP2[2 TO 9]SETREG 8 INBUS 'Enter 8 bit exponent 2' [23] MANT2[3 TO 25]SETREG 23 INBUS 'Enter 23 bit mantissa 2' [24] OUTBUS SETBUS 'Adding' [25] OUTBUS SETBUS SGN,'-',EXP[2 TO 9],'-',MANT[3 TO 25] [26] OUTBUS SETBUS SGN2,'-',EXP2[2 TO 9],'-',MANT2[3 TO 25] [27] * Swap operands if necessary [28] BRANCH(EXP[2 TO 9]LGT EXP2[ TO 9];OK) [29] BRANCH(EXP[2 TO 9]LLT EXP2[ TO 9];DS) [3] BRANCH(MANT[3 TO 25]LGE MANT2[3 TO 25];OK) [3] DS:EXP SETREG EXP XOR EXP2 [32] EXP2 SETREG EXP XOR EXP2 [33] EXP SETREG EXP XOR EXP2 [34] MANT SETREG MANT XOR MANT2 [35] MANT2 SETREG MANT XOR MANT2 [36] MANT SETREG MANT XOR MANT2 [37] * Sign of F determines sign of result [38] SGN SETREG SGN XOR SGN2 [39] SGN2 SETREG SGN XOR SGN2 [4] SGN SETREG SGN XOR SGN2 [4] * Line up exponents and add or subtract mantissas [42] OK:MANT2 SETREG(twoTOdec EXP SUB EXP2)RLSHIFT MANT2 [43] (SGN LEQ SGN2)MANT SETREG MANT ADD MANT2 [44] (SGN LNEQ SGN2)MANT SETREG MANT SUB MANT2 [45] BRANCH(MANT LNEQ ZERO 28;NORM) [46] SGN SETREG [47] EXP SETREG ZERO [48] MERGEAT D [49] * If necessary shift implied into position

129 Page 27 [5] NORM:(NOT MANT[])MERGEAT L [5] MANT SETREG RLSHIFT MANT [52] EXP SETREG INCREMENT EXP [53] * If necessary normalize to get into implied position [54] L:(MANT[2])MERGEAT RND [55] MANT SETREG LLSHIFT MANT [56] DECREMEMT EXP [57] MERGEAT L [58] * Round the result [59] RND:MANT[25]SETREG MANT[26] [6] D:OUTBUS SETBUS EXP[2 TO 9]MESG 'Exponent - ' [6] OUTBUS SETBUS MANT[3 TO 25]MESG 'Mantissa - ' [62] OUTBUS SETBUS SGN,'-',EXP[2 TO 9],'-',MANT[3 TO 25] [63] END: Example Usage: Operand = = 2. normalized is sign = biased exponent = = 34 = 2 mantissa = (implied leading ) Operand 2 = = 2 -. normalized is sign = biased exponent = + 27 = 27 = 2 mantissa = (implied leading ) RTL simulator results: Name of Machine: RtlADDIEEE Processing RtlADDIEEE specifications and statements. st numner - enter bit sign : Enter 8 bit exponent : Enter 23 bit mantissa : 2nd number - enter bit sign 2: Enter 8 bit exponent 2: Enter 23 bit mantissa 2: Adding Exponent - ( ) Mantissa - ( ) - - Normal Termination Verification: = = = = = =. 2 = / 6 + / 32 =

Computer organization:
Computer hardware is generally organized in three component areas:
1. Memory
2. Central Processing Unit (CPU)
3. Peripherals
Memory has already been described, and RTL provides the means for describing the CPU. The CPU components are given by the following:

[Figure: General Hardware Organization Example (separate I/O:memory bus and CPU:memory bus). An input device and an output device connect to memory (shared data and instructions). Memory connects over a data path and communications link to the CPU (Central Processing Unit), which contains: control (IC Instruction Counter, SR Status Register, Instruction Decode, IR Instruction Register, MAR Memory Address Reg, MDR Memory Data Reg); working registers (Accumulator, Index); and the ALU (Arithmetic and Logic Unit, including support registers Y and Z).]

The memory and CPU can operate asynchronously (in effect, each has its own clock). For peripheral (I/O) devices, the CPU sends a signal to the I/O device, and then a device controller independent of the CPU takes care of the data transfer, which is to or from a designated memory location called an I/O buffer. When the transfer is complete, the I/O device sends a signal to the CPU to notify it that the buffer is now ready for access. If the CPU is performing an operation that requires the transfer to complete, then the CPU will need to pause operation (essentially by holding its clock at zero) until the completion signal is received. The user sees this as the system hanging. Hence, the CPU needs to be able to signal resource controllers (including memory). The signal can be as simple as taking a bit to 1, which when dropped back to 0 (by the external resource controller) causes the CPU clock to resume.

The elements within the CPU consist of
  a control unit
  an Arithmetic and Logic Unit (ALU)
  registers for user program data (working registers)
  registers for managing user programs

The control unit:
The control unit has circuitry for signaling data transfers to and from memory. Two registers are employed for controlling the transfer:
1. The memory address register (MAR), which has the memory address for the transfer
2. The memory data register (MDR), which has the data to be transferred to memory, or which receives the data transferred from memory.
The memory data register is attached to the CPU-memory bus as a bus sense register accessible by both memory and CPU. The memory control unit also must be able to access the MAR to determine the memory address to use.

The Von Neumann architecture stipulates that programs and data reside in the same memory area. The process of transferring a machine language instruction into the CPU is called an instruction fetch. The control unit has an internal working register, the instruction register (IR), where it stores the instruction fetched. The IR is attached to a circuit that decodes the instruction to
  extract the instruction operation
  determine the memory address the instruction is to act on
The instruction address can then be transferred to the MAR to initiate the transfer of the needed data to the MDR.

Arithmetic and Logic Unit:
The arithmetic and logic unit contains circuits such as those described using RTL and combinational logic for useful computational work, such as arithmetic operations and logical comparison. The ALU is signaled as to which operation's output is to be captured in its output register.

Registers for user program data (working registers):
Conceptually, user program data must be placed in a work area where it can be retained to permit the cascading operations that characterize complex arithmetic expressions. Each of these registers is usually attached to the CPU bus, which effectively limits their number (the IBM 360 architecture provides 16, for example). Some of these registers may have special purposes; for example, an accumulator is a register in which the ongoing outcome of a computation is accumulated, and an index register is one whose value is added to the instruction address to allow stepping through a sequence of memory locations (usually representing a data table).

The idea is to do as much work in the CPU as possible to avoid transfers to and from memory; for example, a swap sequence through a temporary memory location T
  Read m1
  Transfer m1 to T
  Write T
  Read m2
  Transfer m2 to m1
  Write m1
  Read T
  Transfer T to m2
  Write m2
requires 6 Read/Write operations, whereas using working registers R1 and R2
  Read m1 to R1
  Read m2 to R2
  R1 ← R1 XOR R2
  R2 ← R1 XOR R2     (CPU time for these three is negligible compared to Read/Write)
  R1 ← R1 XOR R2
  Write R1 to m1
  Write R2 to m2
requires only 4 (plus no temporary location is needed).

Registers for managing user programs:
The Instruction Counter (IC) has the address of the machine language instruction to fetch after the instruction currently in the IR is finished. When an instruction is fetched, the IC is updated to point to the address of the next instruction. A branch instruction is simply one that can modify the value in the IC.
The Status Register (SR) is set by the control unit to flag results of comparisons, overflow conditions, and the like.

To facilitate register transfers, CPU elements are connected along one or more bus structures, with access to a bus controlled by 3-state logic blocks that allow a register's value onto the bus and select the registers to which it is transferred from the bus. A single bus organization has a structure such as the following:

[Figure: Single Bus CPU Organization. The CPU bus connects the SR, the IC, the instruction decoder and operand address circuit fed by the IR, the MAR and MDR (attached to the address lines and data lines of the CPU-memory bus), the working registers through R n, and the ALU with its input register Y (inputs a and b, output c into register Z). Control lines carry ADD, SUB, etc. to the ALU.]

This organization provides means for moving values in and out of selected registers. No more than one register can be gated onto the bus at any one time, or the signals will conflict. Any number of registers can simultaneously be loaded from the bus, however. Binary operations take one operand from register Y and the other from the bus. Registers such as the IR do not have a transfer to the bus because there is no reason to transfer their contents back out of the register. The IC is not in this category, because its contents must eventually be transferred to the bus and into the MAR as part of fetching the next instruction to execute.

The Register-Bus gating is as follows:

[Figure: Register-Bus Gating. Register R with an R in (enable) connection from the bus and an R out (enable) connection to the bus.]

Data transfer example: Y ← R is accomplished by the signals R out, Y in.
Here R in enables the R Write/Enable on the clock signal. R out activates a 3-state logic connection from R to the bus.

From the diagram we can determine the gating signals. Memory I/O control signals for Read from memory and Write to memory are also needed, along with ALU commands. A microcounter is used to select the current line of microcode from a table, and control signals are needed to selectively reset the counter.

Gating signals: IC out, IC in, Addr out, IR in, MAR in, MDR out, MDR in, R out, R in,..., R i out, R i in,..., Y in, Z out, Z in
Memory I/O control signals: Read, Write, WaitM (hold CPU clock at 0 until memory read is done)
ALU commands: Add, Sub, Set carry (to 1), ShiftR Y, ShiftL Y, Clear Y, Compare, GT, LT, EQ, NE
Micro counter control signals: End

A line of microcode consists of a sequence of bits which give the values for each of the gating signals, the Memory I/O control signals, the ALU commands, and the micro counter control signals (1 means the signal is active, 0 means it is inactive). Microcode organized in this fashion is called horizontal microcode. Several lines of microcode are needed to specify a machine language instruction.

A machine language instruction is divided into two parts:
1. the op code (specifies what the instruction is to do)
2. the operand (identifies the location of the data to be acted on)
In a basic machine, a machine language instruction occupies a single word of memory. If the word length is 32 and the op code takes 8 bits, then 256 different machine language instructions can be provided. The operand takes the remaining 24 bits. Since operands represent memory addresses, 2^24 = 16,777,216 different memory locations can be directly addressed. A larger memory address space can be accommodated by using operands that represent relative addresses rather than absolute addresses.

Once the instruction is brought in from memory and transferred into the IR, the instruction interpreter can decode the op code part to point the microcounter to the right microprogram; the operand is gated to the bus when Addr out is signaled.
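The split of a 32-bit instruction word into an 8-bit op code and a 24-bit operand can be sketched as follows. The function names `encode` and `decode` and the sample op code value are hypothetical, and the sketch assumes the op code occupies the high-order bits of the word, which the notes do not state explicitly.

```python
OPCODE_BITS = 8
OPERAND_BITS = 24
OPERAND_MASK = (1 << OPERAND_BITS) - 1  # low 24 bits


def encode(opcode, operand):
    """Pack an op code and an operand into one 32-bit instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("op code does not fit in 8 bits")
    if not 0 <= operand <= OPERAND_MASK:
        raise ValueError("operand does not fit in 24 bits")
    return (opcode << OPERAND_BITS) | operand


def decode(word):
    """Return (op code, operand), as the instruction decoder attached to the IR would."""
    return word >> OPERAND_BITS, word & OPERAND_MASK


word = encode(0x18, 0x001234)      # 0x18 is an arbitrary illustrative op code
print(decode(word))
print(1 << OPCODE_BITS)            # number of distinct op codes
print(1 << OPERAND_BITS)           # number of directly addressable locations
```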

Example: Suppose that the machine language instruction ADD <addr> means increment the value in R by the value pointed to by <addr>. A register used in this fashion is sometimes called an accumulator. A microcode sequence for the ADD instruction is as follows:

  1. IC out, MAR in, Read, Clear Y, Set carry, Add, Z in     (instruction fetch begins)
  2. Z out, IC in, WaitM
  3. MDR out, IR in
  4. Addr out, MAR in, Read                                  (operand fetch begins)
  5. R out, Y in, WaitM
  6. MDR out, Add, Z in                                      (accumulate)
  7. Z out, R in, End

Each line of microcode represents the signals which are on. All others are presumed to be off.

The 1st line of microcode initiates instruction fetch and does the following (simultaneously):
  the IC is gated onto the bus and into the MAR (IC out, MAR in)
  memory is signaled to READ the value addressed by MAR into the MDR
  the ALU's Y register is cleared to 0
  the carry-in for Add is set to 1
  the ALU Add circuit is selected (calculating bus + Y + 1 = address of the next instruction)
  the ALU result is gated into Z.
This all takes place in 1 CPU cycle. The idea is to do as much in each step as possible to minimize the number of CPU cycles required.

The 2nd line of the instruction fetch does housekeeping, gating the address of the next instruction (as calculated by the 1st line) out of Z and into the IC, after which nothing else can be done until memory releases the Wait signal (Z out, IC in, WaitM).

The simplifying assumption here is that each instruction occupies a single word and the machine is word addressable (as opposed to byte addressable). This speeds up instruction fetch, since adding 1 to the IC points the IC to the next instruction. If variable length instructions are to be employed, then the IC increment must wait until the instruction has been fetched into the IR. Moreover, the instruction interpreter must provide the instruction length via a new transfer link (IL out) for the address calculation.

To complete the instruction fetch, the 3rd line of microcode gates the retrieved instruction from the MDR onto the bus and into the IR (MDR out, IR in).

The same instruction fetch sequence starts the microcode for every machine language instruction!

The 4th line begins the operand fetch, where
  the operand address determined by the instruction decoder is gated onto the bus and into the MAR (Addr out, MAR in)
  memory is signaled to READ into the MDR the value addressed by MAR.

To set up for the accumulate, on the 5th microcode line
  the value in R is gated onto the bus and into Y (R out, Y in)
  the CPU clock is suspended by issuing a Wait signal.

For accumulate, the 6th line of microcode becomes active when the Wait signal is released, at which point the MDR is gated onto the bus, ALU Add is triggered, and the result is captured in Z (MDR out, Add, Z in).

The 7th and final line of microcode finishes the accumulate, where
  Z is gated onto the bus and into R (Z out, R in)
  End triggers instruction fetch on the next CPU cycle.

Since each line of microcode requires a CPU cycle, the accumulate instruction in this example requires 7 CPU cycles: 3 cycles for instruction fetch and 4 for the accumulate procedure. Designers spend a great deal of effort to make architectural adjustments which serve to reduce the number of CPU cycles required by machine language instructions, since each machine language instruction will be executed countless times in the operation of a computer. In particular, since instruction fetch is used for every instruction, it is advantageous that the machine architecture be structured for an instruction fetch requiring as few CPU cycles as possible (hence, designers devise strategies such as pipelines, which can be filled while the current instruction is being processed, taking advantage of the predictable nature of instruction fetch).

A rule of thumb in writing microcode is that multiple in signals are permitted on a line of code, but only one out signal.

Microprograms:
The microcode for a machine language instruction such as ADD is called a microprogram. It can be stored in a table accessed by a counter. The instruction fetch is a microprogram in its own right, and for this architecture would occupy the first three lines of the table. When the instruction fetch triggers IR in, the instruction interpreter decodes the op code now in the IR to set the counter to point to the microprogram for the machine language instruction. When the last line of the microprogram is accessed, the End signal causes the counter to reset to the start of the table, so the process will repeat, resulting in fetching and processing the next instruction.

Instruction fetch assumes that the IC contains the address of the instruction to transfer in from memory, so the question can be asked, how does it get an address in the first place? The answer to this is that the machine has a start/reset button, which forces a hard-wired

address value into the IC when it is pressed. This address points to a bootstrap program that has been stored in memory (usually as ROM, so it can't be altered accidentally), which gets the program flow going for a user. The reset address has to be specified by the CPU designer, and the bootstrap program has to be provided by the computer manufacturer (on a PC, as part of the ROM BIOS).

Translation from a programming language:
Programming languages such as C compile user programs into machine language instructions such as ADD. For example, the C statement
  x = x + 3;
could compile to the 3 machine language statements:
  LOAD <addr of x>
  ADD <addr of the constant value 3>
  STORE <addr of x>
The address values are those assigned by the C compiler (which is just another program). A compiler translates program statements to machine code, and among other things assigns an address location to each constant and variable, initializing memory for each constant as part of the process. Before the compiled program can be run, it has to be located in memory so that addresses match those assigned by the C compiler. A piece of system software called a loader handles this.

If
  LOAD <addr> means transfer the value at <addr> to R and
  STORE <addr> means transfer the value in R to memory at <addr>
then the above three statements
  1. transfer x to R
  2. add 3 to R
  3. transfer R to x
and x has been incremented by 3.

Microcode for LOAD and STORE is as follows:
  LOAD <addr>:
    Addr out, MAR in, Read, WaitM
    MDR out, R in, End
  STORE <addr>:
    R out, MDR in
    Addr out, MAR in, Write, WaitM, End
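At the machine language level (ignoring microcode), the effect of the LOAD, ADD, STORE sequence for x = x + 3 can be sketched with a tiny interpreter. The addresses `X_ADDR` and `THREE_ADDR` and the function `run` are hypothetical stand-ins for whatever a compiler and loader would actually assign.

```python
def run(program, memory):
    """Interpret a list of (mnemonic, address) pairs using one accumulator R."""
    r = 0
    for op, addr in program:
        if op == "LOAD":        # transfer the value at addr to R
            r = memory[addr]
        elif op == "ADD":       # increment R by the value at addr
            r += memory[addr]
        elif op == "STORE":     # transfer the value in R to memory at addr
            memory[addr] = r
        else:
            raise ValueError(f"unknown instruction {op}")
    return r


X_ADDR = 100       # assumed address assigned to the variable x
THREE_ADDR = 101   # assumed address assigned to the constant 3

memory = {X_ADDR: 10, THREE_ADDR: 3}   # the constant is initialized in memory
program = [("LOAD", X_ADDR), ("ADD", THREE_ADDR), ("STORE", X_ADDR)]
run(program, memory)
print(memory[X_ADDR])
```

Starting with x = 10, the stored value of x afterwards is 13, and the constant at `THREE_ADDR` is untouched.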

Branching:
The Status Register (SR) has condition code (CC) bits which are set by the ALU Compare instruction. The CC value in conjunction with the ALU instructions EQ, LT, GT, and NE provides the means for conditional branches. When one of the following combinations is in effect:
  CC indicates bus < Y, and LT
  CC indicates bus > Y, and GT
  CC indicates bus = Y, and EQ
  CC indicates bus not equal to Y, and NE
ALU input from line "b" is routed to ALU output "c". Otherwise, for LT, GT, EQ, and NE, ALU input from line "a" is routed to ALU output "c". Hence, LT, GT, EQ, and NE cause either the bus (input a) or register Y (input b) to be routed to the ALU output (output c), depending on the current CC value in the status register (SR). LT, GT, EQ, and NE allow the construction of conditional branch instructions. The routing patterns are summarized by the following table:

  ALU routing for LT, GT, EQ, and NE
  CC (Compare result)    EQ         NE         LT         GT
  bus = Y                Y → Z      bus → Z    bus → Z    bus → Z
  bus < Y                bus → Z    Y → Z      Y → Z      bus → Z
  bus > Y                bus → Z    Y → Z      bus → Z    Y → Z

Compare uses the first CC bit to record whether the value in Y and the value on the bus are equal. If they are unequal, Compare uses the 2nd CC bit to specify either bus < Y or bus > Y.

Example: Suppose that the machine language instruction BGT <addr> means branch on greater than to the instruction whose location in memory is given by <addr>. Here the assumption is that the CC bits in SR were set by Compare in an earlier instruction; if they indicate greater than, <addr> is gated into the IC to replace the address of the next instruction computed during instruction fetch. Microcode is as follows:
  <instruction fetch>
  Addr out, Y in           (set branch options)
  IC out, GT, Z in
  Z out, IC in, End        (set branch)
If the CC indicates greater than, then GT causes Y to be routed into Z and the branch is taken. For any other CC value, the bus (which has the current IC value) is routed into Z and the branch is not taken.
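The routing rule for EQ, NE, LT, and GT can be expressed compactly in Python. In this sketch the condition code is modeled symbolically as one of the strings "EQ", "LT", or "GT" (the relation of the bus to Y) rather than as specific CC bit patterns, and the names `compare`, `route`, and `bgt_next_ic` are hypothetical.

```python
def compare(bus, y):
    """Model of the ALU Compare: report how the bus value relates to Y."""
    if bus == y:
        return "EQ"
    return "LT" if bus < y else "GT"


def route(command, cc, bus, y):
    """Return the value the ALU routes to its output for EQ, NE, LT, or GT.

    Y (input b) is selected when the command matches the stored condition;
    otherwise the bus (input a) passes through.
    """
    selects_y = {
        "EQ": cc == "EQ",
        "NE": cc != "EQ",
        "LT": cc == "LT",
        "GT": cc == "GT",
    }[command]
    return y if selects_y else bus


def bgt_next_ic(cc, next_ic, branch_addr):
    """Effect of the BGT microprogram on the IC after instruction fetch."""
    y = branch_addr          # Addr out, Y in
    bus = next_ic            # IC out, GT, Z in
    return route("GT", cc, bus, y)   # Z out, IC in, End


cc = compare(5, 3)                       # bus > Y
print(cc, bgt_next_ic(cc, 101, 500))     # branch taken
print(bgt_next_ic("LT", 101, 500))       # branch not taken
```

Swapping which value goes into Y and which goes on the bus reverses the sense of the branch without changing the ALU command.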

Microcode programming:
A microcode programmer establishes the microcode for the machine language instructions that comprise the machine language for a given computing device. The table holding the microcode is sometimes called the control store and resides in the CPU for rapid access via the microcode counter. The End signal provides a microbranch to the instruction fetch. Additional microbranches may be provided to permit reuse of microcode sequences in addition to the one for instruction fetch. This is typically the case for microprogrammable machines, which provide means for making (limited) changes to the machine's control store. Note that the instruction interpreter is already making microbranches that aren't reflected in the microcode.

Machine language instructions are represented by mnemonics such as BGT, BLE, COMP, SUB, MOV2, ADD, RSHIFT, J and so forth. Instruction fetch is the same for all instructions, and its CPU cycle consumption adds to the CPU cycles consumed by the instruction's microprogram. To this point machine language instructions ADD, LOAD, STORE and BGT have been described. A sampling of others follows.

Other machine language instructions:
BLE calls for a branch if the condition is not BGT. This could be done by testing for LT and then testing for EQ, but it is better handled by reversing the branch options for BGT.
  BLE <addr>:
    IC out, Y in             (set branch options; this reverses what BGT put on the bus and in Y)
    Addr out, GT, Z in
    Z out, IC in, End        (set branch)
If the CC indicates greater than, then GT causes the current IC, which is in Y, to be routed into Z, meaning the branch is not taken. Hence, the branch is taken for not GT. Since LE is the same as not GT, this microcode implements BLE.

Compares can always be structured as a comparison with 0 (e.g., A > B is the same as A-B > 0). Hence, for a COMP machine language instruction, the strategy can be comparison of the operand with 0. The microcode is then
  COMP <addr>:
    Addr out, MAR in, Read, WaitM
    MDR out, Clear Y, Compare, End
Here the comparator in the ALU is comparing the bus to Y=0, setting the CC accordingly. Since COMP only compares to 0, it becomes the machine language programmer's responsibility to convert A > B to A-B > 0. Machine language instructions to accomplish this are as follows:
  LOAD A
  SUB B
  STORE TEMP
  COMP TEMP

Subtract is not quite the same as add, because the order of operands matters. The normal assumption is that SUB <addr> means subtract the value at <addr> from R. Assume also that the ALU Sub signal causes the value on the bus to be subtracted from Y. The microcode is
  SUB <addr>:
    Addr out, MAR in, Read
    R out, Y in, WaitM
    MDR out, Sub, Z in
    Z out, R in, End

If there is a COMPR instruction for comparing R to 0, then the machine language program can be improved; e.g., it can be shortened to
  LOAD A
  SUB B
  COMPR
where now the TEMP memory location has been eliminated. Microcode for COMPR is particularly simple:
  COMPR:
    R out, Clear Y, Compare, End
Only 1 CPU cycle is required (other than the CPU cycles for instruction fetch). This is characteristic of register to register machine language instructions. Register to register instructions are advantageous because they require no memory access (other than instruction fetch).

If MOV2 means copy R to R 2, then its microcode is
  MOV2:
    R out, R 2 in, End
It should be noted that, as with COMPR, only 1 CPU cycle is needed.

In the architecture as described, registers have to be explicitly identified by the mnemonic for machine language instructions, which is why neither COMPR nor MOV2 required an operand. To identify registers dynamically, for example, in an instruction such as
  MOV <reg-a>,<reg-b>
a mechanism has to be added to the architecture for dynamically matching <reg-a> and <reg-b> to working registers. If <reg-a> is specified by 4 bits (providing 16 possible register ids), an 8 bit operand is sufficient to specify the pair (<reg-a>,<reg-b>). The architectural addition needed is an 8-bit register R attached to the bus to serve the purpose of identifying <reg-a> and <reg-b>. The microcode for a machine language instruction using register operands then has to load R with the ids of the registers to be used. To see how this might work, suppose R a in is a signal that triggers a decoder which accesses the first 4 bits of the register identifier and R b in does the same for the second 4 bits. In

141 Page 39 other words, If in the st 4 bits of R identifies R, then R in is triggered by the 4 to decoder for an R a in signal. To identify the register pair <reg-a> and <reg-b>, only the st 8 bits of the MOV instruction s operand field have to be set; eg., for MOV,2 the operand bits are. If MOV <reg-a>,<reg-b> means copy the contents of <reg-a> to <reg-b>, then microcode for MOV <reg-a>,<reg-b> is: Addr out, R in (move the immediate value to R) R a out, R b in, End Addr out in this case is putting register ids on the bus rather than an address (the operand is the immediate value). This is called immediate addressing (meaning the address for the operand is the immediate location on the instruction itself). The machine language code for copying R to R 2 is then MOV,2 or for copying R 2 to R o is MOV 2, This kind of enhancement characterizes the release of an extended version of an existing computer architecture, where upward compatibility is being sought. If RSHIFT means to shift R right by the value of the operand, then assuming the ShiftR Y command shifts Y by the value given by the bus, the microcode sequence is RSHIFT <amt>: R out, Y in Addr out, ShiftR, Z in Z out, R in, End Note that in this case the address portion of the instruction is treated as a number. This is another example of immediate addressing, where it is the immediate value, rather than a value in memory, that is of interest. Immediate addressing provides an easy means for establishing values in registers. For example, if the instruction LOADI <addr> means load the immediate value (namely <addr>) into R then the specific value encoded on the instruction is loaded into R. More explicitly, LOADI 3 provides means to initialize R (in this case to the integer 3). If the address given by the instruction is the address of the data item in memory, then the addressing is called direct addressing. 
In some circumstances, it is desirable that the address part of the instruction point to a memory location that holds the address of the desired data item. This is called indirect addressing. For example, BGTN could specify a branch on greater than, not to the address, but to

the address stored at the address. This is useful for providing a table of jump addresses that point to different routines to be invoked depending on machine state. In contrast to BGT, a memory access is required to get the address to jump to:
  BGTN <addr>:
    Addr out, MAR in, Read, WaitM     (get the indirect address)
    MDR out, Y in                     (set branch options)
    IC out, GT, Z in
    Z out, IC in, End                 (set branch)

A jump instruction is an unconditional branch and is very simple to construct:
  J <addr>:
    Addr out, IC in, End

Index register:
To process a table, it is useful to have an index register whose value is automatically added onto the <addr> operand before it is transferred to the MAR. For example, suppose that the instruction ADDX means accumulate in R indexed by R 2; i.e., R 2 is designated to be the index register. <addr> for ADDX provides a base address for a table. A specific entry in the table is obtained by adding R 2 to the base address. The microcode for ADDX is:
  ADDX <addr>:
    Addr out, Y in                    (adjust address by index)
    R 2 out, Add, Z in
    Z out, MAR in, Read
    R out, Y in, WaitM                (accumulate)
    MDR out, Add, Z in
    Z out, R in, End

If indirect addressing is combined with indexing, a jump table can be easily processed. A jump table typically holds the addresses of the programs that a process must select from among dynamically (e.g., an operating system service routine to process an interrupt flag raised by a device controller). If JNX <addr> means jump to the machine language instruction whose address is stored in the table entry at <addr> indexed by R 2, then the microcode is
  JNX <addr>:
    Addr out, Y in                    (adjust address by index)
    R 2 out, Add, Z in
    Z out, MAR in, Read, WaitM        (operand fetch: get the indirect address and move it to the IC)
    MDR out, IC in, End
The instruction operand provides the base address for the table. Incrementing the operand by the index (line 2) changes the address to a location further along in the table. This is the address of the value to be retrieved, and it is sent to the MAR to retrieve the table entry (line 3). The retrieved value (line 4) is then transferred to the IC so that the instruction executed next will be the one whose memory location is stored in the table.
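The effect of JNX on the IC can be sketched as a table lookup. The function name `jnx`, the table base address 300, and the routine addresses are made up for illustration; the comments tie each step to the microcode lines described above.

```python
def jnx(memory, base_addr, index):
    """Return the new IC value for an indexed, indirect jump."""
    # Lines 1-2: operand into Y, then index register + Y through the ALU.
    entry_addr = base_addr + index
    # Line 3: the adjusted address goes to the MAR and memory is read.
    mdr = memory[entry_addr]
    # Line 4: the retrieved table entry is moved into the IC.
    return mdr


# A hypothetical jump table at address 300 holding three routine addresses.
memory = {300: 1000, 301: 2000, 302: 3000}

for index in range(3):
    print(index, jnx(memory, 300, index))
```

Changing only the index register selects a different routine, which is what makes the table useful for dispatching on machine state.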

If the table entries are the addresses of programs, then the effect of the jump is to start up the program whose address was retrieved from the table.

Logically, we have the following hierarchy:

[Figure: addressing hierarchy for an instruction made up of an op code and an operand; labels: immediate value, immediate address, direct address.]

These examples demonstrate why it is desirable to have machine language instructions that utilize immediate or indirect addressing. Suppose that designated bits within the op code specify whether addressing is to be immediate, direct, or indirect. Then additional micro counter control signals can be added which respond to these.
  Let End i reset the micro counter (as End does) if the designated bits specify immediate addressing.
  Let End d reset the micro counter (as End does) if the designated bits specify direct addressing.
With these additional micro branches, a single microprogram can serve for all 3 addressing modes. A typical way to specify the addressing mode is to append a qualifier to the instruction mnemonic; e.g., LOAD* for immediate, LOAD for direct, and LOAD@ for indirect. Microcode for LOAD that uses this capability is as follows:
  LOAD<qual> <addr>:
    1. Addr out, MAR in, R in, Read, WaitM, End i
    2. MDR out, MAR in, R in, Read, WaitM, End d       (indirect address)
    3. MDR out, R in, End
On line 1, the program ends with the immediate value (<addr>) in R if addressing is immediate. On line 2, the program ends with the value directly fetched from memory transferred into R. Otherwise the retrieved indirect value is transferred into R in line 3. The Read in line 1 is anticipatory in case addressing is direct. If addressing is direct, the line 1 transfer into R is overridden by line 2. Likewise, the Read in line 2 is anticipatory in case addressing is indirect, and if so the line 2 transfer into R is overridden by line 3.
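The three-line LOAD microprogram with its early exits can be mirrored in Python as below. This is an illustrative sketch: the function `load` and the sample memory contents are hypothetical, and anticipatory reads of locations that do not exist in the sample memory are assumed to return 0.

```python
def load(qualifier, addr, memory):
    """Return (value left in R, microcode lines used) for LOAD*, LOAD, LOAD@.

    qualifier: "*" immediate, "" direct, "@" indirect.
    """
    if qualifier not in ("*", "", "@"):
        raise ValueError("unknown addressing qualifier")
    # Line 1: Addr out, MAR in, R in, Read, WaitM, End i
    mar = addr
    r = addr
    mdr = memory.get(mar, 0)     # anticipatory read in case addressing is direct
    if qualifier == "*":
        return r, 1              # End i: immediate value stays in R
    # Line 2: MDR out, MAR in, R in, Read, WaitM, End d
    mar = mdr
    r = mdr                      # overrides the line 1 transfer into R
    mdr = memory.get(mar, 0)     # anticipatory read in case addressing is indirect
    if qualifier == "":
        return r, 2              # End d: directly fetched value stays in R
    # Line 3: MDR out, R in, End
    r = mdr                      # overrides the line 2 transfer into R
    return r, 3


memory = {50: 60, 60: 7}         # location 50 holds the address 60; location 60 holds 7
print(load("*", 50, memory))     # immediate
print(load("", 50, memory))      # direct
print(load("@", 50, memory))     # indirect
```

Each extra level of indirection costs one more line of microcode and one more memory read, which the returned line count makes visible.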

Simplified Instructional Computer (SIC):
A widely used architecture for instruction in systems architecture and programming is the SIC machine described by Beck (Systems Software: An Introduction to Systems Programming, Addison-Wesley). This machine incorporates the kind of CPU elements that have been discussed, and its machine language can be easily represented using the microprogramming techniques just covered. First of all, the SIC hardware organization can be represented by almost the same block diagram exhibited earlier.

[Figure: SIC Hardware Organization (separate I/O:memory bus and CPU:memory bus). An input device and an output device connect to memory (shared data and instructions). Memory connects over a data path and communications link to the CPU (Central Processing Unit), which contains: control (Program Counter, SW Status Word, Instruction Decode, IR Instruction Register, MAR Memory Address Reg, MDR Memory Data Reg); working registers A, X, L (Accumulator, Index, Link); and the ALU (Arithmetic and Logic Unit, including support registers Y and Z).]

Note that only minor modifications are needed in the structure of this diagram. In essence, the working register set has been specified to consist of an accumulator A, an index register X, and a link register L. The Instruction Counter and Status Register are renamed and no other changes are necessary. For the CPU organization, the diagram becomes

[Figure: Single Bus CPU Organization for Implementing SIC. The CPU bus connects the SW, the PC, the instruction decoder and operand address circuit fed by the IR, the MAR and MDR (attached to the address lines and data lines of the CPU-memory bus), the working registers L, X, and A, and the ALU with its input register Y (inputs a and b, output c into register Z). Control lines carry ADD, SUB, etc. to the ALU.]

The SIC ADD instruction accumulates in A instead of R. COMP is exactly as described already, BGT is named JGT, and so forth. Load instructions are dubbed LDA, LDX, and LDL, respectively. Shift is not provided in the basic SIC machine, but is available under the extended version (SIC/XE), which requires means for dynamically identifying registers as related earlier. Arithmetic is available only for register A for the basic SIC machine, but is available for all registers under the extended architecture. Immediate and indirect addressing are available only for the SIC/XE version of the machine.

We've already seen the reason for having a register designated to provide indexing. The link register is one whose use includes automatic provision of the return address when jumping to a subroutine. In the basic SIC machine, the jump to a subroutine is called JSUB, which simply jumps to the address given by its operand after setting the link register. The microcode is as follows:
  JSUB <addr>:
    IC out, L in
    Addr out, IC in, End
Instruction fetch has already produced the address of the instruction immediately following the JSUB (the so-called return address). It is a simple matter to transfer it to register L before changing the IC to cause the jump to the subroutine.

The counterpart to JSUB is RSUB, which jumps to the value given by register L. Its microcode is:

    RSUB:
        L out, IC in, End

Note that RSUB requires no operand. If the subroutine also invokes JSUB, then it must first save register L and restore it before executing RSUB, or the subroutine will return to itself! High-level languages provide a stack structure so that the programmer doesn't have to worry about this detail (the current value of L is pushed onto the stack as part of the subroutine call, and is popped off the stack as part of the subroutine return).

Architecture enhancements: By separating the bus into an input bus and an output bus (which can be selectively tied together), CPU cycles can be saved. Consider

[Figure: Dual Bus CPU Organization. SR, IC, IR (instruction decoder and operand address), MAR, MDR, the working registers R ... R n, Y, and the ALU (inputs a and b, output c) connect to an OUTPUT BUS and an INPUT BUS, which are joined by a bus tie (Bt). MAR and MDR connect to the address lines and data lines of the CPU-memory bus. Control lines for ADD, SUB, etc. feed the ALU.]

Register Z has been eliminated in favor of the input bus. Recall that for the single bus architecture we had

    ADD <addr>:
        IC out, MAR in, Read, Clear Y, Set carry, Add, Z in       (instruction fetch)
        Z out, IC in, WaitM                                       (instruction fetch)
        MDR out, IR in                                            (instruction fetch)
        Addr out, MAR in, Read                                    (operand fetch)
        R out, Y in, WaitM                                        (operand fetch)
        MDR out, Add, Z in                                        (accumulate)
        Z out, R in, End                                          (accumulate)
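The seven lines of the single bus ADD can be stepped through with a small register-transfer sketch. This is only an illustrative model under stated assumptions: memory responds immediately (so WaitM and End are ignored), an instruction word is stored as an (opcode, address) pair, and the names `run_line`, `regs`, and `memory` are hypothetical.

```python
ADD = [
    ["IC out", "MAR in", "Read", "Clear Y", "Set carry", "Add", "Z in"],
    ["Z out", "IC in", "WaitM"],
    ["MDR out", "IR in"],
    ["Addr out", "MAR in", "Read"],
    ["R out", "Y in", "WaitM"],
    ["MDR out", "Add", "Z in"],
    ["Z out", "R in", "End"],
]

def run_line(regs, memory, signals):
    """Execute one microcode line on a single-bus CPU model."""
    bus = None
    for s in signals:
        if s.endswith(" out"):
            assert bus is None, "only one out signal may drive a single bus"
            name = s[:-4]
            # Addr out gates the operand address field of the IR onto the bus.
            bus = regs["IR"][1] if name == "Addr" else regs[name]
    if "Clear Y" in signals:
        regs["Y"] = 0
    alu = None
    if "Add" in signals:
        # The ALU adds Y, the bus value, and the carry-in (if set).
        alu = regs["Y"] + bus + (1 if "Set carry" in signals else 0)
    for s in signals:
        if s == "Z in":
            regs["Z"] = alu
        elif s.endswith(" in"):
            regs[s[:-3]] = bus
    if "Read" in signals:
        # Assumption: memory answers at once, so MDR is valid on the next line.
        regs["MDR"] = memory[regs["MAR"]]

memory = {0x10: ("ADD", 0x50), 0x50: 7}
regs = {"IC": 0x10, "R": 5, "Y": 0, "Z": 0, "MAR": 0, "MDR": 0, "IR": None}
for line in ADD:
    run_line(regs, memory, line)
print(regs["R"], hex(regs["IC"]))   # 12 0x11
```

Starting with R = 5 and the value 7 at address 0x50, the trace leaves 12 in R and advances the IC by one.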

For the dual bus modification, ADD becomes

    ADD <addr>:
        Bt enable, IC out, MAR in, Read                           (instruction fetch)
        IC out, Clear Y, Set carry, Add, ALU out, IC in, WaitM    (instruction fetch)
        Bt enable, MDR out, IR in                                 (instruction fetch)
        Bt enable, Addr out, MAR in, Read                         (operand fetch)
        Bt enable, R out, Y in, WaitM                             (operand fetch)
        MDR out, Add, ALU out, R in, End                          (accumulate)

Note that altering the architecture in this manner reduces the CPU cycles for ADD by 1 (from 7 to 6). Basically, splitting the bus eliminates the need for register Z. Two out signals are now allowed (as is the case for both line 2 and line 6), but only if they are on different buses and the buses are not tied.

Another technique that can be used is to utilize both halves of the clock cycle (half the time it is high, the other half low). By dividing the circuitry into components that activate on logic high (positive logic) and components that activate on logic low (negative logic), speed may be almost doubled. For example, the two lines of microcode

        Addr out, Bt enable, MAR in, Read
        R out, Bt enable, Y in, WaitM

do not have any register transfer signal conflicts (an in signal for the same register on each line), so the first could be accomplished while the clock is high and the second while the clock is low. This can be done by setting up the microcode table as two tables, the first of which provides microcode signals on clock high and the second on clock low. Each half of the table is addressed via the microcounter. Hence, table entries that have the same address represent consecutive lines of microcode. A microprogram will now need to have an even number of lines, with End appearing on the last line, even if it is the only control signal on the line. Under this strategy, the same register cannot be set on consecutive lines of microcode. Also, since a register sets up when its flip-flop CK lines go low, an out for a register should not be on the line immediately following an in for that register. For these reasons, either the first entry or the second entry of the pair may need to be left <null> (all signals off).

To illustrate, if this approach is used, single bus ADD becomes:

        IC out, MAR in, Read, Clear Y, Set carry, Add, Z in       (instruction fetch)
        <null>
        Z out, IC in, WaitM
        MDR out, IR in
        Addr out, MAR in, Read                                    (operand fetch)
        R out, Y in, WaitM
        MDR out, Add, Z in                                        (accumulate)
        <null>
        Z out, R in
        End

CPU time to execute the microprogram is reduced from 7 CPU cycles to 5, since the 10 lines occupy 10 half cycles. For the dual bus scenario, it can be shown that the time can be reduced to 3 CPU cycles with only minor code rearrangement.
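The half-cycle rules can be checked mechanically. The sketch below is an assumption-laden simplification: it inserts a <null> line whenever a line touches (with in or out) a register that was set on the line just before it, then makes the line count even with End last. The function names `registers` and `pack_half_cycles` are hypothetical, and Addr is treated as its own name rather than as part of the IR.

```python
ADD = [
    ["IC out", "MAR in", "Read", "Clear Y", "Set carry", "Add", "Z in"],
    ["Z out", "IC in", "WaitM"],
    ["MDR out", "IR in"],
    ["Addr out", "MAR in", "Read"],
    ["R out", "Y in", "WaitM"],
    ["MDR out", "Add", "Z in"],
    ["Z out", "R in", "End"],
]
NULL = ["<null>"]

def registers(line, direction):
    """Names of registers that have an 'in' or 'out' signal on this line."""
    suffix = " " + direction
    return {s[: -len(suffix)] for s in line if s.endswith(suffix)}

def pack_half_cycles(program):
    """Insert <null> lines so no register is set or gated right after being set."""
    lines = [[s for s in line if s != "End"] for line in program]
    packed = []
    for line in lines:
        if packed:
            just_set = registers(packed[-1], "in")
            touched = registers(line, "in") | registers(line, "out")
            if just_set & touched:
                packed.append(NULL)
        packed.append(line)
    if len(packed) % 2 == 1:
        packed.append(["End"])              # End alone on the final half cycle
    else:
        packed[-1] = packed[-1] + ["End"]   # End joins the existing last line
    return packed

packed = pack_half_cycles(ADD)
for line in packed:
    print(", ".join(line))
print(len(packed) // 2, "CPU cycles")       # 5 CPU cycles
```

For the single bus ADD this reproduces the ten-line listing above, with <null> on the second and eighth lines.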

CPU-memory synchronization: At the microcode level, the CPU can trigger memory Read, Write, and WaitM signals. Circuitry for how the signals are used to synchronize the data transfers is as follows:

[Figure: CPU-memory synchronization circuit. CPU side: Read, Write, WaitM, and a D flip-flop driven by Clock that produces the CPU Clock. Memory side: the Mhold flip-flop (set by Read or Write, cleared by Enable) and a D flip-flop driven by Clock 2 that produces Mx. A timing diagram shows Clock, Clock 2, CPU Clock, Mhold, Mx, and Enable for the case Read = 1 followed by WaitM = 1 and for the case Read = 1 & WaitM = 1.]

If Read or Write is active and memory is busy (i.e., Enable is 1), taking WaitM to 1 disables the CPU clock, effectively putting the CPU to sleep until Mhold is cleared by Enable going to 0.

Note: Mhold is set whenever memory is busy and Read or Write goes to 1. Synchronization between the CPU and memory occurs when Enable clears Mhold by going to 0. Memory setup occurs while Enable is 1. The memory action occurs when Enable goes to 0. Mx follows Mhold on the trailing edge of Clock 2. Enable falls to 0 when Mx = 1 and Clock 2 rises (clearing Mhold).

The timing considerations are given by the timing diagram, which shows two typical cases on the CPU clock line:

    A Read issued on a CPU cycle followed by WaitM issued on the next CPU cycle; e.g.,
        Z out, MAR in, Read
        R out, Y in, WaitM
    Both Read and WaitM issued on the same CPU cycle; e.g.,
        Addr out, MAR in, Read, WaitM

If Mx = 0, Enable holds at 0. If Mx = 1, then when the asynchronous clock signal (Clock 2) rises to 1, Enable falls to 0 and the Mhold ff is cleared, with Mx falling to 0 when the clock falls to 0. Hence, when neither the Read nor the Write signal is active, Enable = 0 and Mhold = 0. This is how the timing diagram starts.

In the first case, when Read goes to 1 in the CPU, the Mhold ff is set to 1 and Mx goes to 1 when Clock 2 falls. This in turn takes Enable to 1, and memory sets up the MDR based on the value in the MAR. When Clock 2 rises again, Enable falls to 0 (note that memory has had full setup time as represented by Clock 2) and Mhold is cleared. WaitM has been issued, but has no effect since the CPU clock only responds to Mhold when Clock falls. It should be noted that the CPU clock operates in phase with Clock except when held low by the synchronizing ff. In the case illustrated, no CPU cycles are lost and the MDR is available on the next clock cycle.

In the second case, when Mhold rises to 1 (suspending the CPU clock because WaitM = 1), Mx doesn't rise as quickly to 1, because the asynchronous match-up of Clock and Clock 2 is in a worst case scenario and Mx only rises when Clock 2 falls again. When Mx does rise to 1, memory has had a full setup period (Clock 2 low) by the time the CPU clock resumes in synch with Clock. Note that 2 CPU cycles have been lost.

Computer architects seek to define means to eliminate these kinds of wait states between memory and CPU. This may involve using cache memory (a fast intermediate memory between main memory and the CPU) to reduce clock differences. Of course, when the data is not in cache, the cache has to be reloaded, which may cost some CPU cycles.

Another technique is to pipeline data into CPU registers, so that in many cases the next item is in the pipeline. The cache can be loaded while the pipeline is being processed, and if more than one pipeline is employed, one pipeline can be loaded while another is being processed. If successive memory retrievals cross wide stretches of memory, then neither caching nor pipelining will help (and may actually hinder, because loading them requires time). This is normally not an issue, since typical programs operate within a compact area of memory.
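The effect of compact versus widely scattered memory references on a cache can be illustrated with a toy model. The notes do not specify a cache organization, so the direct-mapped design, the sizes, and the name `hit_rate` below are all assumptions chosen for the example.

```python
def hit_rate(addresses, num_lines=8, line_size=4):
    """Fraction of accesses served by a (hypothetical) direct-mapped cache."""
    tags = [None] * num_lines
    hits = 0
    for addr in addresses:
        block = addr // line_size          # which memory block holds the address
        index = block % num_lines          # the one cache line it may occupy
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block            # miss: reload the line from main memory
    return hits / len(addresses)

compact = [a % 32 for a in range(1000)]                 # loop within 32 words
scattered = [(a * 1024) % 65536 for a in range(1000)]   # wide strides through memory

print(hit_rate(compact))     # 0.992
print(hit_rate(scattered))   # 0.0
```

The compact pattern misses only while the cache first fills, whereas every scattered access lands on the same cache line with a different block and forces a reload.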

Inverting microcode: Microcode can be inverted to form a large logic circuit by examining what microcode signals are on at each time step $T_1, T_2, \ldots, T_n$. The microprogram sequences are examined at each of $T_1, T_2, \ldots, T_n$ for those signals each microprogram turns on. For example, Z out is on at $T_2$ for every case (since it is in line 2 of instruction fetch). It is also on at $T_6$ for BGT, BLE, RSHIFT, ADDX, and JNX, at $T_7$ for ADD, SUB, and BGTN, and at $T_9$ for ADDX. The Z out signal is then set in the large logic circuit via the combinational equation

$$Z_{out} = T_2 + T_6\,(\mathrm{BGT} + \mathrm{BLE} + \mathrm{RSHIFT} + \mathrm{ADDX} + \mathrm{JNX}) + T_7\,(\mathrm{ADD} + \mathrm{SUB} + \mathrm{BGTN}) + T_9\,\mathrm{ADDX} + \cdots$$

Similarly, IC out is set in instruction fetch and in branch instructions, leading to the combinational equation

$$IC_{out} = T_1 + T_4\,(\mathrm{ADD} + \mathrm{JSUB}) + T_5\,(\mathrm{BGT} + \mathrm{BGTN}) + \cdots$$

Specialized signals such as WaitM and End are also represented in combinational equations:

$$\mathrm{WaitM} = T_2 + T_4\,(\mathrm{LOAD} + \mathrm{COMP} + \mathrm{BGTN}) + T_5\,(\mathrm{ADD} + \mathrm{SUB} + \mathrm{STORE}) + T_6\,\mathrm{JNX} + T_7\,\mathrm{ADDX} + \cdots$$

$$\mathrm{End} = T_4\,(\mathrm{COMPR} + \mathrm{MOV2} + \mathrm{J}) + T_5\,(\mathrm{LOAD} + \mathrm{STORE} + \mathrm{COMP}) + T_6\,(\mathrm{BGT} + \mathrm{BLE} + \mathrm{RSHIFT}) + T_7\,(\mathrm{ADD} + \mathrm{SUB} + \mathrm{BGTN} + \mathrm{JNX}) + T_9\,\mathrm{ADDX} + \cdots$$

In this manner the combinational logic for setting signals at each time step of the counter is described. A block diagram for the CPU as a circuit is then given by:

[Figure: block diagram of the CPU as a circuit. A clock (with inhibit input) drives a counter (with reset input) whose decoder produces $T_1, T_2, \ldots, T_n$. The IR feeds an instruction decoder producing ADD, SUB, BGT, and so on. These lines, together with status flags (e.g., Mhold) and condition codes, feed the logic circuit that sets the control signals, including WaitM and End (& Mhold).]
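The inversion itself is a bookkeeping exercise: for each signal, collect the time steps and instructions at which it is on. The sketch below does this for the three microprograms given in this section (single bus ADD, JSUB, RSUB), so its equations are shorter than the ones above, which cover the full instruction set. The names `invert` and `equation` are hypothetical.

```python
from collections import defaultdict

FETCH = [
    ["IC out", "MAR in", "Read", "Clear Y", "Set carry", "Add", "Z in"],
    ["Z out", "IC in", "WaitM"],
    ["MDR out", "IR in"],
]
MICROPROGRAMS = {
    "ADD":  [["Addr out", "MAR in", "Read"], ["R out", "Y in", "WaitM"],
             ["MDR out", "Add", "Z in"], ["Z out", "R in", "End"]],
    "JSUB": [["IC out", "L in"], ["Addr out", "IC in", "End"]],
    "RSUB": [["L out", "IC in", "End"]],
}

def invert(fetch, programs):
    """Map each signal to {time step: instructions}; None means every instruction."""
    table = defaultdict(dict)
    for t, line in enumerate(fetch, start=1):
        for signal in line:
            table[signal][t] = None
    for name, body in programs.items():
        # Instruction-specific lines start right after the shared fetch lines.
        for t, line in enumerate(body, start=len(fetch) + 1):
            for signal in line:
                table[signal].setdefault(t, []).append(name)
    return table

def equation(signal, table):
    """Render the sum-of-products equation for one control signal."""
    terms = []
    for t in sorted(table[signal]):
        instrs = table[signal][t]
        terms.append(f"T{t}" if instrs is None else f"T{t}({'+'.join(instrs)})")
    return f"{signal} = " + " + ".join(terms)

table = invert(FETCH, MICROPROGRAMS)
print(equation("Z out", table))    # Z out = T2 + T7(ADD)
print(equation("IC out", table))   # IC out = T1 + T4(JSUB)
print(equation("End", table))      # End = T4(RSUB) + T5(JSUB) + T7(ADD)
```

Each product term pairs a decoded time step with the decoded instruction lines, which is exactly what the logic circuit ANDs together.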

Either approach will control the register transfer requirements specified by the microcode. In contrast to using a generic component which applies microcode from a table to set control signals, the large circuit is cast in concrete as a combinational circuit. The gain is in efficiency. The loss is that making changes to the system microcode requires major circuit modification. Modern microprocessors employ microcode tables embedded in firmware, so that a need to make changes to microcode only involves modifying the embedded table rather than the other circuitry. As a case in point, some years ago when the Intel Pentium was found to have a computational bug in its floating point routines, Intel was able to very quickly issue replacement processors which corrected the problem because the floating point operations were defined by microcode.

Vertical vs. horizontal microcode: The microcode as examined to this point has been viewed horizontally as a sequence of bits. Manufacturers often group logically related signals in much the manner used earlier to dynamically identify registers. For example, a 1 of 16 decoder can be used to select a signal using just 4 bits, a reduction of 12 microcode bits. This is OK so long as no more than 1 of the 16 signals needs to be selected at a time. In particular, since only 1 out signal can be selected at a time, all out signals could be selected in this fashion. Microcode employing this technique is called vertical microcode.

Managing the CPU and peripheral devices: The CPU is the central resource for a computer, and its failure precludes any utilization of the system otherwise. Moreover, whenever the CPU clock has been inhibited, the system is effectively shut down, so steps that reduce the probability of this occurring are advisable.
For example, if the CPU sends a signal to a printer and the CPU clock is inhibited until the printer responds, then no matter the state of the rest of the system, the computer is effectively shut down until a signal is received from the printer (perhaps the printer has not been turned on, or there is a cable problem). This tactic was commonly employed by earlier computers.

Memory-CPU synchronization is always necessary because of the tight coupling between the memory and CPU for instruction fetch, which means a possibility always exists for the synchronization circuitry to inhibit the CPU clock. Tactics such as instruction pipelines and memory caches are used to minimize this possibility.

Peripheral devices are not tightly coupled to the CPU, so peripheral-CPU synchronization does not have to be directly achieved. The tactic employed is called direct memory access. Direct memory access takes advantage of the fact that memory is not driven by a counter (in contrast to the CPU). For this reason data transfers between a peripheral device and memory can take place without suspending the memory clock to wait for the device to respond. Since peripheral devices operate at considerably slower speeds than either CPU or memory, a number of clock cycles may go by before a device response takes place, during which time there can be continued memory-CPU activity. When using direct memory access, peripheral-CPU synchronization is taken care of indirectly by memory-CPU synchronization, and the CPU clock does not need to be inhibited while waiting for peripheral device response.

When a program initiates a peripheral data transfer, the program usually must pause until the transfer has been accomplished. The CPU provides the signals that control a peripheral device's behavior, and there may be a driver program that causes the peripheral to step through its physical requirements. A peripheral device usually has its own controller, which responds to the signals received from the driver. Regardless of strategy, a program handling a peripheral data transfer will reach a point where it can go no further without a response from the peripheral device.

To meet the objective of keeping the valuable resource, the CPU, from being held up by slow peripheral response times, it is evident that means are needed to switch from a waiting program to one ready to run. This is normally accomplished by maintaining multiple programs in memory, devising means both for keeping track of these programs and for switching off to one of them when the currently executing program must pause. This is a primary task of the modern operating system. At the core of the operating system there is a supervisor program whose job is simply to manage other programs that are in memory.
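The idea of keeping several programs in memory and switching away from one that must pause can be sketched with Python generators standing in for programs. This is a loose analogy rather than a description of any real operating system: the names `program` and `supervisor` are hypothetical, and the sketch assumes a paused program's transfer has finished by the time its turn comes around again.

```python
from collections import deque

def program(name, bursts, log):
    """A toy program: each burst of work ends with a pause for a peripheral."""
    for i in range(bursts):
        log.append(f"{name} runs burst {i}")
        yield                        # must pause: waiting on a peripheral transfer

def supervisor(programs):
    """Give the CPU to the next ready program whenever the current one pauses."""
    ready = deque(programs)
    while ready:
        current = ready.popleft()
        try:
            next(current)            # resume the program until it must pause
            ready.append(current)    # keep track of it for a later turn
        except StopIteration:
            pass                     # the program has completed; drop it

log = []
supervisor([program("P1", 2, log), program("P2", 1, log)])
print(log)
# ['P1 runs burst 0', 'P2 runs burst 0', 'P1 runs burst 1']
```

The CPU (here, the Python interpreter) is never idle while some program is ready, which is the objective stated above.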
When a program wants to access a peripheral device, it does so by executing a supervisor call (SVC) machine language instruction. The supervisor does housekeeping (saving the state of the program that executed the SVC), initiates the peripheral data transfer, and turns the CPU over to a new program (via a machine language instruction such as JNX, after restoring the state of the new program). In this way the CPU no longer gets suspended by programs that initiate peripheral data transfers.

When a program is suspended, the current machine state (register values, including SR and the IC) must be saved. Note that the microcode for the SVC must save the program's current IC (in the manner of a JSUB), since starting the supervisor program changes the IC. Also, means must be provided to capture the SR. In making the context switch to the new program, the supervisor must restore the machine state of the program being resumed. This information is maintained in state tables that are under control of the supervisor.

It is important that the supervisor program periodically resumes execution so that every program in memory gets a turn with the CPU. Since SVC commands for peripheral device access may occur erratically, a timer is needed so that in the absence of any program executing an SVC, program control returns to the supervisor after a defined period of time has elapsed. This implies that a timer interrupt is needed to force a null SVC if a peripheral access has not occurred in the meantime. Both the timer and an interrupt capability represent an added hardware need.

At the hardware level, an interrupt is just a signal which when present redirects the End microbranch to branch to a microprogram that captures the IC and starts the supervisor (via a microbranch to the SVC microprogram). The interrupt capability can also be used as the means for a peripheral device to signal that it's done. When the supervisor program is run in response to an interrupt from a peripheral device, it conducts a context switch and resumes the program that executed the SVC which originated the peripheral device access. To determine the source of an interrupt, the supervisor needs to maintain information to match peripheral devices and programs that have a pending SVC action. For a timer interrupt, the supervisor simply needs to make a context switch to another program that is ready to run.

Since the supervisor program should not be interrupted, means are also needed to mask interrupts while the supervisor program is executing. A mask is just a (bit) signal which when present keeps an interrupt from manifesting itself; e.g.,

[Figure: gate with inputs labeled interrupt and mask.]

Masking bits are set by the microcode of the SVC instruction, to be relinquished when the supervisor completes the context switch.
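The interplay between an interrupt signal, its mask, and the End microbranch can be sketched as follows. The class `InterruptLine`, the function `end_microbranch`, and the sample IC values are hypothetical names and data for illustration; the gating rule (an interrupt shows only when present and not masked) is the one stated above.

```python
class InterruptLine:
    """One interrupt source gated by a mask bit."""

    def __init__(self):
        self.request = False   # raised by the timer or a peripheral device
        self.mask = False      # set while the supervisor is executing

    def visible(self):
        # The interrupt manifests itself only when it is present and not masked.
        return self.request and not self.mask

def end_microbranch(line, ic):
    """Decide where the End microbranch goes; returns (target, saved IC)."""
    if line.visible():
        line.mask = True                 # masking bit set on entry to the supervisor
        return "supervisor", ic          # the interrupted program's IC is captured
    return "instruction fetch", None

timer = InterruptLine()
timer.request = True
first = end_microbranch(timer, 0x120)
second = end_microbranch(timer, 0x300)   # request still raised, but now masked
print(first, second)
# ('supervisor', 288) ('instruction fetch', None)
```

The second call shows why a serviced request must also be deactivated: once the mask is released, a request that is still raised would be visible again immediately.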
The supervisor also must be able to deactivate an interrupt signal it has serviced so that the interrupt won't immediately manifest itself again on release of the mask. These kinds of considerations are covered in the context of an operating systems course.

In addition to providing this capability, the hardware also needs to support a capability of having privileged instructions (instructions that can only be used if the privilege signal has been activated; the SVC turns on this signal, in particular, so that the supervisor program can run privileged instructions). Privileged instructions (e.g., direct I/O instructions) are ones reserved for use of operating system software. They typically are instructions whose use in ordinary programs could compromise the operating system's ability to manage the CPU (e.g., using a privileged I/O instruction leads to an interrupt when the I/O operation completes; the supervisor only has the means for handling the interrupt if it is the one issuing the I/O instruction).

A +5V commercial microprocessor, the Z80: The Zilog Z80 microprocessor is an 8-bit processor that was first issued in 1976. Running a superset of the Intel 8080 instruction set, the chip was in wide use by 1980, perhaps most notably in the Radio Shack TRS-80, which was the first personal computer made available via a mass distributor, foretelling the future direction computing was to take with desktop machines. The Z80's advantages (low cost, +5V compatibility) have made it a favorite to this day, although it is now used primarily for embedded applications where processing power is not an issue (e.g., device controllers). The features of the Z80 are as follows:

    8-bit CPU in a 40 pin package
        16 address lines
        8 data lines
        13 control lines
        power, ground, clock
    158 instructions forming a superset of the Intel 8080
    64 KB address space

Another interesting feature of the Z80 is that it has a duplicate set of registers to support making a context swap on an interrupt; i.e., the registers of the interrupted program do not necessarily have to be saved (however, if a 2nd interrupt can occur, a register save will be needed).

The chip pin-outs are as follows:

    Addressing (pins 30-40, 1-5):                     A0 ... A15
    Data (pins 14, 15, 12, 7-10, 13):                 D0 ... D7
    Memory & I/O control:                             MREQ, IORQ, RD, WR
    Bus control:                                      WAIT, BUSRQ, BUSAK
    Interrupt control and miscellaneous controls:     INT, NMI, Reset, M1, RFSH, HALT
    Clock and power:                                  CK, GND, +5V

Bus control enables the CPU to share the data bus with another device. To access the bus, the device signals its request via BUSRQ. When the CPU finishes its current operation, it floats the address lines, data lines, I/O control and memory control lines, and signals back via BUSAK. The device is responsible for sending an interrupt signal to reactivate the CPU when it is finished with the bus.

The MREQ line signals that the MAR is ready for a Read or Write operation.

The IORQ line signals that the first 8 bits of the address bus have a valid I/O address for an I/O Read or Write operation. This signal is not used if memory-mapped I/O is being employed.

The RD and WR lines apply to both memory and I/O operations.

The WAIT line is used by memory or an I/O device to signal the CPU to enter a wait state until the signal is released (memory refresh continues via NOP operations; see below).

The INT line is for maskable interrupts (the command set provides the software controls).

The NMI line is for non-maskable interrupts.

The Reset line resets the internal CPU status and resets the instruction counter to 0.

The M1 line is a signal that is output at the start of instruction read (more than one memory fetch is necessary to get the whole instruction). The Z80 allocates extra time to the opcode read to provide time for refreshing dynamic memory. During the 2nd half of the opcode read, a counter value is placed on the first 7 address lines for the memory bank in need of refresh and the RFSH signal is raised.

The HALT signal stops CPU activity (except that NOPs continue to be executed to maintain memory refresh); an interrupt is needed for the CPU to resume.

The Z80 can be clocked cycle by cycle via the clock input.

Many designs of simple Z80 implementations have been devised. The following 6-chip design is from Tanenbaum.

[Figure: 6-chip Z80 design. The Z80 (with a Reset button; signals shown are M1, IORQ, RFSH, HALT, BUSAK, INT, NMI, WAIT, BUSRQ, MEMRQ, Reset, WR, and RD, some marked float) shares an 8-bit data bus and the address lines with a 2K × 8 EPROM (CS, OE), a 2K × 8 RAM (CS, OE, R/W), and a PIO (CS, WR, RD). Addressing: lines A15 and A14 select among the EPROM, the RAM, and the PIO (memory mapped). Controls are active on signal LOW.]

The PIO is a (+5V compatible) chip providing parallel I/O ports. EPROM is erasable programmable read-only memory, which can be erased using a strong ultraviolet light source and programmed using an EPROM programmer. SRAM is static random access memory, a designation for memory that does not need to be refreshed to maintain its values (i.e., it is composed of flip-flops). The counterpart, dynamic memory, requires periodic refreshing, and uses a different technology than gate logic. Dynamic memory provides greater capacity for less cost, but at the expense of speed.

The design is complete except that a control program is needed for the EPROM (systems software for I/O, including a display device, in particular), a CPU clock is needed, and a power supply is needed (3 D-cell batteries will do). Note that address 0 maps to the EPROM, which is where the Z80 initiates program load on Reset.

Representative pricing for the configuration (Z80, PIO (MK 3881), two additional chips, 2K SRAM, and 2K EPROM) comes to a total of $7.46, with the EPROM listed at $2.25.
