Transparent D-type Flip-Flop

The RS flip-flop forms the basis of a number of 1-bit storage devices in digital electronics. One such device is shown in the figure, where extra combinational logic converts the input signals into appropriate R and S signals to control the RS flip-flop, which is the data storage part of the circuit. The table to the left shows the operation of the circuit in response to the input signals. The D input provides the data value, 0 or 1, to be written into the store, while the W̄ signal is a control signal that determines when new data is written into the RS flip-flop. This signal can be active or inactive, and like many control signals in electronics, it is active low, i.e. 0 on it causes the circuit to write the data value into the flip-flop, while 1 on it sets the flip-flop into storage mode. This can be identified in the table and in the circuit. When the W̄ signal is active, the value on the D signal controls the value written into the store, which is reflected in the value on Q. It may also be noted that the signal combination, R=1 & S=1, which was remarked as being not useful when using the RS flip-flop as a memory, is not generated by this circuit. This circuit is called a transparent D-type flip-flop. D-type reflects the fact that it has a D input on which data is entered; transparent reflects that when the W̄ signal is active any change on D immediately changes the stored value and the output value Q, i.e. data passes straight through. It is important to note that the value stored in the device, when W̄ is inactive, is the value on D at the moment that W̄ goes from active to inactive: a 0 to 1 transition. This circuit is more recognisably a memory than the RS flip-flop, as the data and control signals are readily identifiable. To the right is shown the symbol that is used in later figures to represent a transparent D-type flip-flop.

Registers and Memories

The long bar over the name of a control signal, as in W̄, indicates that it is active low.
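The behaviour just described can be checked against a small model (a Python sketch; the names D, w and Q follow the text, with w == 0 meaning the active state of the W̄ control):

```python
class TransparentLatch:
    """Behavioural model of a transparent D-type flip-flop.

    The write control w is active low: while w == 0 the latch is
    transparent and Q follows D; when w is 1 the latch holds the
    value D had at the moment w went from 0 to 1.
    """

    def __init__(self):
        self.q = 0

    def update(self, d, w):
        if w == 0:        # control active: data passes straight through
            self.q = d
        return self.q     # control inactive: stored value unchanged

latch = TransparentLatch()
latch.update(d=1, w=0)    # transparent: Q becomes 1
latch.update(d=0, w=1)    # storage mode: the change on D is ignored
```

Note that the second call leaves Q at 1: the value stored is the one present on D when w made its 0 to 1 transition.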
Control signals are active low for a variety of electrical reasons that are beyond the scope of these notes.

Large blocks of memory can easily be developed from transparent D-type flip-flops. The circuit below left has 4 of them with separate data inputs, but a common W̄ signal. All 4 flip-flops are written at the same time, but each stores a different data value. Thus this circuit operates as one 4-bit memory. The figure to the right again has 4 flip-flops, but here each has a separate control line, and a common data input signal. These are 4 separately selectable 1-bit memories: a value is set on the data input and can be written into one of the memories by activating the appropriate control input.
A combination of the forms of the previous 2 figures is used in the next circuit to produce 2 memories each holding 2 bits of data. The upper pair form one 2-bit memory, the lower pair another. The flip-flops of each 2-bit memory have separate data inputs, but a common write control input, so that both are written together. In this circuit data is provided on the data bus lines, D0 & D1, at the bottom of the figure. It is anticipated that there is some other device, e.g. a microprocessor CPU, that controls the setting of data on this bus and activates the control signals to write to one of the memories. The data bus lines each connect to one device of a pair, e.g. D0 to the upper flip-flop of each pair. A bus is a group of related signals that potentially connect to a large number of devices.

A Small Memory Circuit

Writing to one of the memories requires a sequence of actions:-
- data is output on to the data bus lines by the writing device,
- the appropriate write control line is activated, allowing the data into the selected memory,
- after a short delay to allow the flip-flops to change their state to the new data values, the control line is de-activated, completing the write,
- the data value on the bus can be changed at any time after this.

This deals with writing data into either of the memories, but a memory is of no use unless the data can be read back. In this circuit, as is common in many memory circuits, data is read back over the same data lines that are used to write data into the memories. This reduces the number of connections required, but demands that more than one device outputs its data on the same wire.

Outputting Data on to a Shared Wire

In the circuit above, it is not possible to just connect the Q outputs from the memories directly to the bus lines.
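The write sequence can be sketched in code (a Python sketch; the dict-based model of the two 2-bit memories and the function name are illustrative, not part of the circuit):

```python
# Illustrative model: two 2-bit memories sharing a 2-line data bus.
memories = {0: [0, 0], 1: [0, 0]}
data_bus = [0, 0]                     # lines D0 and D1

def write(mem_id, bits):
    """Follow the write sequence described in the text."""
    data_bus[:] = bits                # 1. writing device drives the bus
    # 2. the write control line of the selected memory is activated,
    # 3. after a short delay the flip-flops take on the bus values,
    memories[mem_id][:] = data_bus
    # 4. the control line is de-activated; the bus may now change.

write(1, [1, 0])                      # memory 1 now holds D0=1, D1=0
```

The key point the model captures is that only the selected memory changes: the other memory's control line stays inactive, so it ignores the bus.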
Any single wire can only have one value placed on it at a time, which means that only one device can be outputting a value on the wire at any instant (an alternative way of saying this is that only one device can drive the wire at any time). Data corruption or even a short-circuit can occur otherwise: imagine if the Q outputs of the upper flip-flop of each pair were directly connected to D0 with one Q at 0V and the other at 5V; then there would be a short circuit between the power supply terminals via the Q outputs and D0. Thus where several devices can drive a wire, i.e. the wire is shared by the outputs, a switch is needed between each output and the wire, so that only one output is connected to the wire at a time while the others are disconnected. Obviously, such switches need to be electrically controlled to make and break the connections, and a transistor would make an appropriate switch. However, it is usual for various electrical reasons to use a tri-state driver instead of just a single transistor as the switching device. A tri-state driver acts as a switch, but with a uni-directional information flow, unlike a transistor, which is bi-directional.
Tri-State Driver

The tri-state driver is represented by the symbol shown. The connection on the left is the input and that on the right is the output. The connection at the bottom is the control (or enable) input that determines whether the output is driven with the input value or not. The circle on this connection indicates that this control is active low, i.e. that the tri-state driver is enabled with 0V on this input. The truth table for the device is shown on the right. The first 2 rows show the output following the input when the driver is enabled. The last 2 rows show that with the driver disabled, the output is not controlled by the driver; whatever other circuitry is connected to this output has to be examined to determine the voltage on the output. This last output state is the third state of the device, which is why it is called a tri-state driver.

Reading from the Memory

In the 2-bit memory, there is a tri-state driver on the output of each flip-flop, with a common control for the drivers of each pair, so that by activating the control the data from a pair is placed on the data bus wires. It is to be assumed that there are similar drivers on the output of whatever device drives the data bus during writes to the memory; it is this device that controls the reading process and the activation of the read control signals. The actions required to read the memory are:-
- the controlling device disconnects its own outputs from the data bus, so that the data bus is not driven by anything,
- one of the read control lines is activated and the Q outputs of the selected flip-flops drive the data bus,
- the controlling device reads in the data values from the data bus lines (basically it writes the data into a similar memory within itself),
- the read control is de-activated so that the tri-states are not enabled and the data bus wires are no longer driven.

One transfer has now occurred on the data bus, and the data bus is free for a further read or write transfer.
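The disconnected third state is what makes sharing a wire safe, and it can be modelled in a few lines (a Python sketch; using None for the undriven, high-impedance state is a modelling choice, not anything electrical):

```python
def tristate(value, enable_n):
    """Tri-state driver with an active-low enable: pass the input
    through when enabled, otherwise leave the output undriven."""
    return value if enable_n == 0 else None

def resolve(outputs):
    """Value on a shared wire: at most one driver may be enabled."""
    driven = [v for v in outputs if v is not None]
    if len(driven) > 1:
        raise RuntimeError("bus contention: two devices driving the wire")
    return driven[0] if driven else None

# Two memories share one bus line; only the first driver is enabled,
# so its value appears on the wire and the other is disconnected.
bus = resolve([tristate(1, 0), tristate(0, 1)])
```

Enabling both drivers at once makes `resolve` raise an error, which is the software analogue of the short-circuit described above.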
Larger Memories & the need for Addresses

[Truth table for the tri-state driver of the previous section:
 Control  Input  Output  Comment
   0        0      0     switch closed
   0        1      1     switch closed
   1        0      ?     voltage on output not controlled by device
   1        1      ?     voltage on output not controlled by device]

The memory on the previous page can be regarded as a 2x2 (2-by-2) memory, as it has 2 memories (or 2 memory locations) with 2 bits of storage per memory or per memory location. It is obvious that this memory can be extended in 2 ways: by increasing the number of memories and by increasing the number of bits stored in each memory. There is a limit to the number of bits required per memory, but the number of memories can get very large. For example, it is trivial to modify our memory to have 8 bits per memory, so that each memory can hold a byte of data, as the number of data lines to the memory only grows to 8, regardless of the number of memories. Extending the number of memories by a small amount, e.g. to 16 or 32 memories, and the number of bits per memory to the same sort of number is quite feasible, and such a memory could be used within a CPU to provide it with a set of registers. However, extending the number of memories to produce a memory external to the CPU with anything near the number of memory locations available in commercially available devices, e.g. 256K locations (where 1K = 1 binary K = 2^10), increases the number of read and write control signals to unmanageable numbers. [I have switched to using the term locations instead of memories, as in most computer systems the number of available locations external to the processor is usually greater than the number of memories installed, i.e. not all the available memory locations have memory installed. However, the processor has to have the potential to access all the available locations in the event that they were all occupied by installed memory.]
The solution is to generate selection signals locally to the memory, i.e. within the memory chip alongside the flip-flops, from a smaller number of signals sent from the controlling device. To do this each memory location within a memory device is assigned an address, starting from 0, and to access the memory only the address of the location to access and the action to be performed are sent to the memory device. The action itself can be specified by having 2 new lines: one to specify that a read is required and one to specify that a write is required. [There are other alternatives to this, but 2 wires are always needed to cover the 3 possible activities: a read transfer, a write transfer, and no transfer.] The number of address wires required is equal to the number of bits in the largest address, and this is much smaller than the number of memory locations: it is log2 m, where m is the number of locations. For a memory with 256K memory locations (256 × 2^10 = 2^18 locations) the number of address lines required to identify any location is just 18. It can be seen that the number of address lines grows very slowly compared to the number of memory locations: increasing the number of locations by a factor of 2 requires 1 extra address line; increasing by a factor of 1024 (2^10) requires only 10 extra address lines. For the 2x2 memory example, the 2 memories can be given addresses, 0 and 1, and only one address wire is needed to select between them. The generation of the necessary control signals at the memory is shown in the following circuit, which is an address decoding circuit. It decodes the address to generate individual control signals to the different memory locations. When READ is active (0), RE0 is active (0) if ADDRESS is 0, and RE1 is active if ADDRESS is 1. The write control signals are similarly activated.
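Both the log2 relationship and the little decoder can be checked directly (a Python sketch; the names RE0/RE1/WE0/WE1 are assumptions for the read and write enables of the two locations, and all control signals are taken as active low):

```python
import math

def address_lines(locations):
    """Number of address wires needed for a given number of locations."""
    return math.ceil(math.log2(locations))

def decode(address, read_n, write_n):
    """Address decoder for the 2x2 memory (all signals active low)."""
    re0 = 0 if (read_n == 0 and address == 0) else 1
    re1 = 0 if (read_n == 0 and address == 1) else 1
    we0 = 0 if (write_n == 0 and address == 0) else 1
    we1 = 0 if (write_n == 0 and address == 1) else 1
    return re0, re1, we0, we1

address_lines(256 * 2**10)   # 256K locations need 18 address lines
decode(1, 0, 1)              # read from address 1: only RE1 goes active
```

Doubling the number of locations adds exactly one line, since `math.log2` grows by 1 each time its argument doubles.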
Once an address is used to specify a location, it becomes impossible to access more than one location at a time, hence the statement earlier that only one location is written at a time.

Memory Configuration

The organisation of a memory device, the number of memory locations, m, and the number of bits, n, per location that it has, is called its configuration, and there is a recognised way of writing it down: m x n. Thus, the 2-bit memory has a configuration of 2x2: 2 locations with 2 bits per location. For memory chips the number of locations is always a power of 2 and the number of bits per location is 1, 4, 8, or occasionally 16. Configurations such as 16Kx8, 256Kx8 and 4Mx1 are available.
A 16x1 Memory

A slightly larger memory circuit is shown in the next figure. Its configuration is 16x1: 16 locations with one bit per location. The basic 1-bit memory cell encapsulates a transparent D-type flip-flop and a tri-state driver. This cell will perform a transfer only if both its row and column select inputs are active (0): a write will occur if the write control is also active, or data is output if the read control is active; this last signal activates the tri-state buffer driving the output. The address decoding logic is shown along the top and left. The 4 address lines enable the selection of any of the 16 (2^4) locations. The address is divided in 2, with the top 2 lines, A3 & A2, of the address identifying a row, and the lower 2 lines, A1 & A0, identifying a column: the device at the intersection of the selected row and column is the one to be accessed; only this device has both its row and column selectors active. The OR-gates along the top and left check the address lines and enable the appropriate row and column lines: only one of each is active at any time. The DATA input at top right is bi-directional, i.e. data is input to the memory on it during a write, and is output on it during a read. The Read and Write control signals indicate the action to be performed.

[Figure: 16x1 Memory, with cells M0 to M15 arranged in a 4x4 matrix.]

The organisation of this memory as a matrix of rows and columns has been chosen to reflect the organisation found in commercially available memory chips. Doing this reduces the size of the address decoding logic.

Memories in Parallel

Having a memory with 1 bit stored at each location might seem a little odd, since in computer systems it is usual to store 8, 16 and 32 bit data values. However, memories with more bits/location (greater width) can be built by using smaller memories in parallel. The next figure shows how 8 of the 16x1 memories of the previous page can be put together in parallel to form a 16x8 memory: 16 locations with 8 bits stored at each location.
To keep the diagram manageable, 4 of the memories are not shown.
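The row-and-column selection inside the 16x1 memory can be sketched as follows (a Python sketch; numbering the cells M0 to M15 along the rows is an assumption about the figure's layout):

```python
def select_cell(a3, a2, a1, a0):
    """Split a 4-bit address: the upper two bits pick one of four
    rows, the lower two pick one of four columns; only the cell at
    the intersection has both of its select inputs active."""
    row = 2 * a3 + a2
    col = 2 * a1 + a0
    return row * 4 + col          # cell number, M0 to M15

select_cell(1, 0, 1, 1)           # address 1011: row 2, column 3
```

With 4 rows and 4 columns only 8 decoder outputs (4 + 4) are needed, instead of the 16 a flat one-of-16 decoder would require, which is why the matrix organisation reduces the decoding logic.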
The address lines and control lines all go to each 16x1 memory, so that all 8 memories perform the same operation in parallel, accessing the same internal address, with the only difference in the circuit being that each 16x1 memory is attached to a different data bus line. Thus during a transfer one location in each sub-memory is accessed to read or write 1 bit in each 16x1 memory, and in total a byte of data is either written into or read out of the 8 sub-memories. This is how large memories are built in computer systems, e.g. IBM PCs and Sun workstations, but using larger chips such as 4Mx4: 8 of these in parallel form a 4Mx32 memory.

[Figure: 16x8 Memory built from 8 of the 16x1 memories; the 4 memories connected to D2, D3, D4 & D5 are omitted from the drawing.]

A Circuit with Commercial Memories

The final memory circuit shows 4 commercially available memories, Intel 2128 static RAM devices with configuration 2Kx8, arranged to provide 8 Kbytes of memory. The circuit is designed for a CPU which can address a maximum of 64 Kbytes of memory and that can transfer 8 bits in a single transfer with memory. [The memories are quite small and have been available for about 28 years, while the CPU has a small address and data capability, but the circuit is real for all that.] In this circuit, on any access at most one of these 4 memory devices should respond by reading or writing a byte of data: if none responds, the access is to some other circuit in the computer; this circuit is only a portion of the system.
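The parallel arrangement of 16x1 memories can be sketched before looking at the commercial circuit (a Python sketch; modelling each sub-memory as a plain list of 16 bits is illustrative only):

```python
def read_byte(sub_memories, address):
    """Read one byte from eight 16x1 memories in parallel: bit i of
    the result comes from the same address in sub-memory i."""
    byte = 0
    for i, mem in enumerate(sub_memories):
        byte |= mem[address] << i
    return byte

mems = [[0] * 16 for _ in range(8)]   # a blank 16x8 memory
mems[0][5] = 1                        # bit 0 of location 5
mems[7][5] = 1                        # bit 7 of location 5
read_byte(mems, 5)                    # location 5 reads back as 0x81
```

All eight sub-memories see the same address and control signals; only the data line differs, exactly as in the circuit.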
Since the address space of the CPU is 64K addresses, the Address bus has 16 lines (2^16 = 64K) named A0, A1, ... A14, A15, and since the CPU can only transfer 1 byte in an access, the Data bus has only 8 lines, D0, D1, ... D7. Each memory is attached to all 8 lines of the Data bus, since each holds 8 bits within each location, and is attached to the read and write control lines from the CPU. Each memory device is also connected to the least significant 11 lines (A10-A0) of the address bus: the signals on these lines select one location of the 2K locations (2K = 2^11) within each device. The Chip Enable input at the top of each device is the means by which one of the devices can be selected to respond to an access. This is an active low input, which when activated enables the device to perform a transfer. Obviously the activation of a particular chip is driven by the address output with the access, and the Chip Enables are driven by address decoding logic. To make full use of the addressing range of the CPU, its address space, a unique 16-bit address must be allocated to each memory location within each chip. Since within a chip each location has already been assigned an 11-bit address, so that the internal hardware can select it, it is necessary to assign a further 5 bits to give each a 16-bit address. Obviously the internal operation of the chip cannot be changed, but an extra 5 bits can be assigned to a chip as a whole, so that all locations within the device get the same extra 5 bits. By assigning different 5-bit patterns to each chip, unique 16-bit addresses are given to every memory location, made up of the 5-bit chip address and the 11-bit internal assignment. The address decoding logic identifies the chip to select from the bit pattern on the upper 5 address lines (A15-A11).
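The division of a 16-bit address into a chip-select field and an internal field is just a shift and a mask (a Python sketch of the decoding described above):

```python
def split_address(addr):
    """Split a 16-bit CPU address into the 5-bit chip-select field
    (A15-A11) and the 11-bit internal address (A10-A0)."""
    chip = addr >> 11          # upper 5 bits go to the decoding logic
    internal = addr & 0x7FF    # lower 11 bits go to every chip
    return chip, internal

split_address(0x0805)          # chip pattern 00001, internal address 5
```

The hardware performs the same split simply by wiring: the lower 11 lines run to every chip's address pins, the upper 5 only to the decoding logic.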
Activation of one of the memories occurs when the pattern on the 5 top address lines (A11-A15, the lines that don't go to the memory address inputs) matches the 5 bits assigned to that memory. The matching process and subsequent memory activation is controlled by the address decoding logic at the top of the figure. If the decoding logic is analysed, it can be seen that the 5-bit patterns assigned to the 4 memories are (A 00000, B 00001, C 00010, D 00011): remember that it is a 0 on the Chip Enable input that activates a memory device. Thus, the lowest and highest assigned addresses in each memory are:-

Memory  5 bits assigned  Lowest address (binary)  Highest address (binary)  Lowest (hex)  Highest (hex)
  A         00000        0000 0000 0000 0000      0000 0111 1111 1111          0000          07FF
  B         00001        0000 1000 0000 0000      0000 1111 1111 1111          0800          0FFF
  C         00010        0001 0000 0000 0000      0001 0111 1111 1111          1000          17FF
  D         00011        0001 1000 0000 0000      0001 1111 1111 1111          1800          1FFF

Thus the 4 memories occupy the CPU address range from 0000₁₆ to 1FFF₁₆: 8K locations, the same as the total number of locations in all 4 memories, and 1/8 of the CPU Address Space. The remaining 7/8 of the address space can be used to address other devices, perhaps some ROM or some dynamic RAM memory or some I/O devices (see later).

Other Memory Types

Read-Only Memory (ROM) is memory which keeps its contents when the power is turned off, and usually holds a start-up program to be run when the computer is turned on, a bootstrap program, e.g. the BIOS ROM in PCs. Some ROM types (electrically erasable and programmable ones) can have their contents written, albeit slowly, while within a computer system, using special sequences (some BIOS ROMs), while other types have to be removed from the computer system to be written. RAMs are writeable memories (they should really be called Read-Write Memory (RWM)), which have the same read and write access times (unlike writeable ROMs, which are fast to read but slower to write). RAM stands for Random Access Memory, which is what read-write memory is usually called, despite the fact that ROM can also be randomly accessed - any location can be accessed in any order. RAMs come in 2 forms: static and dynamic.
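The address ranges follow directly from the 5-bit chip patterns (a Python sketch; it assumes the four chips occupy consecutive 2K blocks from address 0, which is what the overall 0000 to 1FFF range implies):

```python
def chip_range(pattern):
    """Lowest and highest 16-bit addresses of a 2Kx8 chip whose
    5-bit chip-select pattern sits on address bits A15-A11."""
    low = pattern << 11        # pattern in the top 5 bits, zeros below
    high = low | 0x7FF         # same pattern, ones in the low 11 bits
    return low, high

for name, pattern in zip("ABCD", range(4)):
    lo, hi = chip_range(pattern)
    print(f"{name}: {lo:04X}-{hi:04X}")   # A: 0000-07FF ... D: 1800-1FFF
```

Each chip's 2K (0x800) block starts where the previous one ends, so the four together cover 0000 to 1FFF with no gaps or overlaps.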
Static memory is the memory form examined above, and the basic 1-bit store is usually based upon a pair of inverters with feedback to store a data bit, as earlier. Dynamic memory uses a completely different method to store a bit: it stores a small amount of electrical charge in each 1-bit store. The presence or absence of the charge is used to determine the state of the memory, 0 or 1. Since charge tends to leak away over a short period, these memories have to be periodically read and re-written to keep their data. Measuring charge is difficult and takes some time. Thus, these devices are more complicated and slower (access time around 70 ns) than static memory (access time around 10 ns), although their organisation in a matrix of rows and columns is similar to the devices examined here. Their big asset is that only one transistor is needed in each memory location to store 1 bit, rather than the 6 transistors used in a static memory. Thus for the same number of transistors, 6 times more data can be stored in a dynamic memory. Thus dynamic memories are preferred for large memory systems, e.g. main memory in personal computers and workstations, but not for small or fast memory systems, e.g. cache memory. Both static and dynamic memories lose their data when power is removed. Some systems use CMOS static RAM and keep it supplied with power even when the rest of the system is shut down, which maintains the contents of the RAM. This uses little power, since when not being switched CMOS logic uses only a very small amount of power, which can be provided by a very small battery. This is also done in small electronic diaries. Similar systems are used in most machines to keep the clock running and up-to-date while systems are powered off.
Machine Cycles

When a computer is executing a program, the CPU actions can be divided into 2 types: internal operations, such as incrementing a register or adding 2 values together, and external operations, which consist solely of operations to read memory (fetching an instruction or reading a data value) or to write memory (writing a data value). A single transfer with memory, as presented earlier, requires an address and a control signal to determine the operation, and during the transfer data will be passed between CPU and memory. Such a transfer is called a Machine Cycle, and this name may be qualified by the transfer type, e.g. Read Machine Cycle, Write Machine Cycle. There is a lower limit on the time to perform a machine cycle, and the sequence and relative timing of signal changes within a cycle is defined in the data sheet for the CPU, which is provided by the manufacturer. This sequencing and timing information for a machine cycle is usually defined in a timing diagram for the cycle. This shows the signals involved in the cycle, their logical values, and the timing relative to the start of the cycle. A simplified timing diagram for a Write Machine Cycle is shown in the next figure. The top signal in the diagram is the CPU's clock input signal. This signal is usually a square wave (the low and high levels have equal lengths in time) and it is this signal which drives the CPU. The CPU has no idea of time; something is required to make it change from its current operation to the next one: from its current internal state to the next. This is the Clock's function, and it is the rising and falling clock edges (the transitions from 0 to 1 and 1 to 0) that drive the change.

The Clock Signal

The period of the clock is the time to perform one complete cycle: the time from the start of a low period, through the 0 to 1 transition and the high period, to the end of the 1 to 0 transition.
The inverse of the clock period is the clock frequency. A clock with a frequency of 1 GHz has a clock period of 1 ns (1 ns = 10^-9 seconds). This is a typical clock frequency for a modern CPU, although there are several with higher frequencies and some with lower.
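The period/frequency relationship is a one-line calculation (a Python sketch):

```python
def clock_period_ns(frequency_hz):
    """Clock period in nanoseconds: the inverse of the frequency,
    scaled from seconds (1 s = 1e9 ns)."""
    return 1e9 / frequency_hz

clock_period_ns(1e9)    # a 1 GHz clock has a 1 ns period
```

Halving the frequency doubles the period, so a 500 MHz clock gives the CPU 2 ns per clock cycle.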
With no sense of the passage of time, some signal change is required to cause the CPU to move to its next action. All of the CPU's activities are controlled by the clock, so that the initiation of signal changes in a machine cycle is done by the clock edges, while the minimum length of a cycle is controlled by the clock period: see box. This is why the clock is shown in the timing diagram: all changes on signals output from the CPU are caused by a clock edge; all input signals are examined on a clock edge. Thus in the diagram the following can be seen:-
- the first rising edge in the machine cycle causes the output on the Address bus of the address of the memory location to be written,
- the second rising edge causes the data to be written to be output on the Data bus,
- the second falling edge in the cycle sets the W̄ signal active,
- the third falling edge causes the control signal to be de-activated, ending the write operation,
- the fourth rising edge is the start of the next machine cycle: the values on the Address and Data buses do not change again before this point in the cycle.

Minor Notes on the Example Timing Diagram

There are some points to note about the cycle:-
- There is a delay from a clock edge until a signal changes: nothing changes infinitely quickly, and the clock has to go into the CPU chip, cause a change, and the new signal values have to make their way off the chip.
- The address and data are shown with both high and low levels at the same time; this reflects that both 0s and 1s are equally valid for these signals. The key points are where these signals change.

The time to access memory is usually the dominant factor in the time to execute an instruction, since operations internal to a CPU are usually much faster than operations external to the CPU that require signals to leave the CPU chip.

Timing Diagrams for Real CPUs

Timing diagrams for real CPUs are similar to the example, but differ in the exact timing and sequencing, and also in the signals active in the cycle.
There is also usually a mechanism, not shown here, by which a memory can send a signal to the CPU to lengthen a machine cycle, to give the memory more time to respond. This capability makes it more difficult to judge how long a CPU takes to read or write a value from memory.

Clock Frequency and Computer System Performance

The clock frequency gives an idea of the speed of operation of the CPU. However, it is difficult to judge relative performance between different manufacturers' CPUs from the clock frequency, since internal architectures differ, which affects the response to the clock edges. It is also difficult to judge a system's performance from its CPU's clock frequency, since performance depends on other factors besides the CPU, e.g. the memory size and speed, the presence of cache, and hard disk performance.