Pausible Clocking Based Heterogeneous Systems

Size: px
Start display at page:

Download "Pausible Clocking Based Heterogeneous Systems"

Transcription

1 Pausible Clocking Based Heterogeneous Systems Kenneth Y. Yun, Member, IEEE Ayoob E. Dooply, Student Member, IEEE Abstract This paper describes a novel communication scheme, which is guaranteed to be free of synchronization failures, amongst multiple synchronous and asynchronous modules operating independently. In this scheme, communication between every pair of modules is done through an asynchronous channel; communication between a module and the is done using a requestacknowledge handshaking. hronization of handshake signals to the local module clock is done in an unconventional way [1], [2], [3], [] the local clock built out of a ring oscillator is paused or stretched,if necessary, to ensure that the handshake signal satisfies setup and hold time constraints with respect to the local clock. In order to validate this scheme, we implemented a test chip in 0.5µm CMOS. This chip is designed as a ring, composed of two synchronous modules, an asynchronous module, and two asynchronous s. Each module functions as a receiver to one module and a sender to another module. Test results show that the chip functions reliably up to 56MHz. Keywords lobally asynchronous locally synchronous, Stretchable clock, hronization, Heterogeneous systems I. INTODUCTION The next generation digital VLSI systems will necessarily be based on system-on-a-chip concepts, in order to satisfy unrelenting demands for higher performance and also to accommodate smaller packaging and low power requirements. These on-chip systems will consist of multiple independently synchronized modules: some may be clocked modules, such as synchronous processor cores, while others may be clockless (asynchronous) modules, such as peripheral controllers. These chip designs will resemble today s complex board-level designs. On-chip modules will be held together by interface glue logic which must facilitate high speed communication between synchronous modules operating at different clock frequencies, between synchronous and asynchronous modules, and between asynchronous modules. The key difference between today s board-level designs and the future system-on-a-chip designs, however, is the speed at which communications take place. A first step toward such heterogeneous system design on a chip is a reliable high-speed communication scheme among multiple synchronous modules operating independently. We examined a variety of communication schemes that attempt to mitigate synchronization failure without sacrificing communication throughput. They generally fall into one of two categories: (1) brute-force synchronization of communication signals to each module s free-running clock with an acceptable level of synchronization failure; (2) adjustment of individual synchronous module s local clock, when necessary, to avoid synchronization failure. The first category includes methods such as the well-known double-latching scheme and a natural extension of the doublelatching scheme called pipeline synchronization [5]. These methods reduce the probability of synchronization failure to This work was supported in part by a gift from Intel Corporation and by a National Science Foundation CAEE Award MIP The authors are with ECE Dept., University of California, San Diego, 9500 ilman Drive, La Jolla, CA ; {kyy,adooply}@ucsd.edu. an acceptable level by resynchronizing communication signals with back-to-back latches. These methods are simple and inexpensive to implement, but a major drawback is the latency of communication. The methods in the second category [1], [2], [3], [] generally rely on stopping or stretching each synchronous module s local clock to guarantee that communication signals never violate setup and hold time constraints with respect to the local clock. Although these methods are robust and do not incur long communication latency, they involve designing a special clocking circuit, unfamiliar to most designers. The simplest example using this scheme one can conceive is a synchronous module communicating with an asynchronous peripheral. In this system, the synchronous module latches the handshake signals from the asynchronous module by stopping or stretching its own clock, when necessary. Both categories described above require some form of arbitration. However, if the phase relation between two synchronous modules can be predicted deterministically as in [6], then we can avoid synchronization failure, without arbitration, by timing communications when safe. In this paper, we describe a general method of communication amongst multiple synchronous and asynchronous modules operating independently, i.e., at different clock frequencies or phases, based on the pausible clocking scheme as shown in Fig. 1. hronous modules communicate with one another via asynchronous s 1 used as communication channels. The interfaces between the synchronous modules and the are pausible clocking control (PCC) circuits. In order to validate this scheme, we implemented a test chip in 0.5µm CMOS through MOSIS. This chip is designed as a ring, composed of two synchronous modules, an asynchronous module, and two asynchronous s. Each module functions as a receiver to one module and a sender to another module. Each synchronous module has a frequency control with four settings, which is independently programmable externally. The chip functions reliably for all the settings, albeit substantially faster than expected by simulation. II. DESIN OF PAUSIBLE CLOCKIN CONTOL The pausible clocking control is a scheme to avoid synchronization failure by adjusting the local clock. A synchronization failure at the module interface occurs when the arrival times of an external signal transition and a sampling edge of the clock are indistinguishable by the sampling latch at the module boundary. In our scheme, the synchronization failure is circumvented by pausing or stretching the local module clock when necessary. 1 Self-timed s have been used to facilitate robust communication between synchronous modules operating at the same frequency elsewhere [7]. However, they have not been utilized in communication between synchronous modules operating independently at different clock frequencies.

2 IEEE TANSACTIONS ON VLSI SYSTEMS 2 sender receiver h Module 1 PCC 1ρ A1ρ A1σ 1σ 2σ A2σ A2ρ 2ρ sender receiver PCC h Module 2 Fig. 1. Two synchronous modules communicating via asynchronous channels. ρ Aσ 1 Aρ FSM S ρ 2 σ FSM SA σ 2 Arbiter A. hronization Strategy ME rclk Fig. 2. PCC. ing Oscillator A block diagram of the PCC is shown in Fig. 2. This scheme uses a mutual exclusion element (ME) to force the temporal separation of the sampling edges of the clock and external signal transitions. A mutual exclusion element [2], [8] is a circuit that allows one request to pass through at a time on a first come first serve basis. When two inputs arrive simultaneously, it selects one to pass through arbitrarily. Because MEs require that requesters competing for shared resources must be persistent, the clock input to the ME must be stretched when it loses an arbitration. A ring oscillator is used instead of a crystal oscillator in order to be able to adjust the duration of off-phases of the clock. The local module clock,, is a buffered version of one of the outputs of the ME. It normally has a 50% duty cycle, except when the clock input loses an arbitration, in which case an off-phase of the clock is stretched. Consider a scenario in which a synchronous module equipped with the PCC is receiving inputs from the. The PCC needs ρ 1 1 S ρ A ρ rclk A σ 2 setup paused Fig. 3. PCC timing (bidirectional communication). an arbiter, as depicted in Fig. 2, to decide which of the two independent inputs (the request from the sending [ ρ ]andthe acknowledgment from the receiving [A σ ]) to be presented to the ME. For this scenario, we assume that ρ has won the arbitration. Thus, in this case, the arbiter simply forwards 1 to the ME and from the ME. As shown in the shaded area of Fig. 3, ρ is forwarded to the ME via the asynchronous finite state machine () and the arbiter. If rclk is low when rises, then the ME immediately raises, which prompts the to generate an event on S ρ.this event is effectively synchronized to, i.e., guaranteed not to induce a synchronization failure when sampled by the FSM, under a reasonable timing assumption as described below. Note that rclk may rise before the ME lowers, but the ME will not allow to rise until becomes low. On the other hand, if rclk is already high when rises, the assertion of is stalled until rclk is lowered. As soon as rclk falls, the ME raises and the generates an event on S ρ. clk may actually rise at about the same time rises. In such situations the situations in which temporal separation of + and rclk + becomes blurred the ME simply tosses a coin to determine which signal to service first. If rclk + wins the coin toss, then rises first and remains low until falls (which happensshortlyafter rclk falls). On the other hand, if + wins, then the ME raises first and blocks from rising. In orderto prevent from stalling indefinitely (until the next toggling of the request, ρ ), the lowers 1 immediately after 1 rises, which in turn causes to fall allowing to rise. The PCC does not differentiate rising edges of ρ from falling edges both edges enable 1 to be asserted and 1 to be asserted as a result. In fact, the effectively performs a two-phase to four-phase conversion from ρ to 1 and a four to two-phase conversion from 1 to S ρ. This conversion is independent of whether a two-phase or four-phase communication protocol is used between the and the synchronous module. It is merely done so that both edges of S ρ are synchronized to. In order for the synchronous FSM that generates A ρ to recognize the change in S ρ, we need to ensure that S ρ satisfies setup and hold time constraints with respect to. Inor- der to recognize S ρ + (S ρ ) 2 reliably, the path from 1 + to S ρ + (S ρ ) must be shorter than the path from 1 + to + via and by at least the data setup time for the FSM latches. This is easily satisfied in practice because 1 + to S ρ + (S ρ ) delay is a complex gate delay (transitions on S ρ are directly triggered by 1 +),whichismuchlessthanthe delay from 1 + to to to +. The asynchronous finite state machine (see Fig. ) is specified in burst-mode [9] and synthesized using the 3D-gC synthesis tool [10]. This burst-mode state machine has two inputs ( ρ, 1 ) and two outputs ( 1, S ρ ). In state 0, when ρ rises, the machine raises 1 and goes to state 1. In state 1, the machines waits for 1 to rise; when it does, the machine lowers 1 and raises S ρ concurrently and goes to state 2. When 1 falls in state 2, the machine transitions to state 3. The machine transi- 2 We use a+ and a to denote rising and falling transitions of a respectively.

3 IEEE TANSACTIONS ON VLSI SYSTEMS ρ S ρ + ρ 1 + ρ Sρ Sρ weak reset gme (a) Arbiter with fast forward request S ρ weak S ρ 1 gme T1 2 Fig.. PCC asynchronous finite state machine specification and implementation. tions through states and 5 and back to 0, as ρ triggers a sequence of signal transitions ending with 1. B. Arbitration In order to simplify the interface to the ME, our design uses an arbiter to select just one external signal to pass through at a time. An arbiter [8], [11], [12] is a circuit that propagates one request at a time (as does the ME) but also acknowledges the requesters with grant signals as well. The arbiter used in our design [13] shown in Fig. 5 is different from conventional ones. When 1 or 2 arrives at the arbiter, the arbiter forwards a request to the ME, without determining which one has arrived first. As shown in Fig. 3, 1 and 2 (enabled by ρ and A σ respectively) may arrive at the same time, causing gme (gated ME inside the arbiter) to go into a metastable state. When + arrives at the ME, rclk + may also arrive at about the same time as well. If the ME determines that + has arrived first (after resolving its internal metastability), it raises and blocks from rising (thus stretching it) until falls. Note that the metastability resolution in the gme would be wellunderway, ifnotalreadyfinished, bythetime is asserted. Thus the metastability resolution in the gme and the ME are done concurrently. When is asserted, 1 rises, which in turn resets and 1. enables, which in turn enables to rise. The pending request 2 enables to be asserted again after 1 resets. However, it is blocked until rclk falls. Note that our arbiter design is not speed-independent, i.e., it requires a timing constraint on its environment for correct functionality. Consider an example depicted in Fig. 5(c), in which 2 + starts a chain of events, starting with +. After is asserted, 2 is raised. This event enables both and 2.If 2 side is slow to respond, i.e., 2 does not fall till is negated, then this residual 2 may be viewed as a new request to the gme. In our design, the ASFM (see Fig. ) lowers 2 immediately, once 2 rises, in order to prevent this problem. More precisely, the following timing constraint is always satisfied for both i =1and 2: t + i + t + + t + i >t +. (1) i i 1 T1 2 T (b) ated ME T2 (c) Arbiter timing Fig. 5. (a) Fast-forward arbiter; (b) ated ME (ME with enabled outputs); (c) Fast-forward arbiter timing. III. SYSTEM CONFIUATIONS AND LIMITATIONS Using the pausible clocking scheme, it is conceivable that one can construct a heterogeneous multi-processor system with point-to-point links between every pair of nodes. Each link is a bidirectional as shown in Fig. 1. However, as fanouts from and fanins to each node increase, the arbiter block becomes larger making the system design impractical. However, we assert that it is possible to construct a ring configuration as shown in Fig. 6(a) similar to the systems proposed in Scalable Coherent Interface (SCI) specification [1]. In this structure, messages are always transmitted to one side and received from the other side, so that only one level of arbitration is required in the PCC. A major advantage of our ring configuration over other proposed systems, such as SCI system, is that it is a truly heterogeneous system with each node operating at its own speed. Another possible system configuration would be a conventional bus based architecture depicted in Fig. 6(b). In this config-

4 IEEE TANSACTIONS ON VLSI SYSTEMS ρ Aρ FSM (1) 1 ME ing Oscillator Async (a) Heterogeneous ring configuration (3) S ρ Aσ σ FSM SA σ 2 2 (2) Arbiter rclk up Fig. 7. Performance constraints. Async Bus Arbiter Async (b) Bus configuration Asynchronous bus Fig. 6. (a) A heterogeneous message-passing multiprocessor using PCC s; (b) A conventional bus based system. uration, synchronous modules are connected to an asynchronous bus, e.g., Futurebus+, via s and asynchronous modules are connected directly to the bus. Since the synchronous modules in this configuration interface to the bus only via two s, the PCC s in the synchronous modules require just one level of arbitration. Systems-on-a-chip should be designed with as many reusable components as possible. Standard modules, such as CPU cores, should be reused with little or no modification, because these modules are highly optimized for performance and sensitive to timing variation. For the systems proposed in this paper, ideally, the pausible clocking control circuit should simply replace a portion of the system clock generation unit. However, for state-of-the-art microprocessors, the system clock is produced by a phase locked loop (PLL). We cannot adjust the phase of the output of a PLL instantaneously in an analog fashion, as required in our pausible clocking control. Thus a ring oscillator should be used in place of a PLL. We then lose control of the nominal frequency. Tuning the ring oscillator frequency would require more control pins and hence is more expensive. Furthermore, the ring oscillator based clocks may be more susceptible to jitter problems. However, the ring oscillator does have an advantage that its frequency drift closely tracks logic components on the chip, e.g., if logic components slow down due to an increase in operating temperature, then so does the ring oscillator. IV. PEFOMANCE CONSIDEATIONS In this section, we examine performance limitations of our PCC scheme. The maximum clock frequency of the PCC is constrained by the FSM input setup and hold time requirements as follows (see Fig. 7). Assume that is high (enabled by ρ )whenrclk falls. This activates + and 1 + in sequence, which in turn enables S ρ to toggle. We need to in- sure that S ρ satisfies setup and hold time requirements with respect to +. Assuming that the delays through paths (1) and (2) are t 1 = t rclk + t t 1 + and 1 S+ ρ t 2 = t rclk + + respectively, the following inequality must hold true: T 2 + t h <t 1 t 2 < T 2 t su (2) where t su and t h are the required setup and hold times of S ρ with respect to + and T is the nominal period of the ring oscillator clock with a 50% duty cycle. Therefore, the minimum clock period achievable is: T min =2max(t 1 t 2 + t su,t 2 t 1 + t h ). (3) In general, t 2 >> t 1 for a large clock tree; hence T min =2(t 2 t 1 + t h ). In other words, the minimum clock period is about twice the clock tree delay. Assuming that s + h represents the metastability window of the ME (analogous to the setup and hold time window of a flip-flop) and represents the delay from the time is granted an access to the ME to the time it relinquishes it, we can define window of vulnerability to be + h (if > s ). In other words, may pause (due to A + σ enabled by + σ )ifthe following condition is true: nt < t + + σ + t + σ A + σ + t A + σ + 2 +t rclk t < nt + h for n 1, () where =t t t 1 +. A similar inequality exists for ρ (the same as Eq., but with σ replaced by A ρ, 1 A σ by ρ,and 2 by 1 ). Clearly, the frequency and duration of clock stretching depends on the size of this window ( + h ) the smaller this window, the higher the performance of the system. V. EXPEIMENTAL ESULTS We constructed a heterogeneous ring on a chip as depicted in Fig. 8. The chip consists of two synchronous modules nicknamed CPU (CPU) and Co-processor (Coproc), an asynchronous module (Async), and two asynchronous s (1 and 2). 3 Every module has two neighbors. Messages are always transmitted to one side and received from the 3 Our implementation is a fully-decoupled described in [15].

5 IEEE TANSACTIONS ON VLSI SYSTEMS 5 in A in D in ρ A CPU PCC S ρ SA σ 1 2 Coproc FSM A 1 A 2 FF (a) A σ 2 A A3 1 Async A1 1 Fig. 8. Test chip configuration. A out out D out 2 A 1 =00 2 A 1 =01 2 Coproc A2 2 1 =0 1 A 2 =01 A 2 =0 1 A 2 =11 A 2 =1 1 A 2 =10 A 2 =0 (b) 1 =0 2 A 1 =11 2 A 1 =01 1 A 2 =11 1 =1 Fig. 9. (a) Co-processor implementation; (b) Co-processor FSM specification. Coproc CLK(162MHz) 2 A2 Coproc Dout[1] Coproc Dout[0] 3 A3 Async Dout[1] Async Dout[0] A 120ns 10ns 160ns 180ns 200ns 220ns 20ns 260ns Fig. 10. Simulation trace (f Coproc = 162 MHz, f CPU = 327 MHz). Fig. 11. Die photo. other side. All communications are done using a four-phase return-to-zero protocol. CPU and Coproc are equipped with a PCC whose ring oscillator frequency can be set to one of different frequencies ranging from 162MHz to 327MHz. CPU generates data and sends them to Coproc via 1. It has an FSM that generates 2-bit data in ray code sequence. Coproc generates -bit output by transforming the 2-bit input from CPU and passes them to Async. The Coproc FSM, depicted in Fig. 9(b), handles the flow control of data in a semidecoupled manner: i.e., it asserts an acknowledgment and a request to its input and output s, upon receiving an input request and latching input data, but does not complete handshaking on the input side until its output acknowledges the receipt of data. Async is composed of an and a data processing unit. This controls the flow of data in a similar fashion as the FSM of Coproc but asynchronously. The data processing unit extracts two copies of the original 2-bit data and passes them back to CPU via 2. 1 and 2 (both with depth of two) are included for a generic reason of smoothening the bursty data transfer between two modules operating at different clock rates and decoupling the operations of the modules. We performed extensive SPICE simulations of the chip after backannotating the layout parasitics into the schematic, using Mentor raphics Accusim. The timing trace in Fig. 10 shows a simulation result including handshake and data signals. CPU and Coproc are programmed to run at 327MHz and 162MHz respectively in this case. The frequencies of 2 and A 2 (handshake signals between Coproc and Async) are one fourth that of Coproc clock, because it takes one clock cycle to sample A 2 and another clock cycle to toggle 2. 2 and A 2 have about 50% duty cycle because they are synchronized to Coproc clock; however, 3 and A 3 have much narrower pulses because 2 (output side of Async) is never full and thus responds quickly. The chip (die photo shown in Fig. 11) was fabricated in 0.5µm HP CMOS1TB triple-metal process through MOSIS. Our test results show that the fabricated chips function correctly, but test results are found to be much faster than the simulated results (with a scale factor of 1.). We believe that this is due, in part, to the conservative estimation of parasitics by the Mentor tool. Test results clearly indicate that the clocks do become stretched. Although we were not able to observe high frequency clock signals (due to the pad design limitations), we could clearly observe several instances of clock pausing when the CPU module was programmed to operate at 162MHz (simulation speed). An oscilloscope trace is shown in Fig. 12. Signals labeled CLK and EQ are and ρ of the CPU module observed at the pins of the fabricated chip. CLK is paused if EQ toggles within the window of vulnerability of CLK+,whichwas calculated to be between 3.3ns and 2.3ns from the rising edge of CLK (with a scale factor of 1. taken into account). However, the time difference between CLK and EQ observed on the oscilloscope as shown in Fig. 12 is outside the calculated window. In fact, the instance captured in Fig. 12 shows that CLK pauses even when EQ rises 3.85ns before CLK rises. The source of this discrepancy is unclear. We conjecture that not all delays scale the same. Test measurements were made with HP 5720A Oscilloscope with two Sas plug-in cards. Table I compares the nominal frequencies (without clock We used pads readily available from MOSIS which could not support the high speed clock signals.

6 IEEE TANSACTIONS ON VLSI SYSTEMS 6 Test Simulation No. of Fraction of Operating Nominal Operating Handshake Input Clock Cycle Stretched (without pause) (with pause) Transitions Per HS Transition CPU Coproc CPU Coproc CPU Coproc Per 1000ns CPU Coproc (56MHz) (56MHz) 327MHz 327MHz 290MHz 290MHz (56MHz) 236MHz 327MHz 162MHz 290MHz 150MHz MHz (56MHz) 162MHz 327MHz 150MHz 290MHz MHz 236MHz 162MHz 162MHz 150MHz 150MHz TABLE I COMPAISON OF NOMINAL AND OPEATIN FEQUENCIES (, 27 C, N68MA UN). Fig. 12. Oscilloscope trace illustrating clock stretching ( frequency = 236 MHz tested at and 22 C). pausing) and the operating frequencies obtained from simulation and test. The number shown in parentheses (56MHz) was obtained by extrapolation, not from the actual test. This is due to the limitation imposed by IO pads. However, all the data and handshake signals, measurable because of their low frequencies, have been tested to be correct, even when the clock is set to the maximum frequency. Thus we concluded that the clock internal to the chip does not result in any synchronization failures. We extrapolated the top clock speed from the simulation and lowerspeed clock signals measured from the test. The impact of clock pausing on performance can be deduced from Table I. For example, when the nominal frequencies of CPU and Coproc are programmed to be 327MHz and 162MHz, the actual operating frequencies are 290MHz and 150MHz respectively (according to simulation). Therefore, CPU clock is stretched 37 (nominal) clock cycles per 1000ns and Coproc clock 12 cycles per 1000ns. However, clock stretching potentially occurs only when handshake inputs toggle. We measured 152 handshake input transitions per 1000ns for this case. Thus, in this case, the fraction of clock cycle stretched per handshake input transition for CPU is =0.2. As the frequency is lowered, the number of clock cycles stretched becomes smaller. This is because the window of vulnerability as a fraction of a clock period becomes smaller. Clearly, we are trading off system performance for communication reliability and latency. In fact, the degradation in CPU throughput is about 12% in this configuration. However, we assert that we can gain back this loss and possibly more by operating the system at a higher speed, i.e., by tuning the ring oscillator to a higher frequency. Systems with a ring oscillator clock do not require as much margin for the worst-case design as the ones with a fixed reference clock, because the operating condition changes affect both the logic and ring oscillators similarly. VI. CONCLUSION We presented a new communication strategy, which is based on the pausible clocking scheme, for multiple synchronous modules operating independently. We described the design and implementation of the pausible clock unit in detail. In order to prove its feasibility, we implemented a 0.5µm CMOS chip, mimicking a heterogeneous ring. Our test results show that the chip functions correctly as simulated, up to the top clock speed of 56MHz. We suggested possible system configurations (heterogeneous rings and bus based systems) and practical limitations. We also determined analytical performance bounds and analyzed the effects of clock stretching on overall system performance. Acknowledgment The authors would like to thank yan Donohue, eza Jalilizeinali, and LiJen Ko for their chip design effort. EFEENCES [1] M. J. Stucki and J.. Cox Jr., hronization strategies, in Proceedings of the First Caltech Conference on Very Large Scale Integration, C.L. Seitz, Ed., 1979, pp [2] C. L. Seitz, System timing, in Introduction to VLSI Systems, C. A. Mead and L. A. Conway, Eds., chapter 7. Addison-Wesley, [3] D.M.Chapiro, lobally-asynchronous Locally-hronous Systems, Ph.D. thesis, Stanford University, Oct [] F. U. osenberger, C. E. Molnar, T. J. Chaney, and T. Fang, Q-modules: Internally clocked delay-insensitive modules, IEEE Transactions on Computers, vol. C-37, no. 9, pp , Sept [5] J. N. Seizovic, Pipeline synchronization, in Proc. International Symposium on Advanced esearch in Asynchronous Circuits and Systems, Nov. 199, pp [6] L. F.. Sarmenta,. A. Pratt, and S. A. Ward, ational clocking, in Proc. International Conf. Computer Design (ICCD), Oct. 1995, pp [7] M.. reenstreet, Implementing a STAI chip, in Proc. International Conf. Computer Design (ICCD), 1995, pp [8] C. L. Seitz, Ideas about arbiters, Lambda, vol. 1, no. 1, First Quarter, pp. 10 1, [9] Steven M. Nowick, Automatic Synthesis of Burst-Mode Asynchronous Controllers, Ph.D. thesis, Stanford University, Department of Computer Science, [10] K. Y. Yun and D. L. Dill, Automatic synthesis of extended burst-mode circuits: part II (automatic synthesis), IEEE Transactions on Computer- Aided Design, vol. 18, no. 2, pp , Feb [11] A. J. Martin, On Seitz s arbiter, Tech. ep. 5212:T:86, Caltech Computer Science, [12] T. Sakurai, Optimization of CMOS arbiter and synchronizer circuits with submicron MOSFETs, IEEE Journal of Solid-State Circuits, vol. 23, no., pp , Aug [13] M. B. Josephs and J. T. Yantchev, CMOS design of the tree arbiter element, IEEE Transactions on VLSI Systems, vol., no., pp , Dec [1] IEEE Standard , Scalable coherent interface (SCI), [15] S. B. Furber and P. Day, Four-phase micropipeline latch control circuits, IEEE Transactions on VLSI Systems, vol., no. 2, pp , June 1996.

Pausible Clocking: A First Step Toward Heterogeneous Systems æ

Pausible Clocking: A First Step Toward Heterogeneous Systems æ Pausible Clocking: First Step Toward Heterogeneous Systems æ Kenneth Y. Yun yan P. Donohue Department of Electrical and Computer Engineering University of California, San Diego 9500 Gilman Drive La Jolla,

More information

Timing Methodologies (cont d) Registers. Typical timing specifications. Synchronous System Model. Short Paths. System Clock Frequency

Timing Methodologies (cont d) Registers. Typical timing specifications. Synchronous System Model. Short Paths. System Clock Frequency Registers Timing Methodologies (cont d) Sample data using clock Hold data between clock cycles Computation (and delay) occurs between registers efinition of terms setup time: minimum time before the clocking

More information

Demystifying Data-Driven and Pausible Clocking Schemes

Demystifying Data-Driven and Pausible Clocking Schemes Demystifying Data-Driven and Pausible Clocking Schemes Robert Mullins Computer Architecture Group Computer Laboratory, University of Cambridge ASYNC 2007, 13 th IEEE International Symposium on Asynchronous

More information

Clocking. Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 Clocks 1

Clocking. Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 Clocks 1 ing Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 s 1 Why s and Storage Elements? Inputs Combinational Logic Outputs Want to reuse combinational logic from cycle to cycle 6.884 - Spring 2005 2/18/05

More information

Asynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow

Asynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow Asynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow Bradley R. Quinton Dept. of Electrical and Computer Engineering University of British Columbia bradq@ece.ubc.ca

More information

A Pausible Bisynchronous FIFO for GALS Systems

A Pausible Bisynchronous FIFO for GALS Systems 2015 Symposium on Asynchronous Circuits and Systems A Pausible Bisynchronous FIFO for GALS Systems Ben Keller*, Ma@hew FojCk*, Brucek Khailany* *NVIDIA CorporaCon University of California, Berkeley May

More information

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology Topics of Chapter 5 Sequential Machines Memory elements Memory elements. Basics of sequential machines. Clocking issues. Two-phase clocking. Testing of combinational (Chapter 4) and sequential (Chapter

More information

路 論 Chapter 15 System-Level Physical Design

路 論 Chapter 15 System-Level Physical Design Introduction to VLSI Circuits and Systems 路 論 Chapter 15 System-Level Physical Design Dept. of Electronic Engineering National Chin-Yi University of Technology Fall 2007 Outline Clocked Flip-flops CMOS

More information

Latch Timing Parameters. Flip-flop Timing Parameters. Typical Clock System. Clocking Overhead

Latch Timing Parameters. Flip-flop Timing Parameters. Typical Clock System. Clocking Overhead Clock - key to synchronous systems Topic 7 Clocking Strategies in VLSI Systems Peter Cheung Department of Electrical & Electronic Engineering Imperial College London Clocks help the design of FSM where

More information

BURST-MODE communication relies on very fast acquisition

BURST-MODE communication relies on very fast acquisition IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 8, AUGUST 2005 437 Instantaneous Clockless Data Recovery and Demultiplexing Behnam Analui and Ali Hajimiri Abstract An alternative

More information

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Ms Lavanya Thunuguntla 1, Saritha Sapa 2 1 Associate Professor, Department of ECE, HITAM, Telangana

More information

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems Harris Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems David Harris Harvey Mudd College David_Harris@hmc.edu Based on EE271 developed by Mark Horowitz, Stanford University MAH

More information

Power Reduction Techniques in the SoC Clock Network. Clock Power

Power Reduction Techniques in the SoC Clock Network. Clock Power Power Reduction Techniques in the SoC Network Low Power Design for SoCs ASIC Tutorial SoC.1 Power Why clock power is important/large» Generally the signal with the highest frequency» Typically drives a

More information

Lecture 11: Sequential Circuit Design

Lecture 11: Sequential Circuit Design Lecture 11: Sequential Circuit esign Outline Sequencing Sequencing Element esign Max and Min-elay Clock Skew Time Borrowing Two-Phase Clocking 2 Sequencing Combinational logic output depends on current

More information

PROGETTO DI SISTEMI ELETTRONICI DIGITALI. Digital Systems Design. Digital Circuits Advanced Topics

PROGETTO DI SISTEMI ELETTRONICI DIGITALI. Digital Systems Design. Digital Circuits Advanced Topics PROGETTO DI SISTEMI ELETTRONICI DIGITALI Digital Systems Design Digital Circuits Advanced Topics 1 Sequential circuit and metastability 2 Sequential circuit - FSM A Sequential circuit contains: Storage

More information

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng Architectural Level Power Consumption of Network Presenter: YUAN Zheng Why Architectural Low Power Design? High-speed and large volume communication among different parts on a chip Problem: Power consumption

More information

Low latency synchronization through speculation

Low latency synchronization through speculation Low latency synchronization through speculation D.J.Kinniment, and A.V.Yakovlev School of Electrical and Electronic and Computer Engineering, University of Newcastle, NE1 7RU, UK {David.Kinniment,Alex.Yakovlev}@ncl.ac.uk

More information

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin BUS ARCHITECTURES Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin Keywords: Bus standards, PCI bus, ISA bus, Bus protocols, Serial Buses, USB, IEEE 1394

More information

Lecture 7: Clocking of VLSI Systems

Lecture 7: Clocking of VLSI Systems Lecture 7: Clocking of VLSI Systems MAH, AEN EE271 Lecture 7 1 Overview Reading Wolf 5.3 Two-Phase Clocking (good description) W&E 5.5.1, 5.5.2, 5.5.3, 5.5.4, 5.5.9, 5.5.10 - Clocking Note: The analysis

More information

Clock Distribution in RNS-based VLSI Systems

Clock Distribution in RNS-based VLSI Systems Clock Distribution in RNS-based VLSI Systems DANIEL GONZÁLEZ 1, ANTONIO GARCÍA 1, GRAHAM A. JULLIEN 2, JAVIER RAMÍREZ 1, LUIS PARRILLA 1 AND ANTONIO LLORIS 1 1 Dpto. Electrónica y Tecnología de Computadores

More information

PowerPC Microprocessor Clock Modes

PowerPC Microprocessor Clock Modes nc. Freescale Semiconductor AN1269 (Freescale Order Number) 1/96 Application Note PowerPC Microprocessor Clock Modes The PowerPC microprocessors offer customers numerous clocking options. An internal phase-lock

More information

Lecture 10: Sequential Circuits

Lecture 10: Sequential Circuits Introduction to CMOS VLSI esign Lecture 10: Sequential Circuits avid Harris Harvey Mudd College Spring 2004 Outline q Sequencing q Sequencing Element esign q Max and Min-elay q Clock Skew q Time Borrowing

More information

S. Venkatesh, Mrs. T. Gowri, Department of ECE, GIT, GITAM University, Vishakhapatnam, India

S. Venkatesh, Mrs. T. Gowri, Department of ECE, GIT, GITAM University, Vishakhapatnam, India Power reduction on clock-tree using Energy recovery and clock gating technique S. Venkatesh, Mrs. T. Gowri, Department of ECE, GIT, GITAM University, Vishakhapatnam, India Abstract Power consumption of

More information

DS2187 Receive Line Interface

DS2187 Receive Line Interface Receive Line Interface www.dalsemi.com FEATURES Line interface for T1 (1.544 MHz) and CEPT (2.048 MHz) primary rate networks Extracts clock and data from twisted pair or coax Meets requirements of PUB

More information

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 16 Timing and Clock Issues

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 16 Timing and Clock Issues EE 459/500 HDL Based Digital Design with Programmable Logic Lecture 16 Timing and Clock Issues 1 Overview Sequential system timing requirements Impact of clock skew on timing Impact of clock jitter on

More information

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy Hardware Implementation of Improved Adaptive NoC Rer with Flit Flow History based Load Balancing Selection Strategy Parag Parandkar 1, Sumant Katiyal 2, Geetesh Kwatra 3 1,3 Research Scholar, School of

More information

Switched Interconnect for System-on-a-Chip Designs

Switched Interconnect for System-on-a-Chip Designs witched Interconnect for ystem-on-a-chip Designs Abstract Daniel iklund and Dake Liu Dept. of Physics and Measurement Technology Linköping University -581 83 Linköping {danwi,dake}@ifm.liu.se ith the increased

More information

A New Paradigm for Synchronous State Machine Design in Verilog

A New Paradigm for Synchronous State Machine Design in Verilog A New Paradigm for Synchronous State Machine Design in Verilog Randy Nuss Copyright 1999 Idea Consulting Introduction Synchronous State Machines are one of the most common building blocks in modern digital

More information

ISSCC 2003 / SESSION 13 / 40Gb/s COMMUNICATION ICS / PAPER 13.7

ISSCC 2003 / SESSION 13 / 40Gb/s COMMUNICATION ICS / PAPER 13.7 ISSCC 2003 / SESSION 13 / 40Gb/s COMMUNICATION ICS / PAPER 13.7 13.7 A 40Gb/s Clock and Data Recovery Circuit in 0.18µm CMOS Technology Jri Lee, Behzad Razavi University of California, Los Angeles, CA

More information

8 Gbps CMOS interface for parallel fiber-optic interconnects

8 Gbps CMOS interface for parallel fiber-optic interconnects 8 Gbps CMOS interface for parallel fiberoptic interconnects Barton Sano, Bindu Madhavan and A. F. J. Levi Department of Electrical Engineering University of Southern California Los Angeles, California

More information

Sequential Logic Design Principles.Latches and Flip-Flops

Sequential Logic Design Principles.Latches and Flip-Flops Sequential Logic Design Principles.Latches and Flip-Flops Doru Todinca Department of Computers Politehnica University of Timisoara Outline Introduction Bistable Elements Latches and Flip-Flops S-R Latch

More information

An On-chip Security Monitoring Solution For System Clock For Low Cost Devices

An On-chip Security Monitoring Solution For System Clock For Low Cost Devices An On-chip Security Monitoring Solution For System Clock For Low Cost Devices Frank Vater Innovations for High Performance Microelectronics Im Technologiepark 25 15236 Frankfurt (Oder), Germany vater@ihpmicroelectronics.com

More information

EE552. Advanced Logic Design and Switching Theory. Metastability. Ashirwad Bahukhandi. (Ashirwad Bahukhandi) bahukhan@usc.edu

EE552. Advanced Logic Design and Switching Theory. Metastability. Ashirwad Bahukhandi. (Ashirwad Bahukhandi) bahukhan@usc.edu EE552 Advanced Logic Design and Switching Theory Metastability by Ashirwad Bahukhandi (Ashirwad Bahukhandi) bahukhan@usc.edu This is an overview of what metastability is, ways of interpreting it, the issues

More information

Sequential Logic: Clocks, Registers, etc.

Sequential Logic: Clocks, Registers, etc. ENEE 245: igital Circuits & Systems Lab Lab 2 : Clocks, Registers, etc. ENEE 245: igital Circuits and Systems Laboratory Lab 2 Objectives The objectives of this laboratory are the following: To design

More information

A Dynamic Link Allocation Router

A Dynamic Link Allocation Router A Dynamic Link Allocation Router Wei Song and Doug Edwards School of Computer Science, the University of Manchester Oxford Road, Manchester M13 9PL, UK {songw, doug}@cs.man.ac.uk Abstract The connection

More information

Flip-Flops, Registers, Counters, and a Simple Processor

Flip-Flops, Registers, Counters, and a Simple Processor June 8, 22 5:56 vra235_ch7 Sheet number Page number 349 black chapter 7 Flip-Flops, Registers, Counters, and a Simple Processor 7. Ng f3, h7 h6 349 June 8, 22 5:56 vra235_ch7 Sheet number 2 Page number

More information

Nexus: An Asynchronous Crossbar Interconnect for Synchronous System-on-Chip Designs

Nexus: An Asynchronous Crossbar Interconnect for Synchronous System-on-Chip Designs Nexus: An Asynchronous Crossbar Interconnect for Synchronous System-on-Chip Designs Andrew Lines Fulcrum Microsystems 26775 Malibu Hills Road, Calabasas, CA 9131 lines@fulcrummicrocom Abstract Asynchronous

More information

TRUE SINGLE PHASE CLOCKING BASED FLIP-FLOP DESIGN

TRUE SINGLE PHASE CLOCKING BASED FLIP-FLOP DESIGN TRUE SINGLE PHASE CLOCKING BASED FLIP-FLOP DESIGN USING DIFFERENT FOUNDRIES Priyanka Sharma 1 and Rajesh Mehra 2 1 ME student, Department of E.C.E, NITTTR, Chandigarh, India 2 Associate Professor, Department

More information

White Paper Understanding Metastability in FPGAs

White Paper Understanding Metastability in FPGAs White Paper Understanding Metastability in FPGAs This white paper describes metastability in FPGAs, why it happens, and how it can cause design failures. It explains how metastability MTBF is calculated,

More information

ETEC 2301 Programmable Logic Devices. Chapter 10 Counters. Shawnee State University Department of Industrial and Engineering Technologies

ETEC 2301 Programmable Logic Devices. Chapter 10 Counters. Shawnee State University Department of Industrial and Engineering Technologies ETEC 2301 Programmable Logic Devices Chapter 10 Counters Shawnee State University Department of Industrial and Engineering Technologies Copyright 2007 by Janna B. Gallaher Asynchronous Counter Operation

More information

Sequential Circuits. Combinational Circuits Outputs depend on the current inputs

Sequential Circuits. Combinational Circuits Outputs depend on the current inputs Principles of VLSI esign Sequential Circuits Sequential Circuits Combinational Circuits Outputs depend on the current inputs Sequential Circuits Outputs depend on current and previous inputs Requires separating

More information

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines Reconfigurable Architecture Requirements for Co-Designed Virtual Machines Kenneth B. Kent University of New Brunswick Faculty of Computer Science Fredericton, New Brunswick, Canada ken@unb.ca Micaela Serra

More information

Design and analysis of flip flops for low power clocking system

Design and analysis of flip flops for low power clocking system Design and analysis of flip flops for low power clocking system Gabariyala sabadini.c PG Scholar, VLSI design, Department of ECE,PSNA college of Engg and Tech, Dindigul,India. Jeya priyanka.p PG Scholar,

More information

7. Latches and Flip-Flops

7. Latches and Flip-Flops Chapter 7 Latches and Flip-Flops Page 1 of 18 7. Latches and Flip-Flops Latches and flip-flops are the basic elements for storing information. One latch or flip-flop can store one bit of information. The

More information

Alpha CPU and Clock Design Evolution

Alpha CPU and Clock Design Evolution Alpha CPU and Clock Design Evolution This lecture uses two papers that discuss the evolution of the Alpha CPU and clocking strategy over three CPU generations Gronowski, Paul E., et.al., High Performance

More information

Signal integrity in deep-sub-micron integrated circuits

Signal integrity in deep-sub-micron integrated circuits Signal integrity in deep-sub-micron integrated circuits Alessandro Bogliolo abogliolo@ing.unife.it Outline Introduction General signaling scheme Noise sources and effects in DSM ICs Supply noise Synchronization

More information

Serial port interface for microcontroller embedded into integrated power meter

Serial port interface for microcontroller embedded into integrated power meter Serial port interface for microcontroller embedded into integrated power meter Mr. Borisav Jovanović, Prof. dr. Predrag Petković, Prof. dr. Milunka Damnjanović, Faculty of Electronic Engineering Nis, Serbia

More information

Glitch Free Frequency Shifting Simplifies Timing Design in Consumer Applications

Glitch Free Frequency Shifting Simplifies Timing Design in Consumer Applications Glitch Free Frequency Shifting Simplifies Timing Design in Consumer Applications System designers face significant design challenges in developing solutions to meet increasingly stringent performance and

More information

Model-Based Synthesis of High- Speed Serial-Link Transmitter Designs

Model-Based Synthesis of High- Speed Serial-Link Transmitter Designs Model-Based Synthesis of High- Speed Serial-Link Transmitter Designs Ikchan Jang 1, Soyeon Joo 1, SoYoung Kim 1, Jintae Kim 2, 1 College of Information and Communication Engineering, Sungkyunkwan University,

More information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information Introduction Computer Network. Interconnected collection of autonomous computers that are able to exchange information No master/slave relationship between the computers in the network Data Communications.

More information

Multiple clock domains

Multiple clock domains DESIGNING A ROBUST USB SERIAL INTERFACE ENGINE(SIE) What is the SIE? A typical function USB hardware interface is shown in Fig. 1. USB Transceiver USB Serial Interface Engine Status Control Data Buffers/

More information

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw. Sequential Circuit

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw. Sequential Circuit Modeling Sequential Elements with Verilog Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw 4-1 Sequential Circuit Outputs are functions of inputs and present states of storage elements

More information

Measuring Metastability

Measuring Metastability Measuring Metastability Sandeep Mandarapu Department of Electrical and Computer Engineering, VLSI Design Research Laboratory, Southern Illinois University Edwardsville, Illinois, USA, 62025 ECE595: Masters

More information

System Level Power and Performance Modeling of GALS Point-to-point Communication Interfaces *

System Level Power and Performance Modeling of GALS Point-to-point Communication Interfaces * System Level Power and Performance Modeling of GALS Point-to-point Communication Interfaces * Koushik Niyogi Diana Marculescu Electrical and Computer Engineering Electrical and Computer Engineering Carnegie

More information

Learning Outcomes. Simple CPU Operation and Buses. Composition of a CPU. A simple CPU design

Learning Outcomes. Simple CPU Operation and Buses. Composition of a CPU. A simple CPU design Learning Outcomes Simple CPU Operation and Buses Dr Eddie Edwards eddie.edwards@imperial.ac.uk At the end of this lecture you will Understand how a CPU might be put together Be able to name the basic components

More information

NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter

NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter Description: The NTE2053 is a CMOS 8 bit successive approximation Analog to Digital converter in a 20 Lead DIP type package which uses a differential

More information

Abstract. Cycle Domain Simulator for Phase-Locked Loops

Abstract. Cycle Domain Simulator for Phase-Locked Loops Abstract Cycle Domain Simulator for Phase-Locked Loops Norman James December 1999 As computers become faster and more complex, clock synthesis becomes critical. Due to the relatively slower bus clocks

More information

Design Verification & Testing Design for Testability and Scan

Design Verification & Testing Design for Testability and Scan Overview esign for testability (FT) makes it possible to: Assure the detection of all faults in a circuit Reduce the cost and time associated with test development Reduce the execution time of performing

More information

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE CHAPTER 5 71 FINITE STATE MACHINE FOR LOOKUP ENGINE 5.1 INTRODUCTION Finite State Machines (FSMs) are important components of digital systems. Therefore, techniques for area efficiency and fast implementation

More information

AC 2011-1060: ELECTRICAL ENGINEERING STUDENT SENIOR CAP- STONE PROJECT: A MOSIS FAST FOURIER TRANSFORM PROCES- SOR CHIP-SET

AC 2011-1060: ELECTRICAL ENGINEERING STUDENT SENIOR CAP- STONE PROJECT: A MOSIS FAST FOURIER TRANSFORM PROCES- SOR CHIP-SET AC 2011-1060: ELECTRICAL ENGINEERING STUDENT SENIOR CAP- STONE PROJECT: A MOSIS FAST FOURIER TRANSFORM PROCES- SOR CHIP-SET Peter M Osterberg, University of Portland Dr. Peter Osterberg is an associate

More information

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad A. M. Niknejad University of California, Berkeley EE 100 / 42 Lecture 24 p. 1/20 EE 42/100 Lecture 24: Latches and Flip Flops ELECTRONICS Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad University of California,

More information

Computer Organization & Architecture Lecture #19

Computer Organization & Architecture Lecture #19 Computer Organization & Architecture Lecture #19 Input/Output The computer system s I/O architecture is its interface to the outside world. This architecture is designed to provide a systematic means of

More information

Optimal Technology Mapping and Cell Merger for Asynchronous Threshold Networks

Optimal Technology Mapping and Cell Merger for Asynchronous Threshold Networks Optimal Technology Mapping and Cell Merger for Asynchronous Threshold Networks Cheoljoo Jeong Steven M. Nowick Department of Computer Science Columbia University Outline Introduction Background Technology

More information

A 3.2Gb/s Clock and Data Recovery Circuit Without Reference Clock for a High-Speed Serial Data Link

A 3.2Gb/s Clock and Data Recovery Circuit Without Reference Clock for a High-Speed Serial Data Link A 3.2Gb/s Clock and Data Recovery Circuit Without Reference Clock for a High-Speed Serial Data Link Kang jik Kim, Ki sang Jeong, Seong ik Cho The Department of Electronics Engineering Chonbuk National

More information

MPC8245/MPC8241 Memory Clock Design Guidelines: Part 1

MPC8245/MPC8241 Memory Clock Design Guidelines: Part 1 Freescale Semiconductor AN2164 Rev. 4.1, 03/2007 MPC8245/MPC8241 Memory Clock Design Guidelines: Part 1 by Esther C. Alexander RISC Applications, CPD Freescale Semiconductor, Inc. Austin, TX This application

More information

Switch Fabric Implementation Using Shared Memory

Switch Fabric Implementation Using Shared Memory Order this document by /D Switch Fabric Implementation Using Shared Memory Prepared by: Lakshmi Mandyam and B. Kinney INTRODUCTION Whether it be for the World Wide Web or for an intra office network, today

More information

A Novel Low Power Fault Tolerant Full Adder for Deep Submicron Technology

A Novel Low Power Fault Tolerant Full Adder for Deep Submicron Technology International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Issue-1 E-ISSN: 2347-2693 A Novel Low Power Fault Tolerant Full Adder for Deep Submicron Technology Zahra

More information

The Future of Multi-Clock Systems

The Future of Multi-Clock Systems NEL FREQUENCY CONTROLS, INC. 357 Beloit Street P.O. Box 457 Burlington,WI 53105-0457 Phone:262/763-3591 FAX:262/763-2881 Web Site: www.nelfc.com Internet: sales@nelfc.com The Future of Multi-Clock Systems

More information

LatticeECP3 High-Speed I/O Interface

LatticeECP3 High-Speed I/O Interface April 2013 Introduction Technical Note TN1180 LatticeECP3 devices support high-speed I/O interfaces, including Double Data Rate (DDR) and Single Data Rate (SDR) interfaces, using the logic built into the

More information

WEEK 8.1 Registers and Counters. ECE124 Digital Circuits and Systems Page 1

WEEK 8.1 Registers and Counters. ECE124 Digital Circuits and Systems Page 1 WEEK 8.1 egisters and Counters ECE124 igital Circuits and Systems Page 1 Additional schematic FF symbols Active low set and reset signals. S Active high set and reset signals. S ECE124 igital Circuits

More information

Computer Systems Structure Input/Output

Computer Systems Structure Input/Output Computer Systems Structure Input/Output Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output Ward 1 Ward 2 Examples of I/O Devices

More information

A 10,000 Frames/s 0.18 µm CMOS Digital Pixel Sensor with Pixel-Level Memory

A 10,000 Frames/s 0.18 µm CMOS Digital Pixel Sensor with Pixel-Level Memory Presented at the 2001 International Solid State Circuits Conference February 5, 2001 A 10,000 Frames/s 0.1 µm CMOS Digital Pixel Sensor with Pixel-Level Memory Stuart Kleinfelder, SukHwan Lim, Xinqiao

More information

Managing High-Speed Clocks

Managing High-Speed Clocks Managing High-Speed s & Greg Steinke Director, Component Applications Managing High-Speed s Higher System Performance Requires Innovative ing Schemes What Are The Possibilities? High-Speed ing Schemes

More information

TI GPS PPS Timing Application Note

TI GPS PPS Timing Application Note Application Note Version 0.6 January 2012 1 Contents Table of Contents 1 INTRODUCTION... 3 2 1PPS CHARACTERISTICS... 3 3 TEST SETUP... 4 4 PPS TEST RESULTS... 6 Figures Figure 1 - Simplified GPS Receiver

More information

Lecture-3 MEMORY: Development of Memory:

Lecture-3 MEMORY: Development of Memory: Lecture-3 MEMORY: It is a storage device. It stores program data and the results. There are two kind of memories; semiconductor memories & magnetic memories. Semiconductor memories are faster, smaller,

More information

Synchronization of sampling in distributed signal processing systems

Synchronization of sampling in distributed signal processing systems Synchronization of sampling in distributed signal processing systems Károly Molnár, László Sujbert, Gábor Péceli Department of Measurement and Information Systems, Budapest University of Technology and

More information

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces White Paper Introduction The DDR3 SDRAM memory architectures support higher bandwidths with bus rates of 600 Mbps to 1.6 Gbps (300 to 800 MHz), 1.5V operation for lower power, and higher densities of 2

More information

2.1 CAN Bit Structure The Nominal Bit Rate of the network is uniform throughout the network and is given by:

2.1 CAN Bit Structure The Nominal Bit Rate of the network is uniform throughout the network and is given by: Order this document by /D CAN Bit Timing Requirements by Stuart Robb East Kilbride, Scotland. 1 Introduction 2 CAN Bit Timing Overview The Controller Area Network (CAN) is a serial, asynchronous, multi-master

More information

So far we have investigated combinational logic for which the output of the logic devices/circuits depends only on the present state of the inputs.

So far we have investigated combinational logic for which the output of the logic devices/circuits depends only on the present state of the inputs. equential Logic o far we have investigated combinational logic for which the output of the logic devices/circuits depends only on the present state of the inputs. In sequential logic the output of the

More information

Latches, the D Flip-Flop & Counter Design. ECE 152A Winter 2012

Latches, the D Flip-Flop & Counter Design. ECE 152A Winter 2012 Latches, the D Flip-Flop & Counter Design ECE 52A Winter 22 Reading Assignment Brown and Vranesic 7 Flip-Flops, Registers, Counters and a Simple Processor 7. Basic Latch 7.2 Gated SR Latch 7.2. Gated SR

More information

LOW POWER DESIGN OF DIGITAL SYSTEMS USING ENERGY RECOVERY CLOCKING AND CLOCK GATING

LOW POWER DESIGN OF DIGITAL SYSTEMS USING ENERGY RECOVERY CLOCKING AND CLOCK GATING LOW POWER DESIGN OF DIGITAL SYSTEMS USING ENERGY RECOVERY CLOCKING AND CLOCK GATING A thesis work submitted to the faculty of San Francisco State University In partial fulfillment of the requirements for

More information

Clock Recovery in Serial-Data Systems Ransom Stephens, Ph.D.

Clock Recovery in Serial-Data Systems Ransom Stephens, Ph.D. Clock Recovery in Serial-Data Systems Ransom Stephens, Ph.D. Abstract: The definition of a bit period, or unit interval, is much more complicated than it looks. If it were just the reciprocal of the data

More information

On-chip clock error characterization for clock distribution system

On-chip clock error characterization for clock distribution system On-chip clock error characterization for clock distribution system Chuan Shan, Dimitri Galayko, François Anceau Laboratoire d informatique de Paris 6 (LIP6) Université Pierre & Marie Curie (UPMC), Paris,

More information

DIGITAL COUNTERS. Q B Q A = 00 initially. Q B Q A = 01 after the first clock pulse.

DIGITAL COUNTERS. Q B Q A = 00 initially. Q B Q A = 01 after the first clock pulse. DIGITAL COUNTERS http://www.tutorialspoint.com/computer_logical_organization/digital_counters.htm Copyright tutorialspoint.com Counter is a sequential circuit. A digital circuit which is used for a counting

More information

ISSCC 2003 / SESSION 4 / CLOCK RECOVERY AND BACKPLANE TRANSCEIVERS / PAPER 4.7

ISSCC 2003 / SESSION 4 / CLOCK RECOVERY AND BACKPLANE TRANSCEIVERS / PAPER 4.7 ISSCC 2003 / SESSION 4 / CLOCK RECOVERY AND BACKPLANE TRANSCEIVERS / PAPER 4.7 4.7 A 2.7 Gb/s CDMA-Interconnect Transceiver Chip Set with Multi-Level Signal Data Recovery for Re-configurable VLSI Systems

More information

TIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING

TIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING TIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING BARIS TASKIN, JOHN WOOD, IVAN S. KOURTEV February 28, 2005 Research Objective Objective: Electronic design automation

More information

PROGETTO DI SISTEMI ELETTRONICI DIGITALI. Digital Systems Design. Digital Circuits Advanced Topics

PROGETTO DI SISTEMI ELETTRONICI DIGITALI. Digital Systems Design. Digital Circuits Advanced Topics PROGETTO DI SISTEMI ELETTRONICI DIGITALI Digital Systems Design Digital Circuits Advanced Topics 1 Sequential circuit and metastability 2 Sequential circuit A Sequential circuit contains: Storage elements:

More information

Master/Slave Flip Flops

Master/Slave Flip Flops Master/Slave Flip Flops Page 1 A Master/Slave Flip Flop ( Type) Gated latch(master) Gated latch (slave) 1 Gate Gate GATE Either: The master is loading (the master in on) or The slave is loading (the slave

More information

DDR subsystem: Enhancing System Reliability and Yield

DDR subsystem: Enhancing System Reliability and Yield DDR subsystem: Enhancing System Reliability and Yield Agenda Evolution of DDR SDRAM standards What is the variation problem? How DRAM standards tackle system variability What problems have been adequately

More information

Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology

Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology Nahid Rahman Department of electronics and communication FET-MITS (Deemed university), Lakshmangarh, India B. P. Singh Department

More information

Experiment # 9. Clock generator circuits & Counters. Eng. Waleed Y. Mousa

Experiment # 9. Clock generator circuits & Counters. Eng. Waleed Y. Mousa Experiment # 9 Clock generator circuits & Counters Eng. Waleed Y. Mousa 1. Objectives: 1. Understanding the principles and construction of Clock generator. 2. To be familiar with clock pulse generation

More information

An Ultra-low low energy asynchronous processor for Wireless Sensor Networks

An Ultra-low low energy asynchronous processor for Wireless Sensor Networks An Ultra-low low energy asynchronous processor for Wireless Sensor Networks.Necchi,.avagno, D.Pandini,.Vanzago Politecnico di Torino ST Microelectronics Wireless Sensor Networks - Ad-hoc wireless networks

More information

A PC-BASED TIME INTERVAL COUNTER WITH 200 PS RESOLUTION

A PC-BASED TIME INTERVAL COUNTER WITH 200 PS RESOLUTION 35'th Annual Precise Time and Time Interval (PTTI) Systems and Applications Meeting San Diego, December 2-4, 2003 A PC-BASED TIME INTERVAL COUNTER WITH 200 PS RESOLUTION Józef Kalisz and Ryszard Szplet

More information

Implementation of Full -Parallelism AES Encryption and Decryption

Implementation of Full -Parallelism AES Encryption and Decryption Implementation of Full -Parallelism AES Encryption and Decryption M.Anto Merline M.E-Commuication Systems, ECE Department K.Ramakrishnan College of Engineering-Samayapuram, Trichy. Abstract-Advanced Encryption

More information

Counters and Decoders

Counters and Decoders Physics 3330 Experiment #10 Fall 1999 Purpose Counters and Decoders In this experiment, you will design and construct a 4-bit ripple-through decade counter with a decimal read-out display. Such a counter

More information

A Tree Arbiter Cell for High Speed Resource Sharing in Asynchronous Environments

A Tree Arbiter Cell for High Speed Resource Sharing in Asynchronous Environments A Tree Arbiter ell for High Speed Resource Sharing in Asynchronous Environments Syed Rameez Naqvi and Andreas Steininger Department of omputer Engineering, Vienna University of Technology, Austria {rnaqvi,

More information

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001 Agenda Introduzione Il mercato Dal circuito integrato al System on a Chip (SoC) La progettazione di un SoC La tecnologia Una fabbrica di circuiti integrati 28 How to handle complexity G The engineering

More information

DS2155 T1/E1/J1 Single-Chip Transceiver

DS2155 T1/E1/J1 Single-Chip Transceiver www.maxim-ic.com ERRATA SHEET DS2155 T1/E1/J1 Single-Chip Transceiver REVISION A3 ERRATA The errata listed below describe situations where DS2155 revision A3 components perform differently than expected

More information

Set-Reset (SR) Latch

Set-Reset (SR) Latch et-eset () Latch Asynchronous Level sensitive cross-coupled Nor gates active high inputs (only one can be active) + + Function 0 0 0 1 0 1 eset 1 0 1 0 et 1 1 0-? 0-? Indeterminate cross-coupled Nand gates

More information

Chapter 13: Verification

Chapter 13: Verification Chapter 13: Verification Prof. Ming-Bo Lin Department of Electronic Engineering National Taiwan University of Science and Technology Digital System Designs and Practices Using Verilog HDL and FPGAs @ 2008-2010,

More information