Online Clock Routing in Xilinx FPGAs for High-Performance and Reliability

Xabier Iturbe, Khaled Benkrid, Raul Torrego, Ali Ebrahim and Tughrul Arslan
System Level Integration Group, The University of Edinburgh, Edinburgh EH9 3JL, Scotland, UK {x.iturbe, k.benkrid, a.ebrahim,
Embedded System-on-Chip Group, IKERLAN-IK4 Research Alliance, Mondragón 20500, Basque Country (Spain) {xiturbe,

Abstract
In this paper, we report the design and implementation of a reconfigurable system that exploits the regional clocking resources available in Xilinx Virtex-4 FPGAs for increased performance and, for the first time, enhanced reliability. Unlike previous approaches, our system is able to manage the regional clock buffers (BUFRs) individually, both to adjust the frequency delivered to each hardware task and to detect and recover from faults affecting the clock-tree on-the-fly. To this end, we propose global and regional clock multiplexers, named GCMUX and RCMUX respectively, which allow switching to spare clocking resources whenever needed. These multiplexers are based on the inner programmable interconnection points of the FPGA, leading to zero area overhead.

I. INTRODUCTION
State-of-the-art trends in Reconfigurable Computing (RC) envision substantial gains in performance, reliability and power efficiency over traditional systems by customizing the underlying architecture of the system at runtime to match the specific needs of a given application [1]. Different pieces of circuitry, each designed to implement one type of computation efficiently, are allocated on a dynamically reconfigurable FPGA, executed and finally replaced by other circuits, leading to a continuous stream of input operands, computation and output results. By analogy with the software field, these swappable pieces of circuitry are called hardware tasks. Current research efforts try to improve the performance of RC in each of the domains where computation occurs: space and time [2].
In the space domain, these efforts aim to increase the allocatability of the hardware tasks by reducing fragmentation in the chip. In the time domain, they concentrate on exploiting the parallelism delivered by the FPGA. This includes both process-level parallelism, or multitasking, where the objective is to run as many hardware tasks as possible simultaneously, and data-level parallelism, where the objective is to build efficient architectures able to exploit the high bandwidth offered by the tens of thousands of logic blocks and memories included in modern FPGAs. However, these approaches can also benefit greatly from advances in precisely the key factor for computing speed in traditional processors: the clock frequency. Indeed, the highest clock frequency at which a hardware task can run depends on the maximum delay between its sequential components, the so-called longest path, which usually depends on the complexity of the task itself. As a result, an RC application typically contains several hardware tasks that can run at different clock frequencies. Clocking the whole system at the slowest rate is the easy option that is often chosen, but it is not the most efficient. Hence, the open question is how to deal with frequency heterogeneity to improve the performance of an RC application.

Another issue of concern in RC is fault tolerance. Researchers have traditionally focused on scrubbing bit upsets [3] and reconfiguring around damaged resources [4], but little attention has been paid to common-source failures such as those affecting clock distribution. However, the clock-tree is a single point of failure and must be carefully hardened to increase the reliability of the system [5].

Modern families of Xilinx FPGAs include enhanced clocking capabilities that can help with the above-mentioned problems. These devices allow different portions of the device's reconfigurable area, the so-called clock regions, to be handled independently [7].
Each clock region includes its own clocking resources, so the clock-tree is not a single resource that must be managed as a whole. Instead, it is divided into several branches that feed the hardware tasks. However, most RC systems still do not exploit these capabilities offered by new FPGAs. This paper aims to exploit the multi-branched clock-tree to increase performance and reliability in an RC system.

The work reported in this paper is part of a larger effort in our group that aims to implement OS-like support for developing reconfigurable applications using Xilinx FPGAs, e.g. support for task scheduling, allocation, deallocation, inter-task communication and synchronization. Our OS is named Reliable Reconfigurable Real-Time Operating System (R3TOS) [6], emphasizing its three major features: reconfigurability, reliability and real-time performance.

The main contributions of this paper are twofold:
- The implementation of an RC system able to manage the regional clocking resources at runtime to make each hardware task run at its maximum frequency.
- A novel method to detect and recover at runtime from a fault affecting the clock-tree.

The remainder of the paper is organized as follows. Section II introduces the architecture of partially reconfigurable Xilinx FPGAs, with special emphasis on the clock-tree, and reviews related work. Section III describes the main contributions of this paper by means of a proof-of-concept implementation. Finally, Section IV draws conclusions.

II. DYNAMICALLY RECONFIGURABLE XILINX FPGAS
Xilinx FPGAs include an array of Configurable Logic Blocks (CLBs), Input-Output Blocks (IOBs), routing resources, a clock-tree and some special resources (e.g. Block-RAM memories, DSPs) [7].

A. The Clock-Tree
Fig. 1a shows the global clocking resources in a Virtex-4 fabric. Each of the 32 matched-skew global nets is driven by a global clock buffer (BUFGCTRL), which can select between two input clock sources. Usually these clock sources are Digital Clock Managers (DCMs), used to eliminate the clock distribution delay, adjust the delay of a clock relative to another clock, or adjust the frequency of a clock source. IOBs can also be connected directly to BUFGCTRLs. Of particular interest in this figure are the Programmable Interconnection Points (PIPs) that distribute the output signals of the BUFGCTRLs to the clock regions. These PIPs act as Global Clock MUltipleXers (GCMUXs), selecting up to 8 out of the 32 input global signals to feed the sequential components in the device.

Starting from Virtex-4, Xilinx FPGAs are divided into different clock regions to improve clock distribution (see Fig. 1b). The number of regions varies with device size, from 8 regions in the smallest device to 24 regions in the largest one. Each clock region includes two independent regional clock nets (Net1 and Net2) and two regional clock buffers (BUFR1 and BUFR2) that can divide the input clock rate by an integer, named BUFR DIVIDE, which can range from 1 to 8 (see Fig. 1c).
This input clock signal can come from an IOB, through an I/O clock buffer (BUFIO), or directly from a BUFGCTRL. Of particular interest are the PIPs that connect the output signals of the BUFRs to the regional clock nets. These PIPs act as a Regional Clock MUltipleXer (RCMUX), selecting up to 2 out of 6 different input clock signals: 2 coming directly from the BUFRs in the same clock region and the other 4 coming from the BUFRs in the adjacent clock regions above and below. Hence, each BUFR can drive up to three adjacent clock regions. In summary, the clock-tree of a Xilinx FPGA is hierarchically structured: Clock Source (DCM) -> BUFGCTRL -> GCMUX -> BUFR -> RCMUX (see Fig. 1d).

Fig. 1: Clocking in a Virtex-4 XC4VFX12 FPGA. (a) Simplified scheme of the global clocking-tree. (b) Regional clocking resources are located in the middle of each clock region while the global clocking resources are located in the middle of the die. (c) Simplified scheme of a regional clocking branch. (d) Clock-tree hierarchy.

B. Dynamic Partial Reconfiguration and Online Task Reallocation
The physical resources of a Xilinx FPGA are configured by means of a bitstream that is stored in a configuration memory. Dynamic Partial Reconfiguration (DPR) consists in changing the content of some positions of the configuration memory at runtime, thus changing the functionality implemented by a portion of the device. Access to the configuration memory is carried out through the so-called Internal Configuration Access Port (ICAP). In Xilinx FPGAs, the smallest amount of configuration information that can be accessed in the configuration memory is the configuration frame. Each configuration frame spans the whole height of a fabric clock region, defining the minimum set of resources that can be modified using DPR [8]. In Virtex-4 FPGAs each frame includes 1312 bits and is addressed by a 32-bit address that includes five fields: (a) block resource type, (b) top / bottom half, (c) clock region, (d) major column address, which identifies the column within the clock region, and (e) minor intra-column address, which identifies the specific frame within the column. Note that fields (a), (b), (c) and (d) are related to the type and location of the resources the frame configures. For instance, BUFRs and RCMUXes are mapped into IOB type frames in the leftmost and rightmost columns, while BUFGCTRLs and GCMUXes are mapped into dedicated clocking type frames in the middle column.

Due to the relation between the physical location of the resources in the FPGA die and the logical location of the frames in the configuration memory, a hardware task can be physically relocated to a different position by simply writing its associated partial bitstream to the appropriate configuration frames [9]. The only condition is that the target position must be identical to the original one in the type and arrangement of the resources as well as in the communication interfaces it contains. To circumvent the latter requirement, i.e. an identical communication interface in the source and target positions, we have recently proposed a technique that harnesses the ICAP for performing inter-task communication and synchronization, thus eliminating the need for fixed proxy logic [10].
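The frame addressing described above can be sketched in a few lines of Python. This is only an illustration: the text gives the five address fields and the 1312-bit frame length, but not the bit offsets of the fields or the bit ordering inside a frame. The layout in `FAR_LAYOUT` and the ordering used by `set_frame_bit` (position 0 is the least significant bit of word 0) are therefore assumptions that would have to be checked against the configuration user guide [8]. All names are hypothetical.

```python
from dataclasses import dataclass

FRAME_BITS = 1312                       # frame length stated in the text
WORD_BITS = 32
FRAME_WORDS = FRAME_BITS // WORD_BITS   # 41 words of 32 bits per frame

# ASSUMED field layout as (LSB offset, width); verify against UG071 before use.
FAR_LAYOUT = {
    "minor": (0, 6),        # (e) frame within the column
    "major": (6, 8),        # (d) column within the clock region row
    "row": (14, 5),         # (c) clock region
    "block_type": (19, 3),  # (a) block resource type
    "top_bottom": (22, 1),  # (b) top / bottom half
}


@dataclass(frozen=True)
class FrameAddress:
    block_type: int
    top_bottom: int
    row: int
    major: int
    minor: int

    def pack(self) -> int:
        """Pack the five fields into one 32-bit address word."""
        word = 0
        for name, (offset, width) in FAR_LAYOUT.items():
            value = getattr(self, name)
            if not 0 <= value < (1 << width):
                raise ValueError(f"{name}={value} does not fit in {width} bits")
            word |= value << offset
        return word

    @classmethod
    def unpack(cls, word: int) -> "FrameAddress":
        """Recover the five fields from a packed address word."""
        fields = {
            name: (word >> offset) & ((1 << width) - 1)
            for name, (offset, width) in FAR_LAYOUT.items()
        }
        return cls(**fields)


def set_frame_bit(frame_words, position, value):
    """Return a copy of a frame with one bit changed.

    ASSUMPTION: bit position 0 is the LSB of word 0.
    """
    if len(frame_words) != FRAME_WORDS:
        raise ValueError("a frame must hold exactly 41 words")
    if not 0 <= position < FRAME_BITS:
        raise ValueError("bit position outside the frame")
    word_index, bit_index = divmod(position, WORD_BITS)
    updated = list(frame_words)
    if value:
        updated[word_index] |= 1 << bit_index
    else:
        updated[word_index] &= ~(1 << bit_index) & 0xFFFFFFFF
    return updated
```

A runtime clock manager would typically read a frame back through the ICAP, change a few bits with a helper such as `set_frame_bit`, and write the frame to the same address, so that the rest of the frame content is preserved.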
To support this technique, we have developed a generic Communication Interface (CIF) to be attached to each hardware task in the system. The CIF includes a set of input/output data buffers, which can be accessed through the configuration layer regardless of the final placement of the tasks within the FPGA; i.e. the data to process are written to the input buffer and the results are retrieved from the output buffer. Consequently, the CIF makes hardware tasks closed, fully relocatable structures in which the clock is the only signal that crosses their boundaries. Note that when all of the tasks use the same clock signal, distributed through the global clock net, this scheme does not constrain their allocatability. The problem appears when different clock signals must feed different tasks. In this case the global clock net cannot be used. Since clock signals cannot be routed through conventional resources (e.g. CLBs) because of the high skew and timing instabilities this provokes, the only valid alternative is to use the regional clocking resources.

C. Related Work
Xilinx provides practical information on the use of BUFRs in a partially reconfigurable design in [11], and a successful implementation is reported in [12], [13]. Basically, these approaches include the BUFRs as a component of the hardware tasks. Despite being functional, this solution limits the efficiency of the system described in [13]:
- As the partial bitstream of a hardware task includes the configuration of the BUFRs, each clock region can host only tasks running at the same clock frequency; every time a new task is configured in a clock region, the configuration of the BUFR in that region is overwritten.
- As the tasks can be allocated only to positions where BUFRs are located, they cannot be horizontally shifted inside a clock region.
- As the hardware tasks are driven by a single BUFR, their height must be no greater than 3 clock regions.
- As the clock rate is switched in the BUFGCTRLs, only two different frequencies can be selected for the tasks; a BUFGCTRL has only two clock inputs.

III. OUR PROOF-OF-CONCEPT SYSTEM IMPLEMENTATION
In contrast to the approaches presented in Section II, we propose to keep the BUFRs separate from the hardware tasks. This better reflects what really occurs: the regional clocking resources are not dedicated to any specific hardware task, but shared among all of the tasks placed in the same clock region. By doing so, tasks can be clocked using multiple BUFRs located in different clock regions and thus their maximum height is not constrained. While all of the clock signals delivered to a task must be of the same frequency, as shown in Fig. 2, they can be routed through different regional clock nets in each clock region (e.g. Net1 or Net2). Moreover, not including the BUFRs in the architecture of the tasks enhances their (horizontal) allocatability. We also propose to adjust the configuration of the BUFRs at runtime by changing the BUFR DIVIDE parameter, which allows up to 8 different clock frequencies to be generated. Finally, we propose to route the clock signal on-the-fly in order to switch away from failed clocking resources.

Fig. 2: Clock feeding to a hardware task using the BUFRs

We note that the hardware tasks include a RAM-LUT that is remotely writable through the ICAP, the so-called Hardware Semaphore (HWS). The HWS acts as the internal reset for all of the sequential components in a task, making it possible to delay the computation until all of the clock signals are correctly set up for that task; i.e. the task is kept in the reset state while its clock signals are being configured. This section describes a proof-of-concept system that demonstrates the feasibility of these ideas, using a Virtex-4 XC4VFX12 part.

A. Static Infrastructure Synthesis
The static infrastructure supports the execution of the hardware tasks. It includes the R3TOS kernel, which acts as the brain of the system, and the clocking resources, which deliver the clock signal across the chip. The R3TOS kernel offers inter-task communication and synchronization services and manages the clock-tree according to performance and reliability criteria. Since most of these functions are carried out through the configuration layer of the FPGA, the kernel includes an ICAP instance as well. The static infrastructure also includes the instantiation of all the BUFRs located in the leftmost and rightmost columns of the FPGA, which are fed by the same BUFGCTRL. For reliability purposes, each input of the BUFGCTRL is connected to an independent clock source. Moreover, a BUFR diagnostic circuit is included in each clock region to detect damaged regional clocking resources (see Fig. 3). Initially, the two BUFRs in each clock region are directly connected to the two regional clock nets. As the BUFRs do not drive any static logic, we must prevent their removal during synthesis by setting an S=TRUE attribute on them.

Static routing is a critical issue when building an RC system, as the relocated hardware tasks may need routing resources already assigned to static routes in the target position. To limit the static routing in our design, the R3TOS kernel is defined as partially reconfigurable and constrained to a specific closed region. As a result, it is self-contained in that region and the rest of the chip is free of static routes.
The placement region for the kernel is located between the ICAP (in the centre of the FPGA die) and the IOBs to be used (on the right border of the chip). Since Xilinx design tools do not allow either IOBs or the ICAP to be included within a partially reconfigurable region, these components are kept outside the region assigned to the R3TOS circuitry and are connected to it by means of Bus Macros (BMs). As shown in Fig. 3, the only signals that extend beyond the boundaries of the region assigned to the R3TOS kernel are the clock lines, which do not constrain the allocatability of the tasks as they are separately managed by R3TOS.

Fig. 3: Layout of the static infrastructure. (a) Placed design. (b) Routed design.

B. Hardware Task Synthesis
The hardware tasks, fed by a BUFR, are separately synthesized and constrained to specific closed regions within the FPGA. Note that the clock route inside the task is adjusted at runtime (see Section III-C).

C. Adjusting Task's Clock Frequency
Up to two different clock frequencies can be distributed in a clock region, one through each of its two regional clock nets. The process of delivering a specific clock signal to a hardware task is done in two steps.

First, one of the BUFRs in each of the clock regions the task spans is appropriately configured in the configuration memory. The BUFR DIVIDE parameter is coded using 4 bits. For BUFR1s, these bits are in positions [ ] within the IOB type frame with MINOR=14, and for BUFR2s the bits are in identical positions within the IOB type frame with MINOR=22. Table I shows the configuration values to be written in these positions to select each clock frequency.

TABLE I: BUFR DIVIDE configuration values
BUFR DIVIDE     Value
Divided by 1    0x8
Divided by 2    0x9
Divided by 3    0xA
Divided by 4    0xB
Divided by 5    0xC
Divided by 6    0xD
Divided by 7    0xE
Divided by 8    0xF

Then, the clock signal is routed inside the task to feed all of its sequential components; i.e. the components must be driven by the regional clock net connected to the previously configured BUFR. To select clock Net1 as the clock source for a resource column, the bit in position 655 within the frames with MINOR=18 is activated; clock Net2 is selected by activating the bit in position 654 within the same frames. This applies to CLB, DSP and BRAM interconnection type frames. Finally, the clock signal is routed through the switch matrices associated with each resource within the column. The positions of the bits to be activated to do so are shown in Table II.

TABLE II: Configuration values to route the clock signals (columns: clock net Net1 / Net2, MINOR, bit position, ranges of n and m). Bit positions are given as expressions in n and m, e.g. 46+8n+80m, 78+8n+80m, 41+8n+80m and 73+8n+80m.

In order to reduce power consumption, when only one frequency is needed in a clock region, the other BUFR is disabled. For BUFR1s, this is done by writing 0 in position 654 within the IOB type frames with MINOR=19 and MINOR=21. For BUFR2s, the bits to be cleared are located in position 653 in the same frames.

D. Dealing with Faults in the Clock-Tree
For reliability purposes, each input of the BUFGCTRL, I0 and I1, is connected to an independent clock source, clk_i0 and clk_i1, respectively. The clock selection port of the BUFGCTRL is driven by the Dual-clock Management Circuit (DMC) depicted in Fig. 4. This circuit takes both clock sources, clk_i0 and clk_i1, as inputs. Specifically, clk_i0 is used as the clock in the circuit and clk_i1 is captured in two latches, one working at falling edges and the other at rising edges. Assuming both input clock signals are of the same frequency, the values in the latches can never be the same, except when either clk_i0 or clk_i1 does not work; i.e. it is stuck at the same logic level. Hence, the XOR of the values stored in the latches is used to select the clock source in the BUFGCTRL, automatically switching to clk_i0 when clk_i1 fails. In order to circumvent problems when the two clock sources are not synchronized, the BUFGCTRL clock selection port is not directly driven by the XOR logic gate. The output of this gate is delayed by at least one clock cycle, ensuring there is sufficient guard time between the last clock pulse generated by the active clock source before it gets damaged and the clock pulse immediately after, which is generated by the newly selected clock source. Finally, we note that errors due to phase and frequency deviations in the clock sources are not directly handled by the DMC circuit, but they can be detected by the DCMs via their LOCKED output.

Although the DMC circuit is able to recover from a single failed clock source, it is convenient for the R3TOS kernel to be aware of this situation in order to prevent potential system failures, i.e. when the remaining clock source fails. Therefore, the signal that drives the BUFGCTRL clock selection (error) is also delivered to the R3TOS kernel so that it can keep track of the state of the clock sources.

Fig. 4: Dual-clock Management Circuit (DMC). (a) Schematic. (b) Functioning:
clk_i0   clk_i1   error   clk_o
OK       OK       1       clk_i1
OK       X        0       clk_i0
X        OK       1       clk_i1
X        X        X       clk_i1

To detect damaged regional clocking resources, a diagnostic circuit is implemented in each clock region, next to the BUFRs. As shown in Fig. 5, this circuit is very simple: it fits in only 8 Slices and thus spans only one CLB column.
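The selection behaviour of the DMC (Fig. 4b) can be reproduced with a small behavioural model. The sketch below is not the authors' circuit description: it assumes ideal 50% duty-cycle clocks of equal period, samples clk_i1 at one rising and one falling edge of clk_i0, and ignores the one-cycle guard delay on the selection signal. The function names, the default period and the default phase are illustrative.

```python
def square_wave(t, period, phase=0.0, stuck_at=None):
    """Level of an ideal 50% duty-cycle clock at time t.

    If stuck_at is 0 or 1, the clock is modelled as failed at that level.
    """
    if stuck_at is not None:
        return stuck_at
    return 1 if ((t + phase) % period) < period / 2 else 0


def dmc_select(clk_i1_stuck_at=None, period=10.0, phase=2.0, cycle=3):
    """Return (error, selected_clock) for a working clk_i0.

    clk_i1 is sampled at a rising and at a falling edge of clk_i0, which are
    half a period apart, and the two samples are XORed as in the DMC.
    """
    rising_time = cycle * period              # rising edge of clk_i0
    falling_time = rising_time + period / 2   # following falling edge
    q_rise = square_wave(rising_time, period, phase, clk_i1_stuck_at)
    q_fall = square_wave(falling_time, period, phase, clk_i1_stuck_at)
    error = q_rise ^ q_fall                   # 1 while clk_i1 still toggles
    selected = "clk_i1" if error else "clk_i0"
    return error, selected
```

With a healthy clk_i1 the two samples differ, so the XOR is 1 and clk_i1 stays selected; with clk_i1 stuck at either level the samples are equal and the model selects clk_i0, matching the first two rows of Fig. 4b. The case of a failed clk_i0 is not modelled: the latches would then simply hold their last (different) values, which keeps clk_i1 selected.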
The diagnostic circuit takes two clock signals as inputs: the global clock signal that is input to the BUFR under diagnosis and the regional clock signal that the BUFR outputs. The global clock feeds a delay line composed of 5 cascaded latches, which capture the regional clock signal at rising edges, and another latch, which captures the regional clock signal at falling edges. As the frequency of the regional clock signal can be equal to, or up to 8 times lower than, that of the global clock, the values stored in all of the latches can never be the same. Note that when the two frequencies are the same, the values captured at falling and rising edges must be different, and when the frequencies are different, the 5 delay line latches store values corresponding to more than half a period of the regional clock signal and hence at least one of them must be different. Therefore, the error signal must always be 0. If this is not the case, the regional clock signal is stuck at the same logic level, i.e. the BUFR is broken. In order to make the diagnostic result remotely accessible from the R3TOS kernel without using static routes across the chip, the error signal is registered in a RAM-LUT, whose content can be read back using the ICAP. Once this RAM-LUT has been accessed, the R3TOS kernel is responsible for clearing it back to 0.

Fig. 5: BUFR diagnostic circuit

When a BUFR is damaged, the RCMUXes are used to switch to a signal coming from any of the other 4 BUFRs located in the clock regions above and below. The RCMUX configuration values are coded using 4 bits located in positions [ ] within two different IOB type frames: MINOR=23 configures clock Net1 and MINOR=25 configures clock Net2. Table III shows the values to be written in these positions to select a specific input clock source in the RCMUXes.

TABLE III: RCMUX configuration values
Input                        Value
BUFR1 (down clock region)    0x8
BUFR2 (down clock region)    0x9
BUFR1                        0xA
BUFR2                        0xB
BUFR1 (up clock region)      0xC
BUFR2 (up clock region)      0xD

When a hardware task spans more than one clock region in height, non-damaged BUFRs of these regions can be used in the regions with damaged BUFRs (see Fig. 6b). Likewise, one of the BUFRs in a region where only one clock frequency is needed can feed the adjacent clock regions with damaged BUFRs (see Fig. 6c).

Fig. 6 panels: (a) Normal configuration. (b) ...with unused resources in large tasks spanning regions.

Although more complex to implement, the same strategy can be used to switch to a spare BUFGCTRL when one of the input clock sources of the active one fails, since the system would otherwise fail if the remaining clock source also failed. Note that this diagnosis information can be obtained from the aforementioned DMC circuit.
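The 4-bit codes of Tables I and III lend themselves to a compact software representation. The sketch below encodes both tables and adds a hypothetical helper that picks a replacement BUFR for a clock net. The preference order (same region first, then the region below, then the region above) and the way healthy BUFRs are passed in are assumptions made for illustration; the text does not prescribe a selection policy, and a real manager would also have to check that the spare BUFR delivers the required frequency.

```python
BUFR_DIVIDE_BASE = 0x8  # Table I: "divided by 1" is coded as 0x8


def bufr_divide_code(divide):
    """4-bit BUFR DIVIDE value from Table I for a division factor of 1..8."""
    if not 1 <= divide <= 8:
        raise ValueError("BUFR DIVIDE must be between 1 and 8")
    return BUFR_DIVIDE_BASE + (divide - 1)


# Table III: RCMUX input selection, keyed by (relative clock region, BUFR index).
RCMUX_CODES = {
    ("down", 1): 0x8,
    ("down", 2): 0x9,
    ("same", 1): 0xA,
    ("same", 2): 0xB,
    ("up", 1): 0xC,
    ("up", 2): 0xD,
}


def pick_spare_bufr(healthy, preferred=("same", "down", "up")):
    """Choose a healthy BUFR and return ((region, index), RCMUX code).

    healthy is a set of (region, index) pairs reported as working by the
    diagnostic circuits. The preference order is a hypothetical policy.
    Returns None when no healthy BUFR can reach this clock region.
    """
    for region in preferred:
        for index in (1, 2):
            if (region, index) in healthy:
                return (region, index), RCMUX_CODES[(region, index)]
    return None
```

For example, if both BUFRs of a region are reported as broken and only BUFR1 below and BUFR2 above are healthy, `pick_spare_bufr({("down", 1), ("up", 2)})` selects the BUFR1 of the region below, whose RCMUX code is 0x8. The returned code is the value that would be written to the frame with MINOR=23 (Net1) or MINOR=25 (Net2).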
The global clock switch must, however, be performed carefully in order to avoid glitches that could provoke functional errors, e.g. wrong computation by the hardware tasks or a stuck ICAP. Indeed, prior to adjusting the PIPs used to route the global clock signal, the ICAP Controller should be clocked from a different clock source and the active global clock source should be switched off. This can be implemented by taking advantage of the fact that modern FPGAs include two ICAP ports, which makes it possible to include two ICAP Controllers in the system, each fed with a different clock source (see Fig. 7). Hence, after switching the global clock using the spare ICAP Controller (see Fig. 7b), the main ICAP Controller should be used again to restore the clock diversity in the system (see Fig. 7c).

Fig. 6: Replacing damaged BUFRs... (c) ...with unused resources in regions where only one clock frequency is needed.

Fig. 7: Global clock switch: clock source 1 -> clock source 3. (a) Initial situation. (b) Phase 1. (c) Phase 2.

By using all the techniques reported in this paper together, a complete online clock routing mechanism is implemented, in which some parts are automatically managed (BUFGCTRL) and the others need to be managed by the R3TOS kernel (clock sources, routing inside the tasks, BUFRs, RCMUX, GCMUX).

IV. CONCLUSIONS
In this paper we have described how to build an RC system able to feed each hardware task with the required clock frequency. We have shown that our system circumvents up to four limitations existing in previous related work. In addition, we have presented a novel mechanism to detect and recover from faults affecting the clock-tree of the underlying FPGA. This is considered significant in reliability-sensitive applications (e.g. space). Our mechanism enables the system to gain control over each of the resources in the FPGA clock-tree hierarchy and to respond at the appropriate level to each need. Our solution has been implemented using Xilinx Virtex-4 FPGAs and it is also extensible to newer Xilinx families, such as Virtex-5/6/7. Future work includes testing our RC system in various real-world applications.

Last but not least, we note that the work reported in this paper is part of the R3TOS project, which is aimed at building a Reliable Reconfigurable Real-Time Operating System for Xilinx partially reconfigurable FPGAs. R3TOS also includes a novel fault-handling strategy to cope with transient and permanent faults that affect both the hardware tasks and its own kernel circuitry. Furthermore, the R3TOS kernel includes a set of fault-tolerance-by-design features, including ECC protection of internal BRAMs and finite state machines. The fault-handling strategy used by R3TOS as well as its kernel implementation are to be described in future publications.

REFERENCES
[1] S. Hauck and A. DeHon, Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation. Morgan Kaufmann Publishers Inc., San Francisco, USA.
[2] T. Marconi, Y. Lu, K.L.M. Bertels, G. N. Gaydadjiev, 3D Compaction: a Novel Blocking-aware Algorithm for Online Hardware Task Scheduling and Placement on 2D Partially Reconfigurable Devices, Proc. of the International Symposium on Applied Reconfigurable Computing.
[3] M. Berg, C. Poivey, D. Petrick, D. Espinosa, A. Lesea, K.A. LaBel, M. Friendlich, H. Kim and A. Phan, Effectiveness of Internal versus External SEU Scrubbing Mitigation Strategies in a Xilinx FPGA: Design, Test, and Analysis, IEEE Transactions on Nuclear Science, 55(4).
[4] D. P. Montminy, R. O. Baldwin, P. D. Williams and B. E. Mullins, Using Relocatable Bitstreams for Fault Tolerance, Proc. of the NASA/ESA Conference on Adaptive Hardware and Systems.
[5] H. Kopetz, Real-Time Systems: Design Principles for Distributed Embedded Applications, Kluwer Academic Publishers, Norwell, USA.
[6] X. Iturbe, K. Benkrid, A. T. Erdogan, T. Arslan, M. Azkarate, I. Martinez and A. Perez, R3TOS: A Reliable Reconfigurable Real-Time Operating System, Proc. of the NASA/ESA Conference on Adaptive Hardware and Systems.
[7] Xilinx Inc., Virtex-4 FPGA User Guide, UG070.
[8] Xilinx Inc., Virtex-4 FPGA Configuration User Guide, UG071.
[9] P. Sedcole, B. Blodget, T. Becker, J. Anderson, and P. Lysaght, Modular Dynamic Reconfiguration in Virtex FPGAs, IEE Proceedings Computers and Digital Techniques, 153(3).
[10] X. Iturbe, K. Benkrid, T. Arslan, R. Torrego and I. Martinez, Methods and Mechanisms for Hardware Multitasking: Executing and Synchronizing Fully Relocatable Hardware Tasks in Xilinx FPGAs, Proc. of the International Conference on Field-Programmable Logic and Applications.
[11] E. Eto, Support for BUFR in Partial Reconfigurable Modules, Xilinx White Paper, WP344.
[12] A. Flynn, A. Gordon-Ross and A. D. George, Bitstream Relocation with Local Clock Domains for Partially Reconfigurable FPGAs, Proc. of the Conference on Design, Automation and Test in Europe.
[13] A. Jara-Berrocal and A. Gordon-Ross, VAPRES: A Virtual Architecture for Partially Reconfigurable Embedded Systems, Proc. of the Conference on Design, Automation and Test in Europe, 2010.


International Workshop on Field Programmable Logic and Applications, FPL '99 International Workshop on Field Programmable Logic and Applications, FPL '99 DRIVE: An Interpretive Simulation and Visualization Environment for Dynamically Reconægurable Systems? Kiran Bondalapati and

More information

How To Fix A 3 Bit Error In Data From A Data Point To A Bit Code (Data Point) With A Power Source (Data Source) And A Power Cell (Power Source)

How To Fix A 3 Bit Error In Data From A Data Point To A Bit Code (Data Point) With A Power Source (Data Source) And A Power Cell (Power Source) FPGA IMPLEMENTATION OF 4D-PARITY BASED DATA CODING TECHNIQUE Vijay Tawar 1, Rajani Gupta 2 1 Student, KNPCST, Hoshangabad Road, Misrod, Bhopal, Pin no.462047 2 Head of Department (EC), KNPCST, Hoshangabad

More information

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy Hardware Implementation of Improved Adaptive NoC Rer with Flit Flow History based Load Balancing Selection Strategy Parag Parandkar 1, Sumant Katiyal 2, Geetesh Kwatra 3 1,3 Research Scholar, School of

More information

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad A. M. Niknejad University of California, Berkeley EE 100 / 42 Lecture 24 p. 1/20 EE 42/100 Lecture 24: Latches and Flip Flops ELECTRONICS Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad University of California,

More information

Technical Note. Micron NAND Flash Controller via Xilinx Spartan -3 FPGA. Overview. TN-29-06: NAND Flash Controller on Spartan-3 Overview

Technical Note. Micron NAND Flash Controller via Xilinx Spartan -3 FPGA. Overview. TN-29-06: NAND Flash Controller on Spartan-3 Overview Technical Note TN-29-06: NAND Flash Controller on Spartan-3 Overview Micron NAND Flash Controller via Xilinx Spartan -3 FPGA Overview As mobile product capabilities continue to expand, so does the demand

More information

White Paper FPGA Performance Benchmarking Methodology

White Paper FPGA Performance Benchmarking Methodology White Paper Introduction This paper presents a rigorous methodology for benchmarking the capabilities of an FPGA family. The goal of benchmarking is to compare the results for one FPGA family versus another

More information

Memory Systems. Static Random Access Memory (SRAM) Cell

Memory Systems. Static Random Access Memory (SRAM) Cell Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled

More information

International Journal of Advancements in Research & Technology, Volume 2, Issue3, March -2013 1 ISSN 2278-7763

International Journal of Advancements in Research & Technology, Volume 2, Issue3, March -2013 1 ISSN 2278-7763 International Journal of Advancements in Research & Technology, Volume 2, Issue3, March -2013 1 FPGA IMPLEMENTATION OF HARDWARE TASK MANAGEMENT STRATEGIES Assistant professor Sharan Kumar Electronics Department

More information

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems Harris Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems David Harris Harvey Mudd College [email protected] Based on EE271 developed by Mark Horowitz, Stanford University MAH

More information

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com Best Practises for LabVIEW FPGA Design Flow 1 Agenda Overall Application Design Flow Host, Real-Time and FPGA LabVIEW FPGA Architecture Development FPGA Design Flow Common FPGA Architectures Testing and

More information

PowerPC Microprocessor Clock Modes

PowerPC Microprocessor Clock Modes nc. Freescale Semiconductor AN1269 (Freescale Order Number) 1/96 Application Note PowerPC Microprocessor Clock Modes The PowerPC microprocessors offer customers numerous clocking options. An internal phase-lock

More information

Testing of Digital System-on- Chip (SoC)

Testing of Digital System-on- Chip (SoC) Testing of Digital System-on- Chip (SoC) 1 Outline of the Talk Introduction to system-on-chip (SoC) design Approaches to SoC design SoC test requirements and challenges Core test wrapper P1500 core test

More information

Latch Timing Parameters. Flip-flop Timing Parameters. Typical Clock System. Clocking Overhead

Latch Timing Parameters. Flip-flop Timing Parameters. Typical Clock System. Clocking Overhead Clock - key to synchronous systems Topic 7 Clocking Strategies in VLSI Systems Peter Cheung Department of Electrical & Electronic Engineering Imperial College London Clocks help the design of FSM where

More information

FPGA. AT6000 FPGAs. Application Note AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 FPGAs.

FPGA. AT6000 FPGAs. Application Note AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 s Introduction Convolution is one of the basic and most common operations in both analog and digital domain signal processing.

More information

Networking Virtualization Using FPGAs

Networking Virtualization Using FPGAs Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Massachusetts,

More information

SoC IP Interfaces and Infrastructure A Hybrid Approach

SoC IP Interfaces and Infrastructure A Hybrid Approach SoC IP Interfaces and Infrastructure A Hybrid Approach Cary Robins, Shannon Hill ChipWrights, Inc. ABSTRACT System-On-Chip (SoC) designs incorporate more and more Intellectual Property (IP) with each year.

More information

Rapid System Prototyping with FPGAs

Rapid System Prototyping with FPGAs Rapid System Prototyping with FPGAs By R.C. Coferand Benjamin F. Harding AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Newnes is an imprint of

More information

Kirchhoff Institute for Physics Heidelberg

Kirchhoff Institute for Physics Heidelberg Kirchhoff Institute for Physics Heidelberg Norbert Abel FPGA: (re-)configuration and embedded Linux 1 Linux Front-end electronics based on ADC and digital signal processing Slow control implemented as

More information

Testing Low Power Designs with Power-Aware Test Manage Manufacturing Test Power Issues with DFTMAX and TetraMAX

Testing Low Power Designs with Power-Aware Test Manage Manufacturing Test Power Issues with DFTMAX and TetraMAX White Paper Testing Low Power Designs with Power-Aware Test Manage Manufacturing Test Power Issues with DFTMAX and TetraMAX April 2010 Cy Hay Product Manager, Synopsys Introduction The most important trend

More information

Systolic Computing. Fundamentals

Systolic Computing. Fundamentals Systolic Computing Fundamentals Motivations for Systolic Processing PARALLEL ALGORITHMS WHICH MODEL OF COMPUTATION IS THE BETTER TO USE? HOW MUCH TIME WE EXPECT TO SAVE USING A PARALLEL ALGORITHM? HOW

More information

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems Fault Tolerance & Reliability CDA 5140 Chapter 3 RAID & Sample Commercial FT Systems - basic concept in these, as with codes, is redundancy to allow system to continue operation even if some components

More information

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE CHAPTER 5 71 FINITE STATE MACHINE FOR LOOKUP ENGINE 5.1 INTRODUCTION Finite State Machines (FSMs) are important components of digital systems. Therefore, techniques for area efficiency and fast implementation

More information

W a d i a D i g i t a l

W a d i a D i g i t a l Wadia Decoding Computer Overview A Definition What is a Decoding Computer? The Wadia Decoding Computer is a small form factor digital-to-analog converter with digital pre-amplifier capabilities. It is

More information

Operating Systems 4 th Class

Operating Systems 4 th Class Operating Systems 4 th Class Lecture 1 Operating Systems Operating systems are essential part of any computer system. Therefore, a course in operating systems is an essential part of any computer science

More information

BUILD VERSUS BUY. Understanding the Total Cost of Embedded Design. www.ni.com/buildvsbuy

BUILD VERSUS BUY. Understanding the Total Cost of Embedded Design. www.ni.com/buildvsbuy BUILD VERSUS BUY Understanding the Total Cost of Embedded Design Table of Contents I. Introduction II. The Build Approach: Custom Design a. Hardware Design b. Software Design c. Manufacturing d. System

More information

Designing Real-Time and Embedded Systems with the COMET/UML method

Designing Real-Time and Embedded Systems with the COMET/UML method By Hassan Gomaa, Department of Information and Software Engineering, George Mason University. Designing Real-Time and Embedded Systems with the COMET/UML method Most object-oriented analysis and design

More information

Lecture 7: Clocking of VLSI Systems

Lecture 7: Clocking of VLSI Systems Lecture 7: Clocking of VLSI Systems MAH, AEN EE271 Lecture 7 1 Overview Reading Wolf 5.3 Two-Phase Clocking (good description) W&E 5.5.1, 5.5.2, 5.5.3, 5.5.4, 5.5.9, 5.5.10 - Clocking Note: The analysis

More information

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng Architectural Level Power Consumption of Network Presenter: YUAN Zheng Why Architectural Low Power Design? High-speed and large volume communication among different parts on a chip Problem: Power consumption

More information

Digital Systems Based on Principles and Applications of Electrical Engineering/Rizzoni (McGraw Hill

Digital Systems Based on Principles and Applications of Electrical Engineering/Rizzoni (McGraw Hill Digital Systems Based on Principles and Applications of Electrical Engineering/Rizzoni (McGraw Hill Objectives: Analyze the operation of sequential logic circuits. Understand the operation of digital counters.

More information

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1 Module 2 Embedded Processors and Memory Version 2 EE IIT, Kharagpur 1 Lesson 5 Memory-I Version 2 EE IIT, Kharagpur 2 Instructional Objectives After going through this lesson the student would Pre-Requisite

More information

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin BUS ARCHITECTURES Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin Keywords: Bus standards, PCI bus, ISA bus, Bus protocols, Serial Buses, USB, IEEE 1394

More information

Architectures and Platforms

Architectures and Platforms Hardware/Software Codesign Arch&Platf. - 1 Architectures and Platforms 1. Architecture Selection: The Basic Trade-Offs 2. General Purpose vs. Application-Specific Processors 3. Processor Specialisation

More information

Attaining EDF Task Scheduling with O(1) Time Complexity

Attaining EDF Task Scheduling with O(1) Time Complexity Attaining EDF Task Scheduling with O(1) Time Complexity Verber Domen University of Maribor, Faculty of Electrical Engineering and Computer Sciences, Maribor, Slovenia (e-mail: [email protected]) Abstract:

More information

IMPLEMENTATION OF FPGA CARD IN CONTENT FILTERING SOLUTIONS FOR SECURING COMPUTER NETWORKS. Received May 2010; accepted July 2010

IMPLEMENTATION OF FPGA CARD IN CONTENT FILTERING SOLUTIONS FOR SECURING COMPUTER NETWORKS. Received May 2010; accepted July 2010 ICIC Express Letters Part B: Applications ICIC International c 2010 ISSN 2185-2766 Volume 1, Number 1, September 2010 pp. 71 76 IMPLEMENTATION OF FPGA CARD IN CONTENT FILTERING SOLUTIONS FOR SECURING COMPUTER

More information

A Reconfigurable RTOS with HW/SW Co-scheduling for SOPC

A Reconfigurable RTOS with HW/SW Co-scheduling for SOPC A Reconfigurable RTOS with HW/SW Co-scheduling for SOPC Qingxu Deng, Shuisheng Wei, Hai Xu, Yu Han, Ge Yu Department of Computer Science and Engineering Northeastern University, China [email protected]

More information

Asynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow

Asynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow Asynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow Bradley R. Quinton Dept. of Electrical and Computer Engineering University of British Columbia [email protected]

More information

2. TEACHING ENVIRONMENT AND MOTIVATION

2. TEACHING ENVIRONMENT AND MOTIVATION A WEB-BASED ENVIRONMENT PROVIDING REMOTE ACCESS TO FPGA PLATFORMS FOR TEACHING DIGITAL HARDWARE DESIGN Angel Fernández Herrero Ignacio Elguezábal Marisa López Vallejo Departamento de Ingeniería Electrónica,

More information

FPGA Clocking. Clock related issues: distribution generation (frequency synthesis) multiplexing run time programming domain crossing

FPGA Clocking. Clock related issues: distribution generation (frequency synthesis) multiplexing run time programming domain crossing FPGA Clocking Clock related issues: distribution generation (frequency synthesis) Deskew multiplexing run time programming domain crossing Clock related constraints 100 Clock Distribution Device split

More information

Delay Characterization in FPGA-based Reconfigurable Systems

Delay Characterization in FPGA-based Reconfigurable Systems Institute of Computer Architecture and Computer Engineering University of Stuttgart Pfaffenwaldring 47 D 70569 Stuttgart Master s Thesis Nr. 3505 Delay Characterization in FPGA-based Reconfigurable Systems

More information

PFP Technology White Paper

PFP Technology White Paper PFP Technology White Paper Summary PFP Cybersecurity solution is an intrusion detection solution based on observing tiny patterns on the processor power consumption. PFP is capable of detecting intrusions

More information

USB - FPGA MODULE (PRELIMINARY)

USB - FPGA MODULE (PRELIMINARY) DLP-HS-FPGA LEAD-FREE USB - FPGA MODULE (PRELIMINARY) APPLICATIONS: - Rapid Prototyping - Educational Tool - Industrial / Process Control - Data Acquisition / Processing - Embedded Processor FEATURES:

More information

Network Traffic Monitoring an architecture using associative processing.

Network Traffic Monitoring an architecture using associative processing. Network Traffic Monitoring an architecture using associative processing. Gerald Tripp Technical Report: 7-99 Computing Laboratory, University of Kent 1 st September 1999 Abstract This paper investigates

More information

Fault Modeling. Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults. Transistor faults Summary

Fault Modeling. Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults. Transistor faults Summary Fault Modeling Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults Single stuck-at faults Fault equivalence Fault dominance and checkpoint theorem Classes of stuck-at

More information

4. H.323 Components. VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19

4. H.323 Components. VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19 4. H.323 Components VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19 4.1 H.323 Terminals (1/2)...3 4.1 H.323 Terminals (2/2)...4 4.1.1 The software IP phone (1/2)...5 4.1.1 The software

More information

Verification of Triple Modular Redundancy (TMR) Insertion for Reliable and Trusted Systems

Verification of Triple Modular Redundancy (TMR) Insertion for Reliable and Trusted Systems Verification of Triple Modular Redundancy (TMR) Insertion for Reliable and Trusted Systems Melanie Berg 1, Kenneth LaBel 2 1.AS&D in support of NASA/GSFC [email protected] 2. NASA/GSFC [email protected]

More information

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology Topics of Chapter 5 Sequential Machines Memory elements Memory elements. Basics of sequential machines. Clocking issues. Two-phase clocking. Testing of combinational (Chapter 4) and sequential (Chapter

More information

Header Parsing Logic in Network Switches Using Fine and Coarse-Grained Dynamic Reconfiguration Strategies

Header Parsing Logic in Network Switches Using Fine and Coarse-Grained Dynamic Reconfiguration Strategies Header Parsing Logic in Network Switches Using Fine and Coarse-Grained Dynamic Reconfiguration Strategies by Alexander Sonek A thesis presented to the University of Waterloo in fulfillment of the thesis

More information

Implementation and Design of AES S-Box on FPGA

Implementation and Design of AES S-Box on FPGA International Journal of Research in Engineering and Science (IJRES) ISSN (Online): 232-9364, ISSN (Print): 232-9356 Volume 3 Issue ǁ Jan. 25 ǁ PP.9-4 Implementation and Design of AES S-Box on FPGA Chandrasekhar

More information

Computer Architecture

Computer Architecture Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 11 Memory Management Computer Architecture Part 11 page 1 of 44 Prof. Dr. Uwe Brinkschulte, M.Sc. Benjamin

More information

Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation

Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation Florent Berthelot, Fabienne Nouvel, Dominique Houzet To cite this version: Florent Berthelot,

More information

A General Framework for Tracking Objects in a Multi-Camera Environment

A General Framework for Tracking Objects in a Multi-Camera Environment A General Framework for Tracking Objects in a Multi-Camera Environment Karlene Nguyen, Gavin Yeung, Soheil Ghiasi, Majid Sarrafzadeh {karlene, gavin, soheil, majid}@cs.ucla.edu Abstract We present a framework

More information

Power Reduction Techniques in the SoC Clock Network. Clock Power

Power Reduction Techniques in the SoC Clock Network. Clock Power Power Reduction Techniques in the SoC Network Low Power Design for SoCs ASIC Tutorial SoC.1 Power Why clock power is important/large» Generally the signal with the highest frequency» Typically drives a

More information

8 Gbps CMOS interface for parallel fiber-optic interconnects

8 Gbps CMOS interface for parallel fiber-optic interconnects 8 Gbps CMOS interface for parallel fiberoptic interconnects Barton Sano, Bindu Madhavan and A. F. J. Levi Department of Electrical Engineering University of Southern California Los Angeles, California

More information

How To Design An Image Processing System On A Chip

How To Design An Image Processing System On A Chip RAPID PROTOTYPING PLATFORM FOR RECONFIGURABLE IMAGE PROCESSING B.Kovář 1, J. Kloub 1, J. Schier 1, A. Heřmánek 1, P. Zemčík 2, A. Herout 2 (1) Institute of Information Theory and Automation Academy of

More information

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001 Agenda Introduzione Il mercato Dal circuito integrato al System on a Chip (SoC) La progettazione di un SoC La tecnologia Una fabbrica di circuiti integrati 28 How to handle complexity G The engineering

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 11 I/O Management and Disk Scheduling Dave Bremer Otago Polytechnic, NZ 2008, Prentice Hall I/O Devices Roadmap Organization

More information

Source-Synchronous Serialization and Deserialization (up to 1050 Mb/s) Author: NIck Sawyer

Source-Synchronous Serialization and Deserialization (up to 1050 Mb/s) Author: NIck Sawyer Application Note: Spartan-6 FPGAs XAPP1064 (v1.2) November 19, 2013 Source-Synchronous Serialization and Deserialization (up to 1050 Mb/s) Author: NIck Sawyer Summary Spartan -6 devices contain input SerDes

More information

DDR subsystem: Enhancing System Reliability and Yield

DDR subsystem: Enhancing System Reliability and Yield DDR subsystem: Enhancing System Reliability and Yield Agenda Evolution of DDR SDRAM standards What is the variation problem? How DRAM standards tackle system variability What problems have been adequately

More information

Implementation of Web-Server Using Altera DE2-70 FPGA Development Kit

Implementation of Web-Server Using Altera DE2-70 FPGA Development Kit 1 Implementation of Web-Server Using Altera DE2-70 FPGA Development Kit A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENT OF FOR THE DEGREE IN Bachelor of Technology In Electronics and Communication

More information

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path Project Summary This project involves the schematic and layout design of an 8-bit microprocessor data

More information

High-Level Synthesis for FPGA Designs

High-Level Synthesis for FPGA Designs High-Level Synthesis for FPGA Designs BRINGING BRINGING YOU YOU THE THE NEXT NEXT LEVEL LEVEL IN IN EMBEDDED EMBEDDED DEVELOPMENT DEVELOPMENT Frank de Bont Trainer consultant Cereslaan 10b 5384 VT Heesch

More information

Modeling Latches and Flip-flops

Modeling Latches and Flip-flops Lab Workbook Introduction Sequential circuits are digital circuits in which the output depends not only on the present input (like combinatorial circuits), but also on the past sequence of inputs. In effect,

More information

Introduction to Embedded Systems. Software Update Problem

Introduction to Embedded Systems. Software Update Problem Introduction to Embedded Systems CS/ECE 6780/5780 Al Davis logistics minor Today s topics: more software development issues 1 CS 5780 Software Update Problem Lab machines work let us know if they don t

More information

Programmable Logic Design Grzegorz Budzyń Lecture. 10: FPGA clocking schemes

Programmable Logic Design Grzegorz Budzyń Lecture. 10: FPGA clocking schemes Programmable Logic Design Grzegorz Budzyń Lecture 10: FPGA clocking schemes Plan Introduction Definitions Clockskew Metastability FPGA clocking resources DCM PLL Introduction One of the most important

More information

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah (DSF) Soft Core Prozessor NIOS II Stand Mai 2007 Jens Onno Krah Cologne University of Applied Sciences www.fh-koeln.de [email protected] NIOS II 1 1 What is Nios II? Altera s Second Generation

More information

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: "Embedded Systems - ", Raj Kamal, Publs.: McGraw-Hill Education

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: Embedded Systems - , Raj Kamal, Publs.: McGraw-Hill Education Lesson 7: SYSTEM-ON ON-CHIP (SoC( SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY 1 VLSI chip Integration of high-level components Possess gate-level sophistication in circuits above that of the counter,

More information

Floating Point Fused Add-Subtract and Fused Dot-Product Units

Floating Point Fused Add-Subtract and Fused Dot-Product Units Floating Point Fused Add-Subtract and Fused Dot-Product Units S. Kishor [1], S. P. Prakash [2] PG Scholar (VLSI DESIGN), Department of ECE Bannari Amman Institute of Technology, Sathyamangalam, Tamil Nadu,

More information

HARDWARE ACCELERATION IN FINANCIAL MARKETS. A step change in speed

HARDWARE ACCELERATION IN FINANCIAL MARKETS. A step change in speed HARDWARE ACCELERATION IN FINANCIAL MARKETS A step change in speed NAME OF REPORT SECTION 3 HARDWARE ACCELERATION IN FINANCIAL MARKETS A step change in speed Faster is more profitable in the front office

More information

Design Verification & Testing Design for Testability and Scan

Design Verification & Testing Design for Testability and Scan Overview esign for testability (FT) makes it possible to: Assure the detection of all faults in a circuit Reduce the cost and time associated with test development Reduce the execution time of performing

More information

CHAPTER 3 Boolean Algebra and Digital Logic

CHAPTER 3 Boolean Algebra and Digital Logic CHAPTER 3 Boolean Algebra and Digital Logic 3.1 Introduction 121 3.2 Boolean Algebra 122 3.2.1 Boolean Expressions 123 3.2.2 Boolean Identities 124 3.2.3 Simplification of Boolean Expressions 126 3.2.4

More information

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) International Journal of Electronics and Communication Engineering & Technology (IJECET), ISSN 0976 ISSN 0976 6464(Print)

More information

NAND Flash FAQ. Eureka Technology. apn5_87. NAND Flash FAQ

NAND Flash FAQ. Eureka Technology. apn5_87. NAND Flash FAQ What is NAND Flash? What is the major difference between NAND Flash and other Memory? Structural differences between NAND Flash and NOR Flash What does NAND Flash controller do? How to send command to

More information

FPGA Implementation of an Extended Binary GCD Algorithm for Systolic Reduction of Rational Numbers

FPGA Implementation of an Extended Binary GCD Algorithm for Systolic Reduction of Rational Numbers FPGA Implementation of an Extended Binary GCD Algorithm for Systolic Reduction of Rational Numbers Bogdan Mătăsaru and Tudor Jebelean RISC-Linz, A 4040 Linz, Austria email: [email protected]

More information

Accelerate Cloud Computing with the Xilinx Zynq SoC

Accelerate Cloud Computing with the Xilinx Zynq SoC X C E L L E N C E I N N E W A P P L I C AT I O N S Accelerate Cloud Computing with the Xilinx Zynq SoC A novel reconfigurable hardware accelerator speeds the processing of applications based on the MapReduce

More information

Arquitectura Virtex. Delay-Locked Loop (DLL)

Arquitectura Virtex. Delay-Locked Loop (DLL) Arquitectura Virtex Compuesta de dos elementos principales configurables : CLBs y IOBs. Los CLBs se interconectan a través de una matriz general de routeado (GRM). Posse una intefaz VersaRing que proporciona

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operatin g Systems: Internals and Design Principle s Chapter 11 I/O Management and Disk Scheduling Seventh Edition By William Stallings Operating Systems: Internals and Design Principles An artifact can

More information

Command Processor for MPSSE and MCU Host Bus Emulation Modes

Command Processor for MPSSE and MCU Host Bus Emulation Modes Future Technology Devices International Ltd. Application Note AN_108 Command Processor for MPSSE and MCU Host Bus Emulation Modes Document Reference No.: FT_000109 Version 1.5 Issue Date: 2011-09-09 This

More information

Test Driven Development of Embedded Systems Using Existing Software Test Infrastructure

Test Driven Development of Embedded Systems Using Existing Software Test Infrastructure Test Driven Development of Embedded Systems Using Existing Software Test Infrastructure Micah Dowty University of Colorado at Boulder [email protected] March 26, 2004 Abstract Traditional software development

More information

Multiplexers Two Types + Verilog

Multiplexers Two Types + Verilog Multiplexers Two Types + Verilog ENEE 245: Digital Circuits and ystems Laboratory Lab 7 Objectives The objectives of this laboratory are the following: To become familiar with continuous ments and procedural

More information