Test compression bandwidth management in system-on-a-chip designs


Politechnika Poznańska
Wydział Elektroniki i Telekomunikacji

Poznań University of Technology
Faculty of Electronics and Telecommunications

Ph.D. Thesis

Test compression bandwidth management in system-on-a-chip designs

by Jakub Janicki

Supervisors:
prof. dr inż. Janusz Rajski
prof. dr hab. inż. Jerzy Tyszer

Poznań, Poland 2013

To my dearest wife Natalia

Acknowledgements

The work presented here was carried out between July 2009 and October 2013 at the Faculty of Electronics and Telecommunications of Poznań University of Technology and at Mentor Graphics Polska. Numerous people helped me through this effort, and I would like to mention some of them here.

I would like to express my sincere gratitude to my supervisor, Professor Jerzy Tyszer. His guidance, enthusiasm, inspiration, insightful remarks, and, last but not least, continual motivation were all invaluable. I will never forget the atmosphere of the time spent on research work. I also wish to thank my co-supervisor, Professor Janusz Rajski from Mentor Graphics Corporation, for his vision, technological review, and encouragement.

I extend my thanks to all the people from Mentor Graphics who helped me during my work toward the PhD. In particular, I would like to express my gratitude to Dr. Grzegorz Mrugalski for insightful discussions, cooperation, and for what I have learned from them. I would also like to thank Dr. Nilanjan Mukherjee and Dr. Yu Huang for their help in preparing experiments and for the valuable discussions we had during the technology transfer process.

I gratefully acknowledge the scholarships from Mentor Graphics and Semiconductor Research Corporation that allowed me to attend many international conferences and summer internships and to join the test community.

Finally, my deepest thanks, love, and affection go to my wife Natalia, my family, and all my tutors. For their love, patience, and support I am forever grateful.

Poznań, October 2013

Contents

List of Symbols and Abbreviations
List of Figures
List of Tables

1 Introduction

2 State of the art
    Testing in scan-based designs
    Test data compression
    System-on-a-chip testing
    Test architectures
    Test scheduling
    Test sharing and broadcasting
    Test architectures and scheduling co-optimization
    Test architectures with test data compression
    Motivation
    Thesis overview

3 Bandwidth-aware test compression in EDT environment
    Specified bits in test cubes
    Solver with channel-aware pivoting
    Channel underutilization
    Test data circulation
    Channel selection order
    Bandwidth-aware compression
    Experimental results

4 Bandwidth-aware test response compaction
    Output data selector architectures
    Basic output data selector
    XOR tree-based output data selector
    XOR tree-based output data selector with X-filtering
    Selection of observation points
    Experimental results

5 Bandwidth management in EDT environment
    General SoC test environment
    Core level statistics
    Test pattern set descriptor
    Bandwidth management
    Two-stage test scheduling algorithm
    Test access mechanism architecture
    Generic approach
    Test access mechanism architecture
    Test scheduling algorithm
    Experimental results

6 Test time reduction in SoC environment
    Single core test time reduction
    Impact on test data volume
    Test time reduction scheme
    Minimization of control configurations
    Conditional merging
    Changing application order
    Experimental results

7 Test compression bandwidth management - practical scenarios
    Test flow
    Control data delivery
    The use of IJTAG
    Dedicated control chain
    Pipeline architecture
    Optimization of SoC pin allocation
    Handling physical constraints
    Experimental results

8 Conclusion

Bibliography

List of Symbols and Abbreviations

Abbreviation   Description
ACM            Association for Computing Machinery
ATE            Automatic Test Equipment
ATPG           Automatic Test Pattern Generation
ATS            Asian Test Symposium
BIST           Built-In Self-Test
CAD            Computer-Aided Design
CUT            Circuit Under Test
DAC            Design Automation Conference
DATE           Design, Automation & Test in Europe
DFT            Design For Testability
EDT            Embedded Deterministic Test
ETS            European Test Symposium
FSM            Finite State Machine
IC             Integrated Circuit
ICCAD          International Conference on Computer Aided Design
ICCD           International Conference on Computer Design
IEEE           Institute of Electrical and Electronics Engineers
IP             Intellectual Property
ITC            International Test Conference
LFSR           Linear Feedback Shift Register
MISR           Multiple-Input Signature Register
PRPG           Pseudo Random Pattern Generator
SiP            System-in-a-Package
SoC            System-on-a-Chip
STUMPS         Self-Testing Using MISR and Parallel shift register Sequence generator
TPG            Test Pattern Generator
VLSI           Very Large Scale Integration
VTS            VLSI Test Symposium

List of Figures

2.1 Basic scan-design architecture
The BIST architecture
The STUMPS architecture
Combinational decompressor
Illinois scan
Sequential decompressor
EDT architecture
Example of 10-output 16-bit EDT decompressor
TestShell
Test access mechanism architectures
Test scheduling techniques
Relationship between TAT and TAM width
Test-architecture with test-data compression
Structure of the thesis
Test cube fill rate profile
Test cube fill rate profile and channel demands
Encoding efficiency
Single core channel underutilization
Variable propagation (circulation)
Conventional EDT injector placement
Distribution of variables injected through different channels
Test cube channel demands
Evenly distributed EDT injectors
Channel profiles for three EDT setups
4.1 Output data selector architectures
Fault observation on XOR tree-based ODS
Percentage of test patterns with different numbers of output channels
General SoC test environment
Core-based channel utilization for C
Base clusters and test schedules for the best-fit algorithm
TAM input interconnection network
Examples of channel assignment
Simple interconnection networks
Bipartite graphs for the networks of Figure
ATE channels vs. test application time for design D
Best-fit-based test schedule for design D1 with 28 ATE input channels
Balanced-fit-based test schedule for design D1 with 41 ATE input channels
Test schedule for design D3 with 19 ATE input channels
Test schedule for design D4 with 20 ATE input channels
Pattern count for the industrial design
Shift cycles for the industrial design
Test data volume vs. the number of input channels
Conditional merging
Changing application order
ATE channel reduction vs. test time reduction and test compression
Using IJTAG network to transfer control data
Dedicated control chain-based architecture
Pipeline architecture
Test data volume for industrial design
Pin layout constraints
The baseline bin-packing diagrams
Test time reduction
The optimized test schedules

List of Tables

3.1 Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Experimental results - circuit D
Correlated cores statistics
Single core wire network assignment
SoC characteristics I
Experimental results - non-isolated cores
Experimental results - isolated cores
SoC characteristics II
Experimental results bit long scan chains
SoC characteristics III
SoC characteristics IV
Experimental results

Chapter 1 Introduction

Rapid technological progress in semiconductor fabrication has shrunk chip features below 50 nanometers and opened the way toward three-dimensional integrated circuits. As a consequence, gate counts and circuit operating frequencies have grown at an unparalleled rate. This keeps Moore's law relevant, with the transistor count of a typical design doubling every two years. The growth in circuit size has forced design projects to be divided into independent functional parts. This approach has had a profound impact on the design process and forms the basis for producing modular system-on-a-chip (SoC) and system-in-a-package (SiP) circuits. These designs can include a variety of digital, analog, mixed-signal, memory, optical, micro-electro-mechanical, and radio-frequency cores. Most often, individual cores are delivered by various vendors as licensed IP cores in the form of reusable units of logic, cells, or chip layout designs.

The popularity of SoC circuits has led to an unprecedented increase in the cost of test, which has become a serious challenge. This rise is primarily attributed to the difficulty of accessing embedded cores during test, long test development and test application times, and large volumes of test data. New materials and new fabrication processes reduce the cost of a single transistor but, at the same time, introduce new types of manufacturing defects and change the distribution of traditional failures. Increases in test data volume drive up the cost of testing by increasing both test application time and tester storage. These factors determine the cost-effective density of transistors on a chip. This is why the electronics industry expects new test solutions that keep the ratio of test cost to production cost acceptable while maintaining high test quality.

Both test application time and test data volume can be reduced by compression of test stimuli and compaction of test responses. However, the diversity of test requirements in designs with varying core counts has led to ineffective use of the available tester channels. Dynamic channel allocation, which allocates resources precisely for every single test, allows the assigned channels to be used effectively. Whereas relatively high compression ratios can be obtained for a single core, the overall advantage comes from sharing all available channel resources among the design cores. As a result, a new bandwidth management scheme has been introduced.

The primary objective of the thesis is to introduce new methods for dynamic channel allocation of the Embedded Deterministic Test (EDT) decompressor and to propose new bandwidth management schemes for system-on-a-chip testing. The thesis provides a comprehensive presentation and analysis of algorithms developed by the author and described earlier in multiple conference papers [48], [45], [52], [51], as well as journal publications [49], [50]. The presented solutions are also covered by pending patent applications [47], [46].

The thesis is organized as follows. The next chapter discusses the most important aspects of test data compression, test compaction, and system-on-a-chip testing. Chapter 2 also provides a brief review of state-of-the-art system-on-a-chip testing methods presented in the open technical literature. The motivation for this work is included at the end of that chapter. Chapter 3 presents new bandwidth-aware decompression of test data vectors in an EDT environment. In Chapter 4, a new scheme to handle test responses in a system-on-a-chip is presented. A new architecture for bandwidth management and a scheduling algorithm are presented in Chapter 5. Chapter 6 presents an original method for test time reduction in system-on-a-chip testing. Applications and practical scenarios of the introduced methods are presented in Chapter 7. Finally, Chapter 8 concludes the thesis.

Chapter 2 State of the art

2.1 Testing in scan-based designs

Long-standing research in very large scale integration (VLSI) design prototyping has led to a set of rules that, when followed, guarantee the expected test quality. The main paradigms of design for testability (DFT) methodology assume that a circuit under test (CUT) is controllable, observable, and predictable. Controllability quantifies the ability to set every local internal state of a circuit using only its externally available inputs. This is complemented by observability, which allows one to determine every state in a circuit by ensuring propagation of the signal to the circuit's outputs. Finally, predictability means that the state of a circuit in response to given input stimuli can be determined. All these properties are exploited by automatic test pattern generation (ATPG) to deliberately control signal propagation in the design structure and generate test stimuli that assure testability (controllability and observability of targeted faults).

The growing complexity of designs and the increase in sequential depth made ad hoc techniques (test point insertion, disabling internally generated clocks, removing logical redundancy and global feedback paths, isolating memory arrays and embedded structures, partitioning large circuits), which guaranteed high controllability and observability of simple circuits, insufficient. Difficulties in testing large circuits stimulated a search for techniques that make state variables directly controllable and observable. The most crucial milestone was the scan design concept. It introduces a special test mode into the circuit in which virtually all memory elements form shift registers (scan paths) during testing. A basic single scan-path design, illustrated in Figure 2.1, assumes that the circuit features three extra pins (test mode, scan-in, scan-out) and additional multiplexers. A CUT prepared in this manner can work in two modes of operation. In normal mode, all memory elements perform their regular functions, whereas in test mode, test patterns are shifted in through the scan-in terminal and test responses are subsequently shifted out through the scan-out pin. This approach became a universal rule in the VLSI design process because testing such circuits can be treated as testing multiple simple cones of combinational logic. As a result, scan-based DFT significantly simplifies test pattern generation. On the other hand, it introduces several limitations, such as area overhead, performance degradation due to the presence of multiplexers, and increased test application time. They were, however, accepted by the semiconductor industry as the only way to assure high production efficacy.

Figure 2.1: Basic scan-design architecture

One of the well-established approaches that utilizes scan-based DFT is built-in self-test (BIST), which allows a CUT to test itself. This is feasible because a typical BIST architecture, presented in Figure 2.2, contains hardware for both test pattern generation and test response compaction. A BIST controller, which communicates with a tester using only a few tester pins, is in charge of test validation. Application of test vectors and capture of test responses may proceed in parallel for multiple scan chains in every clock cycle or serially through a single scan path. In particular, a significant reduction of test application time is possible with the STUMPS architecture (self-testing using a multiple-input signature register (MISR) and a parallel shift register sequence generator), shown in Figure 2.3. It is worth noting that the quality of the applied test pattern generator and test response compactor has a considerable impact on test time and fault coverage. Hence, a pseudo-random pattern generator (PRPG) should provide pseudo-random test stimuli with an appropriate level of randomness, while a test response compactor should minimize the probability of masking an error (aliasing). Nevertheless, manufacturing test may require additional deterministic tests, delivered by automatic test equipment (ATE) or stored on chip, to achieve the desired level of fault coverage. This, in turn, requires efficient test compression to reduce the volume of stored test data.

Figure 2.2: The BIST architecture

Figure 2.3: The STUMPS architecture
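The interplay of a PRPG and a MISR described above can be illustrated with a minimal Python sketch. The 4-bit width, the feedback taps, the response function `toy_cut`, and the injected fault are all assumptions made for illustration; none of them is taken from the thesis.

```python
def lfsr_patterns(seed, taps, width, count):
    """Return `count` pseudo-random patterns from a simple Fibonacci LFSR."""
    state, patterns = seed, []
    for _ in range(count):
        patterns.append([(state >> i) & 1 for i in range(width)])
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & ((1 << width) - 1)
    return patterns


def misr_signature(responses, taps, width):
    """Compact a sequence of `width`-bit responses into one signature."""
    state = [0] * width
    for response in responses:
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        # Shift by one stage and XOR the parallel response bits in.
        state = [feedback ^ response[0]] + [
            state[i - 1] ^ response[i] for i in range(1, width)
        ]
    return state


def toy_cut(pattern):
    """Hypothetical combinational logic standing in for the CUT."""
    a, b, c, d = pattern
    return [a & b, b ^ c, c | d, d ^ a]


TAPS = [3, 2]  # assumed taps; the last stage is included so no state is lost
patterns = lfsr_patterns(seed=0b1001, taps=TAPS, width=4, count=10)
good = [toy_cut(p) for p in patterns]

# Model a defect as a single flipped response bit in one capture cycle.
faulty = [list(r) for r in good]
faulty[4][1] ^= 1

golden_signature = misr_signature(good, TAPS, 4)
faulty_signature = misr_signature(faulty, TAPS, 4)
print(golden_signature, faulty_signature)
```

In this sketch a single erroneous response bit cannot be masked, because the MISR state update is linear and invertible. Aliasing only becomes possible when several errors cancel each other in the signature.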

2.2 Test data compression

In the last decade, test data compression has become an efficient technique to reduce both test data volume and test application time [91]. Many of the proposed schemes exploit the regularities and the large number of unspecified bits in test patterns. A low fill rate of care bits is a consequence of the shallow combinational logic [95] of contemporary designs, developed to work at increasing clock frequencies. As a result, the fill rate of even highly specified test patterns does not exceed 3-5% [78], [29]. In the general scheme, compressed test stimuli, stored in the tester memory, are delivered through tester channels to an on-chip decompressor, which restores the expected data and loads them into scan chains. Four major groups of test data compression schemes can be distinguished [91] based on the test pattern encoding technique.

Combinational decompressors

The first group, combinational decompressors, is based on linear operations and mostly utilizes linear XOR networks [4], [5], [6], [58], [69]. Figure 2.4 shows a general scheme of a four-input combinational decompressor feeding eight scan chains. Because the number of input channels is lower than the number of scan chains, each input channel drives several scan chains. The lack of memory elements in such a structure means that the current input channel value is immediately injected into the scan chains. Techniques based on this principle provide satisfactory results only if test patterns contain uniformly distributed care bits in every scan slice. However, experiments show that care bits are often clustered. In addition, the internal structure of the CUT does not always make it possible to change their placement. As a result, additional input channels are required to handle highly specified scan slices, and thus the ability of this approach to significantly reduce the volume of test data becomes limited.
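The encoding limitation of a combinational decompressor can be made concrete with a small Python sketch. It assumes a hypothetical XOR network with four inputs and eight scan chains (the actual connections of Figure 2.4 are not available here) and simply tries all 16 input combinations for one scan slice, where `x` marks an unspecified bit.

```python
from itertools import product

# Hypothetical XOR network: chain k is the XOR of the listed input channels.
XOR_NETWORK = [(0, 1), (1, 2), (2, 3), (0, 3),
               (0, 2), (1, 3), (0, 1, 2), (1, 2, 3)]


def decompress(inputs):
    """Expand 4 channel bits into the 8 bits loaded into one scan slice."""
    return [sum(inputs[i] for i in taps) % 2 for taps in XOR_NETWORK]


def encode_slice(slice_bits):
    """Return channel bits reproducing every care bit, or None if impossible."""
    for inputs in product((0, 1), repeat=4):
        outputs = decompress(inputs)
        if all(c == "x" or int(c) == o for c, o in zip(slice_bits, outputs)):
            return inputs
    return None


sparse = encode_slice("1xx0xxxx")   # two care bits
dense = encode_slice("110x1xx1")    # five clustered care bits
print(sparse, dense)
```

With this particular network the sparse slice is encodable, while the dense slice is not: the first, second, and fifth chains are linearly dependent, so their care bits cannot all be satisfied. A real flow would solve linear equations over GF(2) instead of enumerating inputs.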
The simplest subclass of combinational decompressors broadcasts an identical test vector to several scan chains, as presented in [64]. Illinois scan [26], [31], based on that scheme, is presented in Figure 2.5. Test data injected into adjacent scan chains are fed from the same input channel. ATPG, however, may require corresponding scan cells to hold different, conflicting values. Thus ATPG either takes these constraints into account or, if adequate test coverage cannot be assured under them, the scan chains are concatenated and filled directly with no compression. Hence, the resultant compression depends on the ratio between the number of test patterns delivered in broadcast mode and in serial mode. Multiple techniques have been proposed to solve the problem of conflicting values in scan chains driven by the same channel. All of them require either dynamic or static reconfiguration of scan chains, in space (different scan chains connected in different sessions to the same channel), in time (different groups of scan chains), or in a mix of the two.

Figure 2.4: Combinational decompressor

Figure 2.5: Illinois scan [26]

Code-based schemes

Many of the proposed test compression techniques utilize well-known data compression algorithms developed earlier within information theory. They reduce the statistical redundancy in test data, a result of structural dependencies between faults in the circuit, to represent stimuli more concisely. The most popular are code-based schemes that replace repeating parts (symbols) in test data with code words. An on-chip decompression process restores the original data by converting the code words back into the corresponding symbols. Among such schemes we can distinguish approaches that exploit run-length [10], [73], [37], [36], [54], statistical [73], [37], [36], [53], constructive [96], Golomb [9], and dictionary [66], [81], [88], [99] codes. A similar concept is used in packet-based encoding [93] and nine-coded compression [74], [89]. All these techniques require an on-chip decoder storing the code words. As a result, the hardware synthesized from the test data characteristics may become very complex.

Static reseeding

The diverse distribution of care bits across scan slices limits the effectiveness of combinational decompressors. To make test cubes easier to encode, scan slices with a high number of care bits should be able to use some variables from adjacent slices that contain fewer care bits. To enable this, a decompressor must contain memory elements. Indeed, any type of generic sequential circuitry may be used as the decompressor. Linear feedback shift register (LFSR) coding was originally proposed in [56] (see Figure 2.6) and improved in a number of approaches [28], [57], [60], [80], [2]. The compressed test data, in the form of an initial seed for the LFSR, is loaded at the beginning of every test pattern application. In subsequent clock cycles, the LFSR moves from one state to the next while producing a pseudo-random sequence that matches the original test vector in all care bit positions. A proof presented in [56] formulates the principle: to encode S specified bits with sufficiently high probability (the exact bound is given in [56]), the LFSR needs at least S + 20 sequential elements. Hence, the maximum number of care bits specified in a test cube defines the minimal allowed size of the LFSR. Several enhancements have been proposed [94], [59], [98] to avoid this limitation, which is hard to satisfy in practice.
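Seed computation for LFSR reseeding, as described above, amounts to solving linear equations over GF(2): every care bit of the test cube yields one equation in the seed variables. The following Python sketch shows the idea on a toy 8-bit LFSR. The register length, the taps, and the test cube are assumptions chosen for illustration, and the register is deliberately much shorter than the S + 20 rule would suggest.

```python
def lfsr_output_masks(n, taps, length):
    """Symbolic LFSR run: bit i of a mask means 'seed variable i is XOR-ed in'."""
    state = [1 << i for i in range(n)]      # cell i starts as seed variable i
    masks = []
    for _ in range(length):
        masks.append(state[-1])             # the last cell feeds the scan chain
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]
    return masks


def solve_gf2(equations):
    """Gauss-Jordan elimination over GF(2); returns a seed or None."""
    pivots = {}                             # pivot bit -> (mask, value)
    for mask, value in equations:
        for bit, (pmask, pvalue) in pivots.items():
            if (mask >> bit) & 1:
                mask, value = mask ^ pmask, value ^ pvalue
        if mask == 0:
            if value:
                return None                 # contradictory care bits
            continue
        new_bit = mask.bit_length() - 1
        for bit, (pmask, pvalue) in list(pivots.items()):
            if (pmask >> new_bit) & 1:      # keep earlier rows fully reduced
                pivots[bit] = (pmask ^ mask, pvalue ^ value)
        pivots[new_bit] = (mask, value)
    # Free variables are set to 0, so each pivot variable equals its value.
    return sum(1 << bit for bit, (_, value) in pivots.items() if value)


def run_lfsr(seed, n, taps, length):
    """Concrete LFSR run used to check the computed seed."""
    state = [(seed >> i) & 1 for i in range(n)]
    out = []
    for _ in range(length):
        out.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]
    return out


N, TAPS = 8, [3, 4, 5, 7]                   # assumed toy LFSR
cube = "1x0xx1x0xx1x"                       # 'x' marks unspecified bits
masks = lfsr_output_masks(N, TAPS, len(cube))
seed = solve_gf2([(masks[i], int(c)) for i, c in enumerate(cube) if c != "x"])
print(seed)
```

Only the seed has to be stored on the tester. The unspecified positions are filled with whatever pseudo-random values the LFSR happens to produce.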
These enhanced schemes allow the use of reduced-size LFSRs, provided that only selected scan slices per test pattern are encoded.

Figure 2.6: Sequential decompressor

Dynamic reseeding

Loading all the variables required to encode the care bits of a test pattern at once can be replaced by continuously providing variables to the LFSR. This compression paradigm, called dynamic reseeding, was proposed for EDT [78]. It delivers a few variables in every shift-in cycle instead of all of them in a single load. As a result, the size of the decompressor does not depend on the total number of specified bits, and the LFSR may be smaller. This scheme achieved unprecedented test data compression and test time reduction. Moreover, it was coupled with the standard scan and ATPG methodology, guaranteeing a very simple flow and wide applicability. A general scheme of using an EDT decompressor is presented in Figure 2.7. It consists of an r-bit ring generator [70] and an associated phase shifter driving s scan chains. Compressed test cubes are delivered to the decompressor through c external channels in a continuous manner, i.e., a new c-bit vector is injected into the ring generator every scan shift cycle, effectively moving this linear finite state machine from one state to another.

Figure 2.7: EDT architecture [78]

For example, consider the decompressor and the corresponding phase shifter shown in Figure 2.8. The decompressor consists of a 16-bit ring generator implementing the primitive polynomial x^16 + x^10 + x^7 + x… The 10-output phase shifter comprises 3-input XOR gates connected to the outputs of memory elements as follows: 1 (9, 6, 8), 2 (13, 1, 4), 3 (3, 15, 10), 4 (14, 7, 0), 5 (2, 12, 5), 6 (9, 1, 12), 7 (14, 7, 6), 8 (15, 0, 8), 9 (11, 13, 10), 10 (11, 2, 3). The input variables are provided in pairs through four external input channels connected to the following inputs of ring generator stages: (9, 7), (11, 4), (14, 2), and (15, 1).

Figure 2.8: Example of 10-output 16-bit EDT decompressor

Further enhancements of this scheme have been proposed. For example, low-power decompression schemes [13], [16], [17], [18] considerably reduce switching activity. X-masking schemes introduced in [79], [14] offer test compression exceeding the encoding limit of one variable per care bit and eliminate all unknown states from test responses.

2.3 System-on-a-chip testing

With on-chip test compression becoming the production test standard, its application in SoC designs requires additional infrastructure to transport test data between the SoC pins and the embedded cores. The industry is currently witnessing a major change in how data is transferred between different parts of an electronic system. With data rates exceeding 1 Gb/s, parallel I/O schemes are being replaced by high-speed serial links. This is driven by the need to meet new bandwidth requirements and simplify designs. However, as growth in high-speed I/O implies fewer digital pins, it becomes imperative to run SoC tests in a reduced pin count test environment. Moreover, cost-effective SoC test requires scheduling. Unfortunately, even the simplest existing test scheduling algorithms are time consuming. Many existing solutions are milestones in SoC testing. To show their variety, previous work addressing test data transportation, test data scheduling, and optimization techniques directly related to this thesis is presented here.

Test architectures

Application of test stimuli in SoC designs requires specialized on-chip test architectures. This section presents related work on test architecture design, including wrapper and TAM design.

Test wrappers

A test wrapper forms an interface between a core and the SoC environment. This specialized instrumentation is responsible for core isolation and for test access in special test modes. The design process has two complementary parts: wrapper architecture selection and wrapper optimization. Several wrapper architectures have been proposed. TestShell [67] and Test Collar [92] form the basis of the standardized wrapper architecture IEEE 1500 [19], [90]. The conceptual scheme of the TestShell wrapper is presented in Figure 2.9. The TestShell wrapper architecture uses a dedicated multiplexer for each functional input and output. The input multiplexer controls the application of test stimuli and functional data, whereas the output multiplexer is responsible for interconnect test and observation of produced responses. TestShell provides four control modes: functional mode, in which the core operates normally; test mode, in which the core is under test; interconnect test mode, which allows testing of the glue logic between cores; and bypass mode, which assures transparent data transfer through the TestShell wrapper. Wrapper design optimization aims at clustering scannable elements to minimize the length of the longest wrapper chain and thereby the test application time [22], [42], [100].
Several approaches that meet this demand have been presented in [68], [41]. The wrapper design algorithm proposed in [41] sorts scan chains in descending order of length. Next, it assigns each scan chain to the wrapper chain whose resulting length is closest to, but does not exceed, the length of the current longest wrapper chain. If no such wrapper chain exists, the scan chain is assigned to the shortest wrapper chain. Finally, wrapper cells are assigned to the created wrapper chains.
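The steps above can be sketched in Python as follows. This is a reading of the textual description only, not the published algorithm of [41]: the function name, the tie-breaking rules, and the way wrapper cells are distributed (one cell at a time to the currently shortest chain) are assumptions.

```python
def design_wrapper(scan_chain_lengths, num_wrapper_chains, num_wrapper_cells=0):
    """Partition scan chains into wrapper chains, keeping the longest one short."""
    chains = [[] for _ in range(num_wrapper_chains)]
    lengths = [0] * num_wrapper_chains

    # Step 1: process scan chains from the longest to the shortest.
    for scan_len in sorted(scan_chain_lengths, reverse=True):
        longest = max(lengths)
        best = None
        # Step 2: best fit, i.e. the resulting length closest to but not
        # exceeding the current longest wrapper chain.
        for i, current in enumerate(lengths):
            candidate = current + scan_len
            if candidate <= longest and (best is None or current > lengths[best]):
                best = i
        # Step 3: if nothing fits, fall back to the shortest wrapper chain.
        if best is None:
            best = lengths.index(min(lengths))
        chains[best].append(scan_len)
        lengths[best] += scan_len

    # Step 4 (assumed policy): add wrapper cells to the shortest chain.
    for _ in range(num_wrapper_cells):
        lengths[lengths.index(min(lengths))] += 1

    return chains, lengths


chains, lengths = design_wrapper([8, 6, 5, 4, 3, 2], num_wrapper_chains=3)
print(chains, lengths)
```

For the six hypothetical scan chains above, the longest wrapper chain ends up with 10 scan cells, which matches the lower bound of ceil(28 / 3) for this instance.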

Figure 2.9: TestShell [67]

Test access mechanism design

A test access mechanism (TAM) is a bidirectional communication architecture used to transport test stimuli from the SoC pins to the embedded cores and test responses from the embedded cores to the SoC pins. Several TAM architectures have been proposed [1], [27], [38], [40], [67], [92]. Based on how an embedded core is accessed, TAM architectures can be divided into functional and dedicated groups.

A functional TAM takes advantage of the native SoC bus infrastructure, which may be used during test. It reuses the operating connections in the SoC as a TAM to reduce the number of additional wires required for the test communication infrastructure. An extended AMBA specification [3], which contains an interface controller and a test harness mimicking a test wrapper, allows test data transfer in addition to normal core communication. The Reuse of Addressable System Bus for SOC Testings (RASBuS) [35] scheme exploits on-chip microprocessors to test the cores and the functional bus infrastructure for test transportation. Application of these schemes is justified as long as they offer sufficient resources for test while the hardware overhead of adjusting the SoC infrastructure is negligible.

The second group of TAMs consists of dedicated schemes, which are more popular because they are easier to integrate within the SoC framework and provide the test technique in use with customized resources. The direct access test scheme [38] (see Figure 2.10a) replaces both the wrapper and TAM infrastructures because it creates a network of direct links between SoC pins and cores. This simple technique guarantees the shortest possible test application time.
However, this approach is not applicable in practice because cutting-edge SoC designs require considerably more test channels than can be allocated. Moreover, the total number of additional wires corresponds to the total number of SoC pins and results in a large wiring overhead.

Figure 2.10: Test access mechanism architectures: (a) direct access, (b) multiplexing, (c) Daisychain, (d) distributed, (e) Test bus, (f) flexible-width, (g) TestRail

Three elementary architectures that solve the problem of a limited number of I/O pins were proposed in [1]. The multiplexing architecture presented in Figure 2.10b consists of a single TAM connecting all cores. It gives access to one core at a time, so all cores are tested sequentially. As a result, the total test time is the sum of the test times of all individual cores. The Daisychain architecture extends the functionality of the multiplexing architecture by allowing multiple cores to be tested at the same time. As presented in Figure 2.10c, all core wrapper chains are connected through a bypass structure into long chains. In the distributed architecture (Figure 2.10d), each core has its own dedicated TAM, and all cores are tested in parallel. The sum of the individual TAM widths is the full TAM width of the system. The overall test application time of the system is determined by the core with the longest test application time. These three dedicated TAM architectures solve the TAM problem; however, they do not provide the ability to reduce test application time through more flexible test scheduling.
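The difference between the sum rule of the multiplexing architecture and the maximum rule of the distributed architecture can be illustrated with a short Python sketch. The per-core test time model is a deliberate simplification and an assumption of this example: a core with a given number of equal-length scan chains cannot use more TAM wires than it has chains, and each pattern costs one full shift of the resulting wrapper chains. The three cores and their parameters are hypothetical.

```python
from math import ceil


def core_test_time(patterns, chains, chain_length, width):
    """Assumed model: shift cycles needed to apply all patterns of one core."""
    usable = min(width, chains)          # extra TAM wires stay unused
    return patterns * ceil(chains / usable) * chain_length


def multiplexing_time(cores, total_width):
    """One core at a time, each with the full TAM width: times add up."""
    return sum(core_test_time(*core, total_width) for core in cores.values())


def distributed_time(cores, widths):
    """All cores in parallel on private TAMs: the slowest core dominates."""
    return max(core_test_time(*core, widths[name]) for name, core in cores.items())


# name: (patterns, scan chains, scan chain length), all hypothetical
CORES = {"A": (100, 4, 100), "B": (50, 2, 100), "C": (20, 2, 400)}
TOTAL_WIDTH = 8
WIDTHS = {"A": 4, "B": 2, "C": 2}        # partition of the 8 TAM wires

t_mux = multiplexing_time(CORES, TOTAL_WIDTH)
t_dist = distributed_time(CORES, WIDTHS)
print(t_mux, t_dist)
```

Under these assumptions the distributed architecture finishes earlier because no core can exploit all eight wires on its own. With a different width partition or different core parameters the comparison can change.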

TAM architectures that support new features in test scheduling have been proposed in [92], [67], and [40]. The Test bus proposed in [92] and illustrated in Figure 2.10e can be seen as a combination of the multiplexing and distributed architectures. The TestRail architecture presented in [67] (see Figure 2.10g) is, in turn, a combination of the Daisychain and distributed architectures. The Test bus and TestRail architectures support more flexible scheduling alternatives since the total number of TAM wires can be partitioned into several TAM channel subsets. SoC testing with such architectures is, however, limited because the cores assigned to a TAM are connected to all wires of that TAM, so a single tested core consumes the entire bandwidth of this TAM partition. A flexible-width architecture, proposed in [40], allows cores to be connected to the TAM wires in a flexible way, as illustrated in Figure 2.10f. In consequence, each TAM wire is treated as a separate unit, which increases the flexibility of the test scheduler. The customized architecture, however, potentially leads to an irregular organization of the test data in the tester memory, and thus additional test control may be required.

Test scheduling

Testing of SoC designs must consider many key factors. The most important are shared test resources with limited access, complex control of wrapper and TAM architectures, bandwidth demands exceeding available resources, and power dissipation. All these constraints and optimization goals necessitate systematic methods to assure cost-effective usage of the available ATE bandwidth. Test scheduling minimizes a defined fitness function subject to all imposed limitations. Channel resources, test application time, and the amount of heat dissipated during the test procedure constitute the main targets of optimization.
As a result, it delivers a test agenda for each core, determining the time slots associated with the assigned test resources and satisfying the given constraints. A number of algorithms have been proposed in the literature [25]. Test scheduling in SoC testing exploits such well-known techniques as bin packing (BP) [32], [33], integer linear programming (ILP) [11], [72], mixed integer linear programming (MILP) [7], simulated annealing (SA) [103], tabu search [87], etc. The conditions that decide when resources may be freed by one task and allocated to another are the main criterion for classifying scheduling algorithms. On that basis, scheduling techniques can be divided into three main categories: non-partitioned, partitioned, and preemptive testing. Examples illustrating these approaches are presented in the figure below.

Figure 2.11: Test scheduling techniques: (a) non-partitioned, (b) partitioned, (c) preemptive. Each panel plots ATE channels against test time.

Assume that a test scheduling algorithm optimizes test application time while the available channels are limited by the maximum number of TAM wires. Figure 2.11a shows a non-partitioned (session-based) technique, which divides the test application time (TAT) into test sessions. It assumes that each test is assigned to a single session and, as a result, a new test cannot start until all tests in the previous session have completed. Such a scheduling technique was applied in [102], [12]. However, the diversity of test time requirements leads to idle times (black holes) during which no core is tested, and results in a long test application time. The idle test resources can be utilized by a partitioned technique, which allows a test to be scheduled as soon as test resources become available. Figure 2.11b shows that this more flexible approach improves test application time. It is worth noting, however, that such a system requires a more sophisticated test controller that can initiate the test of a core at an arbitrary time. Such a test scheduling technique was presented in [8] and [71]. Further test schedule optimization, presented in [39], [61], [63], was achieved by using preemptive test scheduling, illustrated in Figure 2.11c. This technique allows a single core test to be divided into partial tests that are delivered independently. For example, test T2 is preempted and allocated to two separate time-resource slots (T2A and T2B). However, preemptive test scheduling is not applicable to all types of tests, and thus not all cores can be tested in such a way.
In particular, cores that require continuous handling must be tested in a single time slot. Moreover, frequent test switching requires extra control data and a more advanced test controller. Hence, a scheduling algorithm has to trade off the benefits of test preemption against the control data overhead.
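The gap between session-based and partitioned scheduling can be illustrated with a small sketch. It models each test as a (TAM width, duration) rectangle and uses two deliberately simple heuristics chosen here for illustration only: first-fit grouping into sessions, and in-order list scheduling that starts a test as soon as enough wires are free. Neither is claimed to be the algorithm of any cited work, and the test set is hypothetical.

```python
import heapq


def session_based_time(tests, total_width):
    """First-fit packing into sessions; a session lasts as long as its longest test."""
    sessions = []  # each entry: [used_width, longest_duration]
    for width, duration in tests:
        if width > total_width:
            raise ValueError("test is wider than the TAM")
        for session in sessions:
            if session[0] + width <= total_width:
                session[0] += width
                session[1] = max(session[1], duration)
                break
        else:
            sessions.append([width, duration])
    return sum(longest for _, longest in sessions)


def partitioned_time(tests, total_width):
    """Start each test, in the given order, as soon as enough wires are free."""
    running = []  # min-heap of (finish_time, width) for tests in progress
    free = total_width
    now = 0
    makespan = 0
    for width, duration in tests:
        if width > total_width:
            raise ValueError("test is wider than the TAM")
        while free < width:
            finish, released = heapq.heappop(running)
            now = max(now, finish)  # wait for the earliest running test to end
            free += released
        heapq.heappush(running, (now + duration, width))
        free -= width
        makespan = max(makespan, now + duration)
    return makespan


# Hypothetical tests as (TAM wires, duration in cycles) on an 8-wire TAM.
tests = [(4, 100), (4, 40), (4, 70), (4, 30)]
print("session-based:", session_based_time(tests, 8))
print("partitioned:  ", partitioned_time(tests, 8))
```

In this example the session-based schedule leaves wires idle while the short test in the first session waits for the long one, whereas the partitioned schedule reuses those wires immediately. A preemptive scheduler could go further by splitting a test across the remaining gaps, at the price of extra control data.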

Test sharing and broadcasting

Test patterns produced by ATPG contain a high number of unspecified bits. Several test sharing schemes that take advantage of this property to increase the density of specified bits and reduce test data volume have been investigated in the literature [65], [55], [86]. The resulting patterns are then shared and broadcast to various cores. A scheme proposed in [65] treats all cores in a design as a single virtual core. This allows one to generate common test patterns for all cores. However, such an approach requires access to the gate-level core netlists, which may not be available to engineers integrating test stimuli for an entire SoC design. An alternative scheme presented in [55] generates common tests for multiple cores based on tests delivered by the core providers. An enhanced logic simulator is used to evaluate the fault coverage for the design, while a dedicated test pattern generator complements the test data set to ensure the same fault coverage. These processes are, however, typically too time-consuming to be considered in practical scenarios. In both schemes, the test stimuli are broadcast to all cores in the system, testing them in parallel, and the responses produced by each core are compacted using MISRs. In the method proposed in [86], test pattern overlapping is enhanced for cores with different scan chain lengths. As a result, cores with shorter scan chains must wait for those with the longest scan chains. The proposed methods reduce test cost factors. However, they increase dependencies between test patterns and significantly limit flexibility during test scheduling and core assignment. Another approach to reducing the test resources required in SoC testing shares output channels.
The scheme presented in [85] introduces a specialized TAM for chips with multiple isolated identical cores, through which all the cores can be tested in parallel while requiring a similar number of ATE channels as a single core. The proposed pipelined architecture forms nonlinear equations on selected output pins that compress the outputs from the identical cores. Next, the proposed off-chip deductive solver allows one to reproduce the failure information for each core with no loss of diagnostic resolution. The solution presented in [21] provides a modular pipelined TAM that flexibly manages trade-offs between test time and diagnosis in every production phase to gain test throughput.

Test architectures and scheduling co-optimization

Co-optimizing the test-architecture design and test scheduling allows one to obtain a synergistic effect in the reduction of test time and TAM resources. The main reason lies in the trade-off (dependency) between the number of wrapper chains, test application time, and volume of test data. Results presented in [41] and illustrated in Figure 2.12 show that the test application time for a single core decreases gradually as a staircase (every stair begins with a Pareto-optimal point) as the number of wrapper chains increases. Independently selecting the wrapper chain configurations for all the cores in an SoC so that they ensure the best test schedule is very difficult. Hence, a test scheduler may deliver guidance about an expected test architecture and in this way improve a test schedule with respect to a defined cost function.

Figure 2.12: Relationship between TAT and TAM width [41]

Several techniques addressing both test-architecture design and test scheduling have been proposed [23], [43], [62], [83], [100], [34]. The TR-Architect scheme proposed in [23] combines test-architecture design with a test scheduling algorithm that minimizes TAT and indicates the lower bound on the TAT for a given TAM width. A technique proposed in [43] utilizes a flexible-width Test bus that can merge and fork channel resources between cores. As a result, different cores connected to the same TAM can utilize a different number of TAM wires at test application time. The preemptive test scheduling and TAM design are tightly integrated to minimize the test application time while considering test resource conflicts, precedence, and power consumption constraints. In addition, the relation between TAM width and test-data volume is explored. A scheme presented in [62] consists of an integrated SoC test framework responsible for test selection, TAM design, and floor planning, as well as a test scheduler that minimizes TAT and TAM width under given test resource constraints. A method presented in [83] introduces a scheme based on virtual TAMs that matches high-speed ATE channels to slower scan chains. The volume of test data remains the same. However, the effective number of virtual channels increases, which permits a reduction of the test application time.

The technique presented in [100] enhances this concept by introducing multi-frequency virtual TAMs, thus increasing the flexibility of channel packing. It was applied to both Test bus and TestRail TAM architectures under TAM width and power constraints.

Test architectures with test data compression

The next group of techniques integrates test-data compression with test-scheduling-based solutions. These techniques exploit the benefits of test data compression presented in Section 2.2 to further reduce the most vital test cost factors (test application time and/or TAM width requirements). Several techniques co-optimizing test architecture design, test scheduling, and test-data compression have been proposed [24], [44], [97], [101]. The solution presented in [24] proposes a TAM architecture design approach driven by test-data compression. XOR logic is used to perform compression, which, in turn, is employed together with test time estimates to devise TAM design heuristics. The work in [44] investigates frequency-directed run-length codes [10] in an SoC environment by placing one decoder per TAM wire or one decoder per core and managing such an architecture with a bin-packing scheduling algorithm. The work in [97] explores an approach based on static reseeding. A single on-chip LFSR with an associated phase shifter concurrently provides test stimuli to multiple cores. The investigated scheduling algorithm reduces the test application time by maximizing the number of care bits encoded per clock cycle in the compiled seeds. The approaches proposed in [24], [97] use a single shared decoder designed at the SoC level. Hence, they require a large number of TAM wires to achieve an acceptable test application time for the system. A technique presented in [101] enhances concurrent core testing by sharing tests broadcast over a common one-wire TAM.
An on-chip scan chain disable signal generator or an on-chip decoder is used to restore the core test stimuli from the shared test. All cores connected to the TAM wire share the same test data. Thus, test stimuli have to be serialized and, as a result, the test application time may inflate considerably for large SoCs.

Figure 2.13: Test-architecture with test-data compression [97]

2.4 Motivation

As shown in this chapter, the growth in popularity and complexity of SoC circuits can result in an increase of the most important test cost factors, such as ATE channel resources, TAT, and hardware overhead. Several co-optimized test architecture designs and test scheduling techniques have been proposed. For test architecture design and test scheduling with test data compression, several of the approaches described require a large number of TAM wires to achieve an acceptable test application time for the system. As most approaches use only one common decompressor designed at the SoC level, the ability to trade off the achieved test-data compression against the test application time at the core level is significantly limited. Such an approach requires recomputing all test patterns whenever the SoC configuration changes. Moreover, the prior work does not provide an in-depth investigation of the trade-offs between compression at the core level and the TAT in the overall SoC test scheme. Clearly, an SoC test framework has to coherently target the compression ratio as well as dynamic bandwidth management, using a flexible TAM architecture that allows trade-offs between TAT and the available channel resources. Another important issue is the area overhead introduced by the on-chip hardware necessary to decompress test data delivered by a tester. Finally, the architecture of a TAM should be scalable and circuit independent. Bandwidth management schemes that meet the above conditions are proposed in the thesis. They are designed to work in different environments and offer different reductions of test cost factors.

2.5 Thesis overview

The remaining part of the thesis consists of seven chapters. They introduce new solutions for SoC testing. The thesis structure, which follows a bottom-up approach, is illustrated in the accompanying figure. Following the introductory chapters, the remaining six chapters fall into three parts. Chapters 3 and 4 propose core-level bandwidth-aware techniques for test compression and test compaction. Chapters 5 to 7 introduce a test compression bandwidth management platform for SoC testing. They also present experimental results for several large SoC designs. These results show significant improvements in test compression and test application time.
Finally, Chapter 8 contains conclusions. A concise abstract of the following chapters is presented below.

Testing of Digital System-on- Chip (SoC)

Testing of Digital System-on- Chip (SoC) Testing of Digital System-on- Chip (SoC) 1 Outline of the Talk Introduction to system-on-chip (SoC) design Approaches to SoC design SoC test requirements and challenges Core test wrapper P1500 core test

More information

What is a System on a Chip?

What is a System on a Chip? What is a System on a Chip? Integration of a complete system, that until recently consisted of multiple ICs, onto a single IC. CPU PCI DSP SRAM ROM MPEG SoC DRAM System Chips Why? Characteristics: Complex

More information

Design Verification & Testing Design for Testability and Scan

Design Verification & Testing Design for Testability and Scan Overview esign for testability (FT) makes it possible to: Assure the detection of all faults in a circuit Reduce the cost and time associated with test development Reduce the execution time of performing

More information

Testing Low Power Designs with Power-Aware Test Manage Manufacturing Test Power Issues with DFTMAX and TetraMAX

Testing Low Power Designs with Power-Aware Test Manage Manufacturing Test Power Issues with DFTMAX and TetraMAX White Paper Testing Low Power Designs with Power-Aware Test Manage Manufacturing Test Power Issues with DFTMAX and TetraMAX April 2010 Cy Hay Product Manager, Synopsys Introduction The most important trend

More information

TABLE OF CONTENTS. xiii List of Tables. xviii List of Design-for-Test Rules. xix Preface to the First Edition. xxi Preface to the Second Edition

TABLE OF CONTENTS. xiii List of Tables. xviii List of Design-for-Test Rules. xix Preface to the First Edition. xxi Preface to the Second Edition TABLE OF CONTENTS List of Figures xiii List of Tables xviii List of Design-for-Test Rules xix Preface to the First Edition xxi Preface to the Second Edition xxiii Acknowledgement xxv 1 Boundary-Scan Basics

More information

Introduction to Digital System Design

Introduction to Digital System Design Introduction to Digital System Design Chapter 1 1 Outline 1. Why Digital? 2. Device Technologies 3. System Representation 4. Abstraction 5. Development Tasks 6. Development Flow Chapter 1 2 1. Why Digital

More information

FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING

FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING Hussain Al-Asaad and Alireza Sarvi Department of Electrical & Computer Engineering University of California Davis, CA, U.S.A.

More information

AN EFFICIENT ALGORITHM FOR WRAPPER AND TAM CO-OPTIMIZATION TO REDUCE TEST APPLICATION TIME IN CORE BASED SOC

AN EFFICIENT ALGORITHM FOR WRAPPER AND TAM CO-OPTIMIZATION TO REDUCE TEST APPLICATION TIME IN CORE BASED SOC International Journal of Electronics and Communication Engineering & Technology (IJECET) Volume 7, Issue 2, March-April 2016, pp. 09-17, Article ID: IJECET_07_02_002 Available online at http://www.iaeme.com/ijecet/issues.asp?jtype=ijecet&vtype=7&itype=2

More information

Implementation Details

Implementation Details LEON3-FT Processor System Scan-I/F FT FT Add-on Add-on 2 2 kbyte kbyte I- I- Cache Cache Scan Scan Test Test UART UART 0 0 UART UART 1 1 Serial 0 Serial 1 EJTAG LEON_3FT LEON_3FT Core Core 8 Reg. Windows

More information

VLSI Design Verification and Testing

VLSI Design Verification and Testing VLSI Design Verification and Testing Instructor Chintan Patel (Contact using email: cpatel2@cs.umbc.edu). Text Michael L. Bushnell and Vishwani D. Agrawal, Essentials of Electronic Testing, for Digital,

More information

DEVELOPING TRENDS OF SYSTEM ON A CHIP AND EMBEDDED SYSTEM

DEVELOPING TRENDS OF SYSTEM ON A CHIP AND EMBEDDED SYSTEM DEVELOPING TRENDS OF SYSTEM ON A CHIP AND EMBEDDED SYSTEM * Monire Norouzi Young Researchers and Elite Club, Shabestar Branch, Islamic Azad University, Shabestar, Iran *Author for Correspondence ABSTRACT

More information

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere! Interconnection Networks Interconnection Networks Interconnection networks are used everywhere! Supercomputers connecting the processors Routers connecting the ports can consider a router as a parallel

More information

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Ms Lavanya Thunuguntla 1, Saritha Sapa 2 1 Associate Professor, Department of ECE, HITAM, Telangana

More information

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines Reconfigurable Architecture Requirements for Co-Designed Virtual Machines Kenneth B. Kent University of New Brunswick Faculty of Computer Science Fredericton, New Brunswick, Canada ken@unb.ca Micaela Serra

More information

Security in the Age of Nanocomputing. Hacking Devices

Security in the Age of Nanocomputing. Hacking Devices Security in the Age of Nanocomputing Matthew Tan Creti Hacking Devices The ESA estimates its total worldwide losses due to piracy at $3 billion annually [2] One million unlocked iphones could cost Apple

More information

Test Resource Partitioning and Reduced Pin-Count Testing Based on Test Data Compression

Test Resource Partitioning and Reduced Pin-Count Testing Based on Test Data Compression Test esource Partitioning and educed Pin-Count Testing Based on Test Data Compression nshuman Chandra and Krishnendu Chakrabarty Department of Electrical and Computer Engineering Duke University, Durham,

More information

Introduction to VLSI Testing

Introduction to VLSI Testing Introduction to VLSI Testing 李 昆 忠 Kuen-Jong Lee Dept. of Electrical Engineering National Cheng-Kung University Tainan, Taiwan, R.O.C. Introduction to VLSI Testing.1 Problems to Think A 32 bit adder A

More information

An Effective Deterministic BIST Scheme for Shifter/Accumulator Pairs in Datapaths

An Effective Deterministic BIST Scheme for Shifter/Accumulator Pairs in Datapaths An Effective Deterministic BIST Scheme for Shifter/Accumulator Pairs in Datapaths N. KRANITIS M. PSARAKIS D. GIZOPOULOS 2 A. PASCHALIS 3 Y. ZORIAN 4 Institute of Informatics & Telecommunications, NCSR

More information

G. Squillero, M. Rebaudengo. Test Techniques for Systems-on-a-Chip

G. Squillero, M. Rebaudengo. Test Techniques for Systems-on-a-Chip G. Squillero, M. Rebaudengo Test Techniques for Systems-on-a-Chip December 2005 Preface Fast innovation in VLSI technologies makes possible the integration a complete system into a single chip (System-on-Chip,

More information

Arbitrary Density Pattern (ADP) Based Reduction of Testing Time in Scan-BIST VLSI Circuits

Arbitrary Density Pattern (ADP) Based Reduction of Testing Time in Scan-BIST VLSI Circuits Arbitrary Density Pattern (ADP) Based Reduction of Testing Time in Scan-BIST VLSI Circuits G. Naveen Balaji S. Vinoth Vijay Abstract Test power reduction done by Arbitrary Density Patterns (ADP) in which

More information

System-on-Chip Test Scheduling and Test Infrastructure Design

System-on-Chip Test Scheduling and Test Infrastructure Design Linköping Studies in Science and Technology Thesis No. 1206 System-on-Chip Test Scheduling and Test Infrastructure Design by Anders Larsson Submitted to Linköping Institute of Technology at Linköping University

More information

Architectures and Platforms

Architectures and Platforms Hardware/Software Codesign Arch&Platf. - 1 Architectures and Platforms 1. Architecture Selection: The Basic Trade-Offs 2. General Purpose vs. Application-Specific Processors 3. Processor Specialisation

More information

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology Topics of Chapter 5 Sequential Machines Memory elements Memory elements. Basics of sequential machines. Clocking issues. Two-phase clocking. Testing of combinational (Chapter 4) and sequential (Chapter

More information

Test Time Minimization for Hybrid BIST of Systems-on-Chip

Test Time Minimization for Hybrid BIST of Systems-on-Chip TALLINN TECHNICAL UNIVERSITY Faculty of Information Technology Department of Computer Engineering Chair of Computer Engineering and Diagnostics Bachelor Thesis IAF34LT Test Time Minimization for Hybrid

More information

Programmable Logic IP Cores in SoC Design: Opportunities and Challenges

Programmable Logic IP Cores in SoC Design: Opportunities and Challenges Programmable Logic IP Cores in SoC Design: Opportunities and Challenges Steven J.E. Wilton and Resve Saleh Department of Electrical and Computer Engineering University of British Columbia Vancouver, B.C.,

More information

Non-Contact Test Access for Surface Mount Technology IEEE 1149.1-1990

Non-Contact Test Access for Surface Mount Technology IEEE 1149.1-1990 Non-Contact Test Access for Surface Mount Technology IEEE 1149.1-1990 ABSTRACT Mechanical and chemical process challenges initially limited acceptance of surface mount technology (SMT). As those challenges

More information

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com Best Practises for LabVIEW FPGA Design Flow 1 Agenda Overall Application Design Flow Host, Real-Time and FPGA LabVIEW FPGA Architecture Development FPGA Design Flow Common FPGA Architectures Testing and

More information

PowerPC Microprocessor Clock Modes

PowerPC Microprocessor Clock Modes nc. Freescale Semiconductor AN1269 (Freescale Order Number) 1/96 Application Note PowerPC Microprocessor Clock Modes The PowerPC microprocessors offer customers numerous clocking options. An internal phase-lock

More information

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems Harris Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems David Harris Harvey Mudd College David_Harris@hmc.edu Based on EE271 developed by Mark Horowitz, Stanford University MAH

More information

Memory Systems. Static Random Access Memory (SRAM) Cell

Memory Systems. Static Random Access Memory (SRAM) Cell Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled

More information

Design Verification and Test of Digital VLSI Circuits NPTEL Video Course. Module-VII Lecture-I Introduction to Digital VLSI Testing

Design Verification and Test of Digital VLSI Circuits NPTEL Video Course. Module-VII Lecture-I Introduction to Digital VLSI Testing Design Verification and Test of Digital VLSI Circuits NPTEL Video Course Module-VII Lecture-I Introduction to Digital VLSI Testing VLSI Design, Verification and Test Flow Customer's Requirements Specifications

More information

Lecture 7: Clocking of VLSI Systems

Lecture 7: Clocking of VLSI Systems Lecture 7: Clocking of VLSI Systems MAH, AEN EE271 Lecture 7 1 Overview Reading Wolf 5.3 Two-Phase Clocking (good description) W&E 5.5.1, 5.5.2, 5.5.3, 5.5.4, 5.5.9, 5.5.10 - Clocking Note: The analysis

More information

Encounter DFT Architect

Encounter DFT Architect Full-chip, synthesis-based, power-aware test architecture development Cadence Encounter DFT Architect addresses and optimizes multiple design and manufacturing objectives such as timing, area, wiring,

More information

A STUDY OF INSTRUMENT REUSE AND RETARGETING IN P1687

A STUDY OF INSTRUMENT REUSE AND RETARGETING IN P1687 A STUDY OF INSTRUMENT REUSE AND RETARGETING IN P1687 Farrokh Ghani Zadegan, Urban Ingelsson, Erik Larsson Linköping University Gunnar Carlsson Ericsson ABSTRACT Modern chips may contain a large number

More information

Power Reduction Techniques in the SoC Clock Network. Clock Power

Power Reduction Techniques in the SoC Clock Network. Clock Power Power Reduction Techniques in the SoC Network Low Power Design for SoCs ASIC Tutorial SoC.1 Power Why clock power is important/large» Generally the signal with the highest frequency» Typically drives a

More information

A New Multi-site Test for System-on-Chip Using Multi-site Star Test Architecture

A New Multi-site Test for System-on-Chip Using Multi-site Star Test Architecture A New Multi-site Test for System-on-Chip Using Multi-site Star Test Architecture Dongkwan Han, Yong Lee, and Sungho Kang As the system-on-chip (SoC) design becomes more complex, the test costs are increasing.

More information

Switched Interconnect for System-on-a-Chip Designs

Switched Interconnect for System-on-a-Chip Designs witched Interconnect for ystem-on-a-chip Designs Abstract Daniel iklund and Dake Liu Dept. of Physics and Measurement Technology Linköping University -581 83 Linköping {danwi,dake}@ifm.liu.se ith the increased

More information

An Open Architecture through Nanocomputing

An Open Architecture through Nanocomputing 2009 International Symposium on Computing, Communication, and Control (ISCCC 2009) Proc.of CSIT vol.1 (2011) (2011) IACSIT Press, Singapore An Open Architecture through Nanocomputing Joby Joseph1and A.

More information

CHAPTER 3 Boolean Algebra and Digital Logic

CHAPTER 3 Boolean Algebra and Digital Logic CHAPTER 3 Boolean Algebra and Digital Logic 3.1 Introduction 121 3.2 Boolean Algebra 122 3.2.1 Boolean Expressions 123 3.2.2 Boolean Identities 124 3.2.3 Simplification of Boolean Expressions 126 3.2.4

More information

INTRODUCTION TO DIGITAL SYSTEMS. IMPLEMENTATION: MODULES (ICs) AND NETWORKS IMPLEMENTATION OF ALGORITHMS IN HARDWARE

INTRODUCTION TO DIGITAL SYSTEMS. IMPLEMENTATION: MODULES (ICs) AND NETWORKS IMPLEMENTATION OF ALGORITHMS IN HARDWARE INTRODUCTION TO DIGITAL SYSTEMS 1 DESCRIPTION AND DESIGN OF DIGITAL SYSTEMS FORMAL BASIS: SWITCHING ALGEBRA IMPLEMENTATION: MODULES (ICs) AND NETWORKS IMPLEMENTATION OF ALGORITHMS IN HARDWARE COURSE EMPHASIS:

More information

International Journal of Electronics and Computer Science Engineering 1482

International Journal of Electronics and Computer Science Engineering 1482 International Journal of Electronics and Computer Science Engineering 1482 Available Online at www.ijecse.org ISSN- 2277-1956 Behavioral Analysis of Different ALU Architectures G.V.V.S.R.Krishna Assistant

More information

The Boundary Scan Test (BST) technology

The Boundary Scan Test (BST) technology The Boundary Scan Test () technology J. M. Martins Ferreira FEUP / DEEC - Rua Dr. Roberto Frias 42-537 Porto - PORTUGAL Tel. 35 225 8 748 / Fax: 35 225 8 443 (jmf@fe.up.pt / http://www.fe.up.pt/~jmf) Objectives

More information

Aims and Objectives. E 3.05 Digital System Design. Course Syllabus. Course Syllabus (1) Programmable Logic

Aims and Objectives. E 3.05 Digital System Design. Course Syllabus. Course Syllabus (1) Programmable Logic Aims and Objectives E 3.05 Digital System Design Peter Cheung Department of Electrical & Electronic Engineering Imperial College London URL: www.ee.ic.ac.uk/pcheung/ E-mail: p.cheung@ic.ac.uk How to go

More information

Systolic Computing. Fundamentals

Systolic Computing. Fundamentals Systolic Computing Fundamentals Motivations for Systolic Processing PARALLEL ALGORITHMS WHICH MODEL OF COMPUTATION IS THE BETTER TO USE? HOW MUCH TIME WE EXPECT TO SAVE USING A PARALLEL ALGORITHM? HOW

More information

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad A. M. Niknejad University of California, Berkeley EE 100 / 42 Lecture 24 p. 1/20 EE 42/100 Lecture 24: Latches and Flip Flops ELECTRONICS Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad University of California,

More information

SAN Conceptual and Design Basics

SAN Conceptual and Design Basics TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer

More information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information Introduction Computer Network. Interconnected collection of autonomous computers that are able to exchange information No master/slave relationship between the computers in the network Data Communications.

More information

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE CHAPTER 5 71 FINITE STATE MACHINE FOR LOOKUP ENGINE 5.1 INTRODUCTION Finite State Machines (FSMs) are important components of digital systems. Therefore, techniques for area efficiency and fast implementation

More information

The Evolution of ICT: PCB Technologies, Test Philosophies, and Manufacturing Business Models Are Driving In-Circuit Test Evolution and Innovations

The Evolution of ICT: PCB Technologies, Test Philosophies, and Manufacturing Business Models Are Driving In-Circuit Test Evolution and Innovations The Evolution of ICT: PCB Technologies, Test Philosophies, and Manufacturing Business Models Are Driving In-Circuit Test Evolution and Innovations Alan J. Albee Teradyne Inc. North Reading, Massachusetts

More information

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy Hardware Implementation of Improved Adaptive NoC Rer with Flit Flow History based Load Balancing Selection Strategy Parag Parandkar 1, Sumant Katiyal 2, Geetesh Kwatra 3 1,3 Research Scholar, School of

More information

7a. System-on-chip design and prototyping platforms

7a. System-on-chip design and prototyping platforms 7a. System-on-chip design and prototyping platforms Labros Bisdounis, Ph.D. Department of Computer and Communication Engineering 1 What is System-on-Chip (SoC)? System-on-chip is an integrated circuit

More information

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001 Agenda Introduzione Il mercato Dal circuito integrato al System on a Chip (SoC) La progettazione di un SoC La tecnologia Una fabbrica di circuiti integrati 28 How to handle complexity G The engineering

More information
