FF-Bond: Multi-bit Flip-flop Bonding at Placement CHANG-CHENG TSAI YIYU SHI GUOJIE LUO IRIS HUI-RU JIANG. IRIS Lab NCTU MST PKU
|
|
- Hester Tyler
- 7 years ago
- Views:
Transcription
1 FF-Bond: Multi-bit Flip-flop Bonding at Placement CHANG-CHENG TSAI YIYU SHI GUOJIE LUO IRIS HUI-RU JIANG IRIS Lab NCTU MST PKU ISPD-13
2 Outline 2 Introduction Preliminaries Problem formulation Algorithm - FF-Bond Experimental results Conclusion
3 Multi-Bit Flip-Flops (MBFFs) 3 Clock power is critical for modern IC designs Dynamic power CV 2 f MBFFs present a smaller load on the clock network Replace FFs with MBFFs Effectively reduces both clock network power and MBFF power Prefer large FFs (high bit number), avoid orphans (single-bit FF) Avoid impacting timing critical paths D Master Slave 1 Q 1 latch latch clk D 2 Master latch Slave latch Q 2 Bit number Normalized power per bit Power efficient Normalized area per bit
4 Prior Work 4 Relocating flip-flops benefits clock network synthesis [Cheon+, DAC05], [Papa+, Micro11], [Lee/Markov, TCAD12] Replacing flip-flops with MBFFs saves clock power [Yan/Chen, ICGCS10], [Chang+, ICCAD10], [Wang+, ISPD11] [Jiang+, ISPD11], [Liu+, DATE12] Focus on post-placement MBFF clustering Pre-placement Lack physical information Post-placement Cells are immovable Limited clustering flexibility and quality MBFF bonding at-placement
5 One Possible Solution 5 Directly integrate placement & post-placement MBFF clustering Netlist Timing-driven placement MBFF clustering End The movement of flip-flops Is constrained by the placement at the current iteration May oscillate among iterations
6 Ionic Bonding and Flip-flop Bonding 6 Goal: Guide flip-flops towards merging friendly locations e Na F Ionic bonding Na F Flip-flop Na + F NaF Flip-flop bonding Mergeable flip-flop sets Example: MBFF library: 1-bit, 2-bit, 4-bit
7 Post-Placement vs. At-Placement - s Post-placement clustering At-placement bonding # MBFFs 4-/2-/1-bit 35/252/237 SBFF MBFF # MBFFs 4-/2-/1-bit 159/105/35
8 Outline 8 Introduction Preliminaries Problem formulation Algorithm - FF-Bond Experimental results Conclusion
9 Post-Placement MBFF Clustering 9 Given A placed design MBFF library Timing slacks of flip-flops Replace FFs with MBFFs Minimize flip-flop power Satisfy timing constraints SBFF MBFF MBFF Clustering
10 Intersection Graph 10 Define the feasible region of a flip-flop according to its slack Model the overlap of feasible regions by an intersection graph A proper-sized clique corresponds to an MBFF Feasible region f o (i) Fanin slack f i (i) i Fanout slack y Intersection graph x
11 INTEGRA (INTErval GRAph) 11 Perform coordinate transformation Sort starting (s) and ending (e) points of projection in the x and y axes y x FF1 FF2 FF3 FF4 FF5 FF6 FF7 FF8 TYPE s s s s e e s s s e s e e e e e FF# Jiang et al. INTEGRA: Fast multibit flip-flop clustering for clock power saving, TCAD12, ISPD11.
12 INTEGRA 12 Find decision points se in x axis Retrieve maximal cliques at decision points y Check x and y axes {1, 2, 4} or {1, 3, 4} Form MBFFs of proper sizes e.g., {1, 2} FF1 FF2 FF3 FF4 FF5 FF6 FF7 FF8 x FF1 FF3 FF4 FF2 FF6 FF5 FF7 FF8 e 1 e 3 e 4 s 3 e 2 s 4 s 1 s 2 T Y P E F F # TYPE s s s s e e s s s e s e e e e e FF# Decision points
13 Example: INTEGRA 13 Example: MBFF library: 1-bit, 2-bit, 4-bit FF1 FF3 FF4 FF6 FF5 FF8 2 7 y FF2 FF7 x Guide 2 dual-bit four-bit FFs flip-flops towards merging 1 four-bit friendly flip-flop locations
14 Outline 14 Introduction Preliminaries Problem formulation Algorithm - FF-Bond Experimental results Conclusion
15 The MBFF Bonding at Placement Problem 15 Given Gate-level netlist MBFF library Timing constraints Find a placement and replace FFs with MBFFs Minimize flip-flop power Satisfy timing constraints
16 Outline 16 Introduction Preliminaries Problem formulation Algorithm - FF-Bond Experimental results Conclusion
17 The Overview of FF-Bond 17 Guide flip-flops towards merging friendly locations at the global placement stage without sacrificing timing Netlist Global placement FF-Bond FF-Bond Objective function construction with timing-driven net weighting Signoff timer Legalization Gradient-based optimization solver Detailed placement Clock tree synthesis N Sparse enough? < d 2 Y Flip-flop bonding N Evenly distributed? < d 1 Y Routing Pseudo-net generation MBFF clustering End : overlap index = overlapped_area total_cell_area
18 Example: s Netlist Global placement Objective function construction with timing-driven net weighting FF-Bond Signoff timer Gradient-based optimization solver N Sparse enough? < d 2 Y N Evenly distributed? < d 1 Y Flip-flop bonding Pseudo-net generation MBFF clustering : overlap index = overlapped_area total_cell_area
19 Example: s Spread cells until sparse enough Netlist Global placement Objective function construction with timing-driven net weighting FF-Bond Signoff timer Gradient-based optimization solver N Sparse enough? < d 2 Y N Evenly distributed? < d 1 Y Flip-flop bonding Pseudo-net generation MBFF clustering : overlap index = overlapped_area total_cell_area
20 Example: s Apply flip-flop bonding Netlist Global placement Objective function construction with timing-driven net weighting FF-Bond Signoff timer Gradient-based optimization solver N Sparse enough? < d 2 Y N Evenly distributed? < d 1 Y Flip-flop bonding Pseudo-net generation MBFF clustering : overlap index = overlapped_area total_cell_area
21 Timing-Driven Placement 21 Pure wirelength-driven placement + slack-based net-weighting Pure wirelength-driven analytical placement: mpl min W x, y = e E max x i x j + max y i y j v i,v j e,i<j v i,v j e,i<j s. t. D ij = K, 1 i m, 1 j n Smooth the objective function and the constraints Log-sum-exp approximation Inverse Laplace transformation Slack-based net weighting W x, y = η e E log v k e exp x k η + log v k e exp x k η min W x, y s. t. ψ ij = K, 1 i m, 1 j n net weight = 1 slack α, α > 1 T clk + log slack = 0 for the 1 st iteration (pure wirelength-driven placement) v k e exp y k η + log v k e exp y k η T. Chen et al. Multilevel generalized force-directed method for circuit placement, ISPD05.
22 Flip-Flop Bonding 22 Bond flip-flops into perfect-sized cliques Perfect: most power efficient Example: Power efficient Bit number Normalized power per bit Normalized area per bit Oversized Perfect Undersized
23 Flip-Flop Bonding 23 Bond flip-flops into perfect-sized cliques Priority of processing maximal cliques: Perfect > undersize > oversize Perfect size: preserved Undersize/oversize: try to form a target-sized clique by selecting the nearest flip-flops in a specified search region The target size: the flip-flop configuration that is larger than, nearest to, and more power efficient than the investigated clique size Adjacency inside the search region Physical & timing: x c x i + y c y i ε i s fi i + s fo i
24 Example: Flip-Flop Bonding (1/3) 24 Extract maximal cliques Flip-flop Bit number Normalized power per bit Power efficient Normalized area per bit
25 Example: Flip-Flop Bonding (2/3) 25 Bonding strategy Choose an undersized clique with priority 3>2>1 Select nearest flip-flops to form target-sized cliques Choose an oversized clique Select nearest flip-flops to form target-sized cliques Even 4X Odd even
26 Example: Flip-Flop Bonding (3/3) 26 Anchor Direct flip-flops to each other Set a flip-flop bonding anchor
27 Overlapping Maximal Cliques Bonding strategy Priority: undersized > oversized Undersized 3 > 2 > Oversized Even 4X Odd even Independency (#shared flip-flops)/cliquesize
28 Pseudo Net Generation 28 Set an anchor Pseudo-net f o (i) i Overlapped feasible region between i and j i Pseudo net f o (j) f i (i) j Anchor j Anchor High pseudo-net weight f i (j)
29 Outline 29 Introduction Preliminaries Problem formulation Algorithm - FF-Bond Experimental results Conclusion
30 Experimental Setting 30 Programming language: C++ Platform: Intel Xeon 2.4GHz CPU, 16GB memory, Linux OS Technology: UMC 55nm Faraday MBFF library Bit number Signoff timer: PrimeTime Normalized power per bit Normalized area per bit
31 Comparison with Two Representative Flows 31 PMC IMC Post-placement MBFF clustering INTEGRA Interleaving placement & post-placement MBFF clustering Timing-driven mpl + INTEGRA Netlist Netlist Timing-driven placement Timing-driven placement MBFF clustering MBFF clustering End End
32 32 Flip-flop Power Comparison Power ratio FF-Bond < IMC < PMC Lower bound of power ratio: 0.78 (All are 4-bit MBFFs) Main constituents of formed FFs FF-Bond: 4-bit FFs, PMC: 1-bit FFs, IMC: 2-bit FFs Wirelength PMC < IMC < FF-Bond Timing is satisfied, the induced power increment < 1% PMC IMC FF-Bond Circuit #Flipflops 4-/2-/1-bit ratio 4-/2-/1-bit ratio 4-/2-/1-bit ratio #MBFFs FF power #MBFFs FF power #MBFFs FF power HPWL HPWL Name HPWL s /57/ E+06 23/51/ E+06 35/31/ E+06 s /29/ E+06 18/23/ E+06 23/15/ E+06 s /252/ E /179/ E /105/ E+07 s /291/ E+07 96/282/ E /116/ E+07 b /264/ E /201/ E /124/ E+08 b /886/ E /742/ E /425/ E+08 Avg. ratio /0.91/ /2.05/ /3.33/ d 1 = 0.1, d 2 = 0.5, = 1.2, the search region is bounded by 20% of chip dimension the net weight of a pseudo net is 5X the default value for a positive slack net.
33 33 Experimental Results s38417 s38417 PMC IMC FF-Bond Global placement result (before MBFF merging) MBFF merging result (before legalization) #MBFFs (4-/2-/1-bit) 35/252/ /179/ /105/35
34 Experimental Results s PMC: Post-placement MBFF clustering Global placement SBFF
35 Experimental Results s PMC: Post-placement MBFF clustering MBFF merging SBFF MBFF 4-/2-/1-bit MBFFs 35/252/237
36 Experimental Results s IMC: Interleaving placement and MBFF clustering Global placement SBFF
37 Experimental Results s IMC: Interleaving placement and MBFF clustering MBFF merging SBFF MBFF 4-/2-/1-bit MBFFs 105/179/103
38 Experimental Results s FF-Bond Global placement SBFF
39 Experimental Results s FF-Bond MBFF merging SBFF MBFF 4-/2-/1-bit MBFFs 159/105/35
40 Clock Power Comparison 40 Clock power (flip-flops + clock network) FF-Bond < IMC < PMC < Without MBFF FF-Bond totally saves 27% clock power on average Compared with PMC, FF-Bond further reduces 14% clock power Circuit Name Without MBFF clustering PMC IMC FF-Bond Total Total Total Total Cap. Sinks Buffer Wire Cap. Sinks Buffer Wire Cap. Sinks Buffer Wire Cap. (pf) (pf) (pf) (pf) Sinks Buffer Wire s % 36.4% 15.1% % 39.7% 13.5% % 38.8% 11.7% % 40.2% 10.0% s % 47.1% 9.6% % 50.6% 8.5% % 52.6% 7.8% % 53.1% 7.4% s % 31.6% 15.2% % 26.8% 15.3% % 27.4% 14.5% % 28.6% 12.9% s % 28.9% 17.6% % 29.9% 15.8% % 24.1% 15.3% % 32.8% 13.3% b % 28.1% 20.0% % 28.7% 19.5% % 30.5% 17.6% % 26.3% 15.5% b % 26.8% 21.0% % 28.1% 19.5% % 23.9% 18.7% % 24.5% 16.8% Avg. Ratio
41 Slack Comparison: PMC vs. FF-Bond 41 The slacks of the two flows are quite similar PMC FF-Bond Circuit Name Clock period Worst slack Average slack Worst slack Average slack (ns) (ns) (ns) (ns) (ns) s s s s b b Worst slack represents the worst timing slack over all endpoints Average slack means the average
42 FF-Bond: d 2 Analysis 42 d 2 : the sparsity criterion to apply flip-flop bonding Smaller d 2 : flip-flop bonding starts at later iterations but does not guide FFs well because of the low flexibility of moving FFs Power, WL Larger d 2 : flip-flop bonding starts at earlier iterations but incurs longer wirelength because distant flip-flops are attracted Power (slightly), WL d 2 =0.3 d 2 =0.5 d 2 =0.7 Circuit #Flipflops 4-/2-/1-bit ratio 4-/2-/1-bit ratio 4-/2-/1-bit ratio #MBFFs FF power #MBFFs FF power #MBFFs FF power HPWL HPWL Name HPWL s /41/ E+06 35/31/ E+06 38/25/ E+06 s /20/ E+06 23/15/ E+06 22/16/ E+06 s /85/ E /105/ E /89/ E+07 s /135/ E /116/ E /134/ E+07 b /135/ E /124/ E /116/ E+08 b /427/ E /425/ E /431/ E+08 Avg. ratio /3.44/ /3.33/ /3.04/
43 Wirelength FF power ratio 43 Search Region Analysis E E E E E E E E E+07 Search region, power ratio, wirelength Search Region vs. FF Power Ratio Unit: chip_width+chip_height Search Region vs. Wirelength Unit: chip_width+chip_height s38417 Search region FF power (Unit: chip_width+chip_height) ratio HPWL Worst slack (ns) E E E E E E E E
44 Outline 44 Introduction Preliminaries Problem formulation Algorithm - FF-Bond Experimental results Conclusion
45 Conclusions 45 MBFFs reduce clock load effectively Prior work: post-placement MBFF clustering Proposed MBFF bonding at-placement FF-Bond Directed flip-flops towards merging friendly locations FF-Bond can save 27% clock power on average Compared with post-placement MBFF clustering, FF-Bond can further reduce 14% clock power Future work: MBFF bonding with routability consideration
46 46 Thank You! Contact info: Iris Hui-Ru Jiang
A Utility for Leakage Power Recovery within PrimeTime 1 SI
within PrimeTime 1 SI Bruce Zahn LSI Corporation Bruce.Zahn@lsi.com ABSTRACT This paper describes a utility which is run within the PrimeTime SI signoff environment that recovers leakage power and achieves
More informationUniversity of Texas at Dallas. Department of Electrical Engineering. EEDG 6306 - Application Specific Integrated Circuit Design
University of Texas at Dallas Department of Electrical Engineering EEDG 6306 - Application Specific Integrated Circuit Design Synopsys Tools Tutorial By Zhaori Bi Minghua Li Fall 2014 Table of Contents
More informationPerimeter-degree: A Priori Metric for Directly Measuring and Homogenizing Interconnection Complexity in Multilevel Placement
Perimeter-degree: A Priori Metric for Directly Measuring and Homogenizing Interconnection Complexity in Multilevel Placement Navaratnasothie Selvakkumaran (Selva) selva@cs.umn.edu 1 Authors Navaratnasothie.
More informationTIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING
TIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING BARIS TASKIN, JOHN WOOD, IVAN S. KOURTEV February 28, 2005 Research Objective Objective: Electronic design automation
More informationHow To Design A Chip Layout
Spezielle Anwendungen des VLSI Entwurfs Applied VLSI design (IEF170) Course and contest Intermediate meeting 3 Prof. Dirk Timmermann, Claas Cornelius, Hagen Sämrow, Andreas Tockhorn, Philipp Gorski, Martin
More informationTopics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology
Topics of Chapter 5 Sequential Machines Memory elements Memory elements. Basics of sequential machines. Clocking issues. Two-phase clocking. Testing of combinational (Chapter 4) and sequential (Chapter
More informationOptimal Technology Mapping and Cell Merger for Asynchronous Threshold Networks
Optimal Technology Mapping and Cell Merger for Asynchronous Threshold Networks Cheoljoo Jeong Steven M. Nowick Department of Computer Science Columbia University Outline Introduction Background Technology
More informationLecture 3. Linear Programming. 3B1B Optimization Michaelmas 2015 A. Zisserman. Extreme solutions. Simplex method. Interior point method
Lecture 3 3B1B Optimization Michaelmas 2015 A. Zisserman Linear Programming Extreme solutions Simplex method Interior point method Integer programming and relaxation The Optimization Tree Linear Programming
More informationImplementation of Multi-Bit flip-flop for Power Reduction in CMOS Technologies
Implementation of Multi-Bit flip-flop for Power Reduction in CMOS Technologies K. Siva Prasad, G. Rajesh, V. Thrimurthulu Post Graduate student, Department of ECE, CREC, TIRUPATI, A.P., India Associate
More informationRethinking SIMD Vectorization for In-Memory Databases
SIGMOD 215, Melbourne, Victoria, Australia Rethinking SIMD Vectorization for In-Memory Databases Orestis Polychroniou Columbia University Arun Raghavan Oracle Labs Kenneth A. Ross Columbia University Latest
More informationAchieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging
Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging In some markets and scenarios where competitive advantage is all about speed, speed is measured in micro- and even nano-seconds.
More informationReconfigurable ECO Cells for Timing Closure and IR Drop Minimization. TingTing Hwang Tsing Hua University, Hsin-Chu
Reconfigurable ECO Cells for Timing Closure and IR Drop Minimization TingTing Hwang Tsing Hua University, Hsin-Chu 1 Outline Introduction Engineering Change Order (ECO) Voltage drop (IR-DROP) New design
More information2003 IC/CAD Contest Problem 5: Clock Tree Optimization for Useful Skew
2003 IC/CAD Contest Problem 5: Clock Tree Optimization for Useful Skew Source: Global UniChip Corp. February 11, 2003, revised on March 21, 2003 & April 22, 2003 1. Introduction There are two approaches
More informationGlitch Free Frequency Shifting Simplifies Timing Design in Consumer Applications
Glitch Free Frequency Shifting Simplifies Timing Design in Consumer Applications System designers face significant design challenges in developing solutions to meet increasingly stringent performance and
More informationTiming Methodologies (cont d) Registers. Typical timing specifications. Synchronous System Model. Short Paths. System Clock Frequency
Registers Timing Methodologies (cont d) Sample data using clock Hold data between clock cycles Computation (and delay) occurs between registers efinition of terms setup time: minimum time before the clocking
More informationIL2225 Physical Design
IL2225 Physical Design Nasim Farahini farahini@kth.se Outline Physical Implementation Styles ASIC physical design Flow Floor and Power planning Placement Clock Tree Synthesis Routing Timing Analysis Verification
More informationGate Delay Model. Estimating Delays. Effort Delay. Gate Delay. Computing Logical Effort. Logical Effort
Estimating Delays Would be nice to have a back of the envelope method for sizing gates for speed Logical Effort Book by Sutherland, Sproull, Harris Chapter 1 is on our web page Also Chapter 4 in our textbook
More informationScheduling Shop Scheduling. Tim Nieberg
Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations
More informationExternal Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13
External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing
More informationPower-Aware High-Performance Scientific Computing
Power-Aware High-Performance Scientific Computing Padma Raghavan Scalable Computing Laboratory Department of Computer Science Engineering The Pennsylvania State University http://www.cse.psu.edu/~raghavan
More informationCS311 Lecture: Sequential Circuits
CS311 Lecture: Sequential Circuits Last revised 8/15/2007 Objectives: 1. To introduce asynchronous and synchronous flip-flops (latches and pulsetriggered, plus asynchronous preset/clear) 2. To introduce
More informationDefinition 11.1. Given a graph G on n vertices, we define the following quantities:
Lecture 11 The Lovász ϑ Function 11.1 Perfect graphs We begin with some background on perfect graphs. graphs. First, we define some quantities on Definition 11.1. Given a graph G on n vertices, we define
More informationDesigning a Schematic and Layout in PCB Artist
Designing a Schematic and Layout in PCB Artist Application Note Max Cooper March 28 th, 2014 ECE 480 Abstract PCB Artist is a free software package that allows users to design and layout a printed circuit
More informationCollective behaviour in clustered social networks
Collective behaviour in clustered social networks Maciej Wołoszyn 1, Dietrich Stauffer 2, Krzysztof Kułakowski 1 1 Faculty of Physics and Applied Computer Science AGH University of Science and Technology
More informationSolving polynomial least squares problems via semidefinite programming relaxations
Solving polynomial least squares problems via semidefinite programming relaxations Sunyoung Kim and Masakazu Kojima August 2007, revised in November, 2007 Abstract. A polynomial optimization problem whose
More informationRoute Power 10 Connect Powerpin 10.1 Route Special Route 10.2 Net(s): VSS VDD
SOCE Lab (2/2): Clock Tree Synthesis and Routing Lab materials are available at ~cvsd/cur/soce/powerplan.tar.gz Please untar the file in the folder SOCE_Lab before lab 1 Open SOC Encounter 1.1 % source
More informationHigh-fidelity electromagnetic modeling of large multi-scale naval structures
High-fidelity electromagnetic modeling of large multi-scale naval structures F. Vipiana, M. A. Francavilla, S. Arianos, and G. Vecchi (LACE), and Politecnico di Torino 1 Outline ISMB and Antenna/EMC Lab
More informationECE 3401 Lecture 7. Concurrent Statements & Sequential Statements (Process)
ECE 3401 Lecture 7 Concurrent Statements & Sequential Statements (Process) Concurrent Statements VHDL provides four different types of concurrent statements namely: Signal Assignment Statement Simple Assignment
More informationSecurity-Aware Beacon Based Network Monitoring
Security-Aware Beacon Based Network Monitoring Masahiro Sasaki, Liang Zhao, Hiroshi Nagamochi Graduate School of Informatics, Kyoto University, Kyoto, Japan Email: {sasaki, liang, nag}@amp.i.kyoto-u.ac.jp
More informationBinary Image Scanning Algorithm for Cane Segmentation
Binary Image Scanning Algorithm for Cane Segmentation Ricardo D. C. Marin Department of Computer Science University Of Canterbury Canterbury, Christchurch ricardo.castanedamarin@pg.canterbury.ac.nz Tom
More informationOn the effect of forwarding table size on SDN network utilization
IBM Haifa Research Lab On the effect of forwarding table size on SDN network utilization Rami Cohen IBM Haifa Research Lab Liane Lewin Eytan Yahoo Research, Haifa Seffi Naor CS Technion, Israel Danny Raz
More informationEfficient and Robust Allocation Algorithms in Clouds under Memory Constraints
Efficient and Robust Allocation Algorithms in Clouds under Memory Constraints Olivier Beaumont,, Paul Renaud-Goud Inria & University of Bordeaux Bordeaux, France 9th Scheduling for Large Scale Systems
More informationCHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE
CHAPTER 5 71 FINITE STATE MACHINE FOR LOOKUP ENGINE 5.1 INTRODUCTION Finite State Machines (FSMs) are important components of digital systems. Therefore, techniques for area efficiency and fast implementation
More informationIntroduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems
Harris Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems David Harris Harvey Mudd College David_Harris@hmc.edu Based on EE271 developed by Mark Horowitz, Stanford University MAH
More informationExternal Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1
External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing
More informationFast Iterative Graph Computation with Resource Aware Graph Parallel Abstraction
Human connectome. Gerhard et al., Frontiers in Neuroinformatics 5(3), 2011 2 NA = 6.022 1023 mol 1 Paul Burkhardt, Chris Waring An NSA Big Graph experiment Fast Iterative Graph Computation with Resource
More informationExample-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic
Example-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic Clifford Wolf, Johann Glaser, Florian Schupfer, Jan Haase, Christoph Grimm Computer Technology /99 Overview Ultra-Low-Power
More informationSimPL: An Effective Placement Algorithm
SimPL: An Effective Placement Algorithm Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov University of Michigan, Department of EECS, 226 Hayward St., Ann Arbor, MI 4819-2121 {mckima, ejdjsy, imarkov}@eecs.umich.edu
More informationModeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw. Sequential Circuit
Modeling Sequential Elements with Verilog Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw 4-1 Sequential Circuit Outputs are functions of inputs and present states of storage elements
More informationSequential 4-bit Adder Design Report
UNIVERSITY OF WATERLOO Faculty of Engineering E&CE 438: Digital Integrated Circuits Sequential 4-bit Adder Design Report Prepared by: Ian Hung (ixxxxxx), 99XXXXXX Annette Lo (axxxxxx), 99XXXXXX Pamela
More informationLecture 5: Gate Logic Logic Optimization
Lecture 5: Gate Logic Logic Optimization MAH, AEN EE271 Lecture 5 1 Overview Reading McCluskey, Logic Design Principles- or any text in boolean algebra Introduction We could design at the level of irsim
More informationChapter 13: Query Processing. Basic Steps in Query Processing
Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing
More informationAlpha CPU and Clock Design Evolution
Alpha CPU and Clock Design Evolution This lecture uses two papers that discuss the evolution of the Alpha CPU and clocking strategy over three CPU generations Gronowski, Paul E., et.al., High Performance
More informationCLASS: Combined Logic Architecture Soft Error Sensitivity Analysis
CLASS: Combined Logic Architecture Soft Error Sensitivity Analysis Mojtaba Ebrahimi, Liang Chen, Hossein Asadi, and Mehdi B. Tahoori INSTITUTE OF COMPUTER ENGINEERING (ITEC) CHAIR FOR DEPENDABLE NANO COMPUTING
More informationClosest Pair Problem
Closest Pair Problem Given n points in d-dimensions, find two whose mutual distance is smallest. Fundamental problem in many applications as well as a key step in many algorithms. p q A naive algorithm
More informationMcPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures
McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures Sheng Li, Junh Ho Ahn, Richard Strong, Jay B. Brockman, Dean M Tullsen, Norman Jouppi MICRO 2009
More informationBIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION
BIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION Ş. İlker Birbil Sabancı University Ali Taylan Cemgil 1, Hazal Koptagel 1, Figen Öztoprak 2, Umut Şimşekli
More informationImproving Grid Processing Efficiency through Compute-Data Confluence
Solution Brief GemFire* Symphony* Intel Xeon processor Improving Grid Processing Efficiency through Compute-Data Confluence A benchmark report featuring GemStone Systems, Intel Corporation and Platform
More informationAsynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow
Asynchronous IC Interconnect Network Design and Implementation Using a Standard ASIC Flow Bradley R. Quinton Dept. of Electrical and Computer Engineering University of British Columbia bradq@ece.ubc.ca
More informationNTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter
NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter Description: The NTE2053 is a CMOS 8 bit successive approximation Analog to Digital converter in a 20 Lead DIP type package which uses a differential
More informationFairchild Solutions for 133MHz Buffered Memory Modules
AN-5009 Fairchild Semiconductor Application Note April 1999 Revised December 2000 Fairchild Solutions for 133MHz Buffered Memory Modules Fairchild Semiconductor provides several products that are compatible
More informationR-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants
R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions
More informationData Mining. Cluster Analysis: Advanced Concepts and Algorithms
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based
More informationETEC 2301 Programmable Logic Devices. Chapter 10 Counters. Shawnee State University Department of Industrial and Engineering Technologies
ETEC 2301 Programmable Logic Devices Chapter 10 Counters Shawnee State University Department of Industrial and Engineering Technologies Copyright 2007 by Janna B. Gallaher Asynchronous Counter Operation
More informationMATHEMATICAL ENGINEERING TECHNICAL REPORTS. The Best-fit Heuristic for the Rectangular Strip Packing Problem: An Efficient Implementation
MATHEMATICAL ENGINEERING TECHNICAL REPORTS The Best-fit Heuristic for the Rectangular Strip Packing Problem: An Efficient Implementation Shinji IMAHORI, Mutsunori YAGIURA METR 2007 53 September 2007 DEPARTMENT
More informationLLamasoft K2 Enterprise 8.1 System Requirements
Overview... 3 RAM... 3 Cores and CPU Speed... 3 Local System for Operating Supply Chain Guru... 4 Applying Supply Chain Guru Hardware in K2 Enterprise... 5 Example... 6 Determining the Correct Number of
More informationPERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0
PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0 15 th January 2014 Al Chrosny Director, Software Engineering TreeAge Software, Inc. achrosny@treeage.com Andrew Munzer Director, Training and Customer
More informationCIS 700: algorithms for Big Data
CIS 700: algorithms for Big Data Lecture 6: Graph Sketching Slides at http://grigory.us/big-data-class.html Grigory Yaroslavtsev http://grigory.us Sketching Graphs? We know how to sketch vectors: v Mv
More informationCluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
More informationAsking Hard Graph Questions. Paul Burkhardt. February 3, 2014
Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)
More informationTo design digital counter circuits using JK-Flip-Flop. To implement counter using 74LS193 IC.
8.1 Objectives To design digital counter circuits using JK-Flip-Flop. To implement counter using 74LS193 IC. 8.2 Introduction Circuits for counting events are frequently used in computers and other digital
More informationLecture 7: Clocking of VLSI Systems
Lecture 7: Clocking of VLSI Systems MAH, AEN EE271 Lecture 7 1 Overview Reading Wolf 5.3 Two-Phase Clocking (good description) W&E 5.5.1, 5.5.2, 5.5.3, 5.5.4, 5.5.9, 5.5.10 - Clocking Note: The analysis
More informationPhysical Data Organization
Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor
More informationRectifier circuits & DC power supplies
Rectifier circuits & DC power supplies Goal: Generate the DC voltages needed for most electronics starting with the AC power that comes through the power line? 120 V RMS f = 60 Hz T = 1667 ms) = )sin How
More informationLinear Programming. Widget Factory Example. Linear Programming: Standard Form. Widget Factory Example: Continued.
Linear Programming Widget Factory Example Learning Goals. Introduce Linear Programming Problems. Widget Example, Graphical Solution. Basic Theory:, Vertices, Existence of Solutions. Equivalent formulations.
More informationNonlinear Optimization: Algorithms 3: Interior-point methods
Nonlinear Optimization: Algorithms 3: Interior-point methods INSEAD, Spring 2006 Jean-Philippe Vert Ecole des Mines de Paris Jean-Philippe.Vert@mines.org Nonlinear optimization c 2006 Jean-Philippe Vert,
More informationPower Analysis of Link Level and End-to-end Protection in Networks on Chip
Power Analysis of Link Level and End-to-end Protection in Networks on Chip Axel Jantsch, Robert Lauter, Arseni Vitkowski Royal Institute of Technology, tockholm May 2005 ICA 2005 1 ICA 2005 2 Overview
More informationNimble Algorithms for Cloud Computing. Ravi Kannan, Santosh Vempala and David Woodruff
Nimble Algorithms for Cloud Computing Ravi Kannan, Santosh Vempala and David Woodruff Cloud computing Data is distributed arbitrarily on many servers Parallel algorithms: time Streaming algorithms: sublinear
More informationOperating Systems. Virtual Memory
Operating Systems Virtual Memory Virtual Memory Topics. Memory Hierarchy. Why Virtual Memory. Virtual Memory Issues. Virtual Memory Solutions. Locality of Reference. Virtual Memory with Segmentation. Page
More informationVirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5
Performance Study VirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5 VMware VirtualCenter uses a database to store metadata on the state of a VMware Infrastructure environment.
More informationCSC2420 Spring 2015: Lecture 3
CSC2420 Spring 2015: Lecture 3 Allan Borodin January 22, 2015 1 / 1 Announcements and todays agenda Assignment 1 due next Thursday. I may add one or two additional questions today or tomorrow. Todays agenda
More informationScalable Prefix Matching for Internet Packet Forwarding
Scalable Prefix Matching for Internet Packet Forwarding Marcel Waldvogel Computer Engineering and Networks Laboratory Institut für Technische Informatik und Kommunikationsnetze Background Internet growth
More informationThermal-Aware 3D IC Physical Design and Architecture Exploration
Thermal-Aware 3D IC Physical Design and Architecture Exploration Jason Cong UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu cadlab.cs.ucla.edu/~cong Joint Work with Yongxiang
More informationIMPLEMENTATION OF BACKEND SYNTHESIS AND STATIC TIMING ANALYSIS OF PROCESSOR LOCAL BUS(PLB) PERFORMANCE MONITOR
International Journal of Engineering & Science Research IMPLEMENTATION OF BACKEND SYNTHESIS AND STATIC TIMING ANALYSIS OF PROCESSOR LOCAL BUS(PLB) PERFORMANCE MONITOR ABSTRACT Pathik Gandhi* 1, Milan Dalwadi
More informationHYBRID GENETIC ALGORITHMS FOR SCHEDULING ADVERTISEMENTS ON A WEB PAGE
HYBRID GENETIC ALGORITHMS FOR SCHEDULING ADVERTISEMENTS ON A WEB PAGE Subodha Kumar University of Washington subodha@u.washington.edu Varghese S. Jacob University of Texas at Dallas vjacob@utdallas.edu
More informationPerformance Evaluation of Linux Bridge
Performance Evaluation of Linux Bridge James T. Yu School of Computer Science, Telecommunications, and Information System (CTI) DePaul University ABSTRACT This paper studies a unique network feature, Ethernet
More informationSeeking Opportunities for Hardware Acceleration in Big Data Analytics
Seeking Opportunities for Hardware Acceleration in Big Data Analytics Paul Chow High-Performance Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Toronto Who
More informationAtmel Norway 2005. XMEGA Introduction
Atmel Norway 005 XMEGA Introduction XMEGA XMEGA targets Leadership on Peripheral Performance Leadership in Low Power Consumption Extending AVR market reach XMEGA AVR family 44-100 pin packages 16K 51K
More informationContinuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information
Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Eric Hsueh-Chan Lu Chi-Wei Huang Vincent S. Tseng Institute of Computer Science and Information Engineering
More informationAirWave 7.7. Server Sizing Guide
AirWave 7.7 Server Sizing Guide Copyright 2013 Aruba Networks, Inc. Aruba Networks trademarks include, Aruba Networks, Aruba Wireless Networks, the registered Aruba the Mobile Edge Company logo, Aruba
More informationClocking. Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 Clocks 1
ing Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 s 1 Why s and Storage Elements? Inputs Combinational Logic Outputs Want to reuse combinational logic from cycle to cycle 6.884 - Spring 2005 2/18/05
More informationLinear Programming for Optimization. Mark A. Schulze, Ph.D. Perceptive Scientific Instruments, Inc.
1. Introduction Linear Programming for Optimization Mark A. Schulze, Ph.D. Perceptive Scientific Instruments, Inc. 1.1 Definition Linear programming is the name of a branch of applied mathematics that
More informationGraySort on Apache Spark by Databricks
GraySort on Apache Spark by Databricks Reynold Xin, Parviz Deyhim, Ali Ghodsi, Xiangrui Meng, Matei Zaharia Databricks Inc. Apache Spark Sorting in Spark Overview Sorting Within a Partition Range Partitioner
More informationStrategic Online Advertising: Modeling Internet User Behavior with
2 Strategic Online Advertising: Modeling Internet User Behavior with Patrick Johnston, Nicholas Kristoff, Heather McGinness, Phuong Vu, Nathaniel Wong, Jason Wright with William T. Scherer and Matthew
More informationLinear Programming. March 14, 2014
Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1
More informationContent Distribution Management
Digitizing the Olympics was truly one of the most ambitious media projects in history, and we could not have done it without Signiant. We used Signiant CDM to automate 54 different workflows between 11
More informationDesign Compiler Graphical Create a Better Starting Point for Faster Physical Implementation
Datasheet Create a Better Starting Point for Faster Physical Implementation Overview Continuing the trend of delivering innovative synthesis technology, Design Compiler Graphical delivers superior quality
More informationR u t c o r Research R e p o r t. A Method to Schedule Both Transportation and Production at the Same Time in a Special FMS.
R u t c o r Research R e p o r t A Method to Schedule Both Transportation and Production at the Same Time in a Special FMS Navid Hashemian a Béla Vizvári b RRR 3-2011, February 21, 2011 RUTCOR Rutgers
More informationLecture 5: Logical Effort
Introduction to CMOS VLSI Design Lecture 5: Logical Effort David Harris Harvey Mudd College Spring 2004 Outline Introduction Delay in a Logic Gate Multistage Logic Networks Choosing the Best Number of
More informationCentral Processing Unit (CPU)
Central Processing Unit (CPU) CPU is the heart and brain It interprets and executes machine level instructions Controls data transfer from/to Main Memory (MM) and CPU Detects any errors In the following
More informationExperiment # 9. Clock generator circuits & Counters. Eng. Waleed Y. Mousa
Experiment # 9 Clock generator circuits & Counters Eng. Waleed Y. Mousa 1. Objectives: 1. Understanding the principles and construction of Clock generator. 2. To be familiar with clock pulse generation
More informationRoots of Equations (Chapters 5 and 6)
Roots of Equations (Chapters 5 and 6) Problem: given f() = 0, find. In general, f() can be any function. For some forms of f(), analytical solutions are available. However, for other functions, we have
More informationA 2-Slot Time-Division Multiplexing (TDM) Interconnect Network for Gigascale Integration (GSI)
A 2-Slot Time-Division Multiplexing (TDM) Interconnect Network for Gigascale Integration (GSI) Ajay Joshi and Jeff Davis AIMD Research Group Georgia Institute of Technology Sponsored by: NSF # 0092450
More informationAn Energy-aware Multi-start Local Search Metaheuristic for Scheduling VMs within the OpenNebula Cloud Distribution
An Energy-aware Multi-start Local Search Metaheuristic for Scheduling VMs within the OpenNebula Cloud Distribution Y. Kessaci, N. Melab et E-G. Talbi Dolphin Project Team, Université Lille 1, LIFL-CNRS,
More informationLecture 4: BK inequality 27th August and 6th September, 2007
CSL866: Percolation and Random Graphs IIT Delhi Amitabha Bagchi Scribe: Arindam Pal Lecture 4: BK inequality 27th August and 6th September, 2007 4. Preliminaries The FKG inequality allows us to lower bound
More informationIEOR 4404 Homework #2 Intro OR: Deterministic Models February 14, 2011 Prof. Jay Sethuraman Page 1 of 5. Homework #2
IEOR 4404 Homework # Intro OR: Deterministic Models February 14, 011 Prof. Jay Sethuraman Page 1 of 5 Homework #.1 (a) What is the optimal solution of this problem? Let us consider that x 1, x and x 3
More informationArchitecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7
Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7 Yan Fisher Senior Principal Product Marketing Manager, Red Hat Rohit Bakhshi Product Manager,
More informationLayout of Multiple Cells
Layout of Multiple Cells Beyond the primitive tier primitives add instances of primitives add additional transistors if necessary add substrate/well contacts (plugs) add additional polygons where needed
More informationPROGETTO DI SISTEMI ELETTRONICI DIGITALI. Digital Systems Design. Digital Circuits Advanced Topics
PROGETTO DI SISTEMI ELETTRONICI DIGITALI Digital Systems Design Digital Circuits Advanced Topics 1 Sequential circuit and metastability 2 Sequential circuit - FSM A Sequential circuit contains: Storage
More informationHPSA Agent Characterization
HPSA Agent Characterization Product HP Server Automation (SA) Functional Area Managed Server Agent Release 9.0 Page 1 HPSA Agent Characterization Quick Links High-Level Agent Characterization Summary...
More information