1. Program A runs in 10 seconds on a machine with a 100 MHz clock. How many clock cycles does program A require?
|
|
|
- Elaine Simpson
- 9 years ago
- Views:
Transcription
1 (5 pts) Exercise Program A runs in 10 seconds on a machine with a 100 MHz clock. How many clock cycles does program A require? (5 pts) Exercise ) Our favorite program runs in 10 seconds on computer A, which has a 400 Mhz. clock. We are trying to help a computer designer build a new machine B, that will run this program in 6 seconds. The designer can use new (or perhaps more expensive) technology to substantially increase the clock rate, but has informed us that this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for the same program. What clock rate should we tell the designer to target?" 3.) Why might machine B need more clock cycles to run the program?
2 (10 pts) Exercise 1-53 We wish to compare the performance of two different computers: M1 and M2. The following measurements have been made on these computers: Time on M1 Time on M2 Program seconds 1.5 seconds Program seconds 10.0 seconds Which computer is faster for each program, and how many times as fast is it? (10 pts) Exercise 1-56 Consider the machines from the previous exercise, and assume the following additional measurements were made: Instructions executed on M1 Instructions executed on M2 Program 1 5 x x 10 9 What is the instruction execution rate (instructions per second) for each computer when running program 1?
3 (10 pts) Exercise 1-57 Suppose that M1 from Exercise 4-5 costs $500 and M2 costs $800. If you needed to run program 1 a large number of times, which computer would you buy in large quantities? Why? (5 pts) Exercise 1-61: MIPS Two different compilers are being tested for a 100 MHz. machine that has three different classes of instructions: Class A, Class B, and Class C, which require one, two, and three cycles (respectively). Both compilers are used to produce code for a large piece of software. Compiler #1: code uses 5 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions. Compiler #2: code uses 10 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions. Which sequence will be faster according to execution time? Which sequence will be faster according to MIPS? MIPS = Inst. Count / (ExecutionTime * 10 6 )
4 (5 pts) Exercise 1-62 Program A runs in 0.34 seconds on a 500 Mhz machine. You know that this program requires 100 million instructions of which: 10% are mult. instructions that take an unknown number of cycle 60% are other arithmetic instructions taking 1 cycle 30% are memory instructions taking 2 cycles How many cycles does a multiplication take on this machine? (5 pts) Exercise 1-63 Program A runs in 2 seconds on a certain machine. You know that this program requires 500 million instructions of which: 30% are multiplication instructions that take 10 cycles 40% are other arithmetic instructions taking 1 cycle 30% are memory instructions taking 2 cycles Suppose multiplication could be improved to take just 1 cycle. How much faster would the new machine be compared to the old?
5 (10 pts) Exercise 1-66 Consider two different implementations, P1 and P2, of the same instruction set. There are five classes of instructions (A-E), which have the following average CPI on the two machines CPI on P1 CPI on P2 Class A 1 2 Class B 2 2 Class C 3 2 Class D 4 4 Class E 3 4 P1 has a clock rate of 4 GHz, P2 has a clock rate of 6 GHz. If the number of instructions executed in a certain program is divided equally among the classes of instructions except for class A, which occurs twice as often as each of the others, how much faster is P2 than P1?
6 (10 pts) Exercise 1-67 Suppose you wish to run a program P with 7.5 x 10 9 instructions on a 5 GHz machine with a CPI of 0.8. a. What is the expected CPU time? b. When you run P, it takes 3 seconds of wall clock time to complete. What is the percentage of the CPU time P received? (5 pts) Exercise 1-71 Suppose we enhance a machine making all floating-point instructions run five times faster. If the execution time of some benchmark before the floating-point enhancement is 10 seconds, what will the speedup be if half of the 10 seconds is spent executing floating-point instructions? Formula: Time after Improve. = Exec. Time Unaffected +( Exe. Time Affected / Amount of Improvement)
7 (5 pts) Exercise 1-72 We are looking for a benchmark to show off the new floating-point unit described above, and want the overall benchmark to show a speedup of 3. One benchmark we are considering runs for 100 seconds with the old floating-point hardware. How much of the execution time would floatingpoint instructions have to account for in this program in order to yield our desired speedup on this benchmark? (10 pts) Exercise 1-75 (use Amdahl s Law) You are going to enhance a computer, and there are two possible improvements: either make multiply instructions run four times faster than before, or make memory access instructions run two times faster than before. You repeatedly run a program that takes 100 seconds to execute. Of this time, 20% is used for multiplication, 50% for memory access instructions, and 30% for other tasks. What will the speedup be if you improve only multiplication? What will the speedup be if you improve only memory access? What will the speedup be if both improvements are made?
Quiz for Chapter 1 Computer Abstractions and Technology 3.10
Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [15 points] Consider two different implementations,
on an system with an infinite number of processors. Calculate the speedup of
1. Amdahl s law Three enhancements with the following speedups are proposed for a new architecture: Speedup1 = 30 Speedup2 = 20 Speedup3 = 10 Only one enhancement is usable at a time. a) If enhancements
EEM 486: Computer Architecture. Lecture 4. Performance
EEM 486: Computer Architecture Lecture 4 Performance EEM 486 Performance Purchasing perspective Given a collection of machines, which has the» Best performance?» Least cost?» Best performance / cost? Design
Chapter 2. Why is some hardware better than others for different programs?
Chapter 2 1 Performance Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation Why is some hardware better than
CPU Performance. Lecture 8 CAP 3103 06-11-2014
CPU Performance Lecture 8 CAP 3103 06-11-2014 Defining Performance Which airplane has the best performance? 1.6 Performance Boeing 777 Boeing 777 Boeing 747 BAC/Sud Concorde Douglas DC-8-50 Boeing 747
EC 362 Problem Set #2
EC 362 Problem Set #2 1) Using Single Precision IEEE 754, what is FF28 0000? 2) Suppose the fraction enhanced of a processor is 40% and the speedup of the enhancement was tenfold. What is the overall speedup?
Performance evaluation
Performance evaluation Arquitecturas Avanzadas de Computadores - 2547021 Departamento de Ingeniería Electrónica y de Telecomunicaciones Facultad de Ingeniería 2015-1 Bibliography and evaluation Bibliography
Unit 4: Performance & Benchmarking. Performance Metrics. This Unit. CIS 501: Computer Architecture. Performance: Latency vs.
This Unit CIS 501: Computer Architecture Unit 4: Performance & Benchmarking Metrics Latency and throughput Speedup Averaging CPU Performance Performance Pitfalls Slides'developed'by'Milo'Mar0n'&'Amir'Roth'at'the'University'of'Pennsylvania'
Central Processing Unit (CPU)
Central Processing Unit (CPU) CPU is the heart and brain It interprets and executes machine level instructions Controls data transfer from/to Main Memory (MM) and CPU Detects any errors In the following
Lecture 3: Evaluating Computer Architectures. Software & Hardware: The Virtuous Cycle?
Lecture 3: Evaluating Computer Architectures Announcements - Reminder: Homework 1 due Thursday 2/2 Last Time technology back ground Computer elements Circuits and timing Virtuous cycle of the past and
! Metrics! Latency and throughput. ! Reporting performance! Benchmarking and averaging. ! CPU performance equation & performance trends
This Unit CIS 501 Computer Architecture! Metrics! Latency and throughput! Reporting performance! Benchmarking and averaging Unit 2: Performance! CPU performance equation & performance trends CIS 501 (Martin/Roth):
Unit A451: Computer systems and programming. Section 2: Computing Hardware 1/5: Central Processing Unit
Unit A451: Computer systems and programming Section 2: Computing Hardware 1/5: Central Processing Unit Section Objectives Candidates should be able to: (a) State the purpose of the CPU (b) Understand the
Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program.
Name: Class: Date: Exam #1 - Prep True/False Indicate whether the statement is true or false. 1. Programming is the process of writing a computer program in a language that the computer can respond to
Block diagram of typical laptop/desktop
What's in a computer? logical or functional organization: "architecture" what the pieces are, what they do, how they work how they are connected, how they work together what their properties are physical
Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University [email protected].
Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University [email protected] Review Computers in mid 50 s Hardware was expensive
Computer Organization. and Instruction Execution. August 22
Computer Organization and Instruction Execution August 22 CSC201 Section 002 Fall, 2000 The Main Parts of a Computer CSC201 Section Copyright 2000, Douglas Reeves 2 I/O and Storage Devices (lots of devices,
Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:
Pipelining HW Q. Can a MIPS SW instruction executing in a simple 5-stage pipelined implementation have a data dependency hazard of any type resulting in a nop bubble? If so, show an example; if not, prove
Introduction to RISC Processor. ni logic Pvt. Ltd., Pune
Introduction to RISC Processor ni logic Pvt. Ltd., Pune AGENDA What is RISC & its History What is meant by RISC Architecture of MIPS-R4000 Processor Difference Between RISC and CISC Pros and Cons of RISC
Performance Metrics and Scalability Analysis. Performance Metrics and Scalability Analysis
Performance Metrics and Scalability Analysis 1 Performance Metrics and Scalability Analysis Lecture Outline Following Topics will be discussed Requirements in performance and cost Performance metrics Work
Board Notes on Virtual Memory
Board Notes on Virtual Memory Part A: Why Virtual Memory? - Letʼs user program size exceed the size of the physical address space - Supports protection o Donʼt know which program might share memory at
A Lab Course on Computer Architecture
A Lab Course on Computer Architecture Pedro López José Duato Depto. de Informática de Sistemas y Computadores Facultad de Informática Universidad Politécnica de Valencia Camino de Vera s/n, 46071 - Valencia,
What's in a computer?
What's in a computer? logical or functional organization: "architecture" what the pieces are, what they do, how they work how they are connected, how they work together what their properties are physical
Execution Cycle. Pipelining. IF and ID Stages. Simple MIPS Instruction Formats
Execution Cycle Pipelining CSE 410, Spring 2005 Computer Systems http://www.cs.washington.edu/410 1. Instruction Fetch 2. Instruction Decode 3. Execute 4. Memory 5. Write Back IF and ID Stages 1. Instruction
Operation Count; Numerical Linear Algebra
10 Operation Count; Numerical Linear Algebra 10.1 Introduction Many computations are limited simply by the sheer number of required additions, multiplications, or function evaluations. If floating-point
SEQUENCES ARITHMETIC SEQUENCES. Examples
SEQUENCES ARITHMETIC SEQUENCES An ordered list of numbers such as: 4, 9, 6, 25, 36 is a sequence. Each number in the sequence is a term. Usually variables with subscripts are used to label terms. For example,
LSN 2 Computer Processors
LSN 2 Computer Processors Department of Engineering Technology LSN 2 Computer Processors Microprocessors Design Instruction set Processor organization Processor performance Bandwidth Clock speed LSN 2
Lecture 8: Binary Multiplication & Division
Lecture 8: Binary Multiplication & Division Today s topics: Addition/Subtraction Multiplication Division Reminder: get started early on assignment 3 1 2 s Complement Signed Numbers two = 0 ten 0001 two
İSTANBUL AYDIN UNIVERSITY
İSTANBUL AYDIN UNIVERSITY FACULTY OF ENGİNEERİNG SOFTWARE ENGINEERING THE PROJECT OF THE INSTRUCTION SET COMPUTER ORGANIZATION GÖZDE ARAS B1205.090015 Instructor: Prof. Dr. HASAN HÜSEYİN BALIK DECEMBER
2: Computer Performance
2: Computer Performance http://people.sc.fsu.edu/ jburkardt/presentations/ fdi 2008 lecture2.pdf... John Information Technology Department Virginia Tech... FDI Summer Track V: Parallel Programming 10-12
Week 1 out-of-class notes, discussions and sample problems
Week 1 out-of-class notes, discussions and sample problems Although we will primarily concentrate on RISC processors as found in some desktop/laptop computers, here we take a look at the varying types
Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan
Chapter 2 Basic Structure of Computers Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Functional Units Basic Operational Concepts Bus Structures Software
Computer Architecture-I
Computer Architecture-I 1. Die Yield is given by the formula, Assignment 1 Solution Die Yield = Wafer Yield x (1 + (Defects per unit area x Die Area)/a) -a Let us assume a wafer yield of 100% and a 4 for
(Refer Slide Time: 00:01:16 min)
Digital Computer Organization Prof. P. K. Biswas Department of Electronic & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture No. # 04 CPU Design: Tirning & Control
System Models for Distributed and Cloud Computing
System Models for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Classification of Distributed Computing Systems
EE282 Computer Architecture and Organization Midterm Exam February 13, 2001. (Total Time = 120 minutes, Total Points = 100)
EE282 Computer Architecture and Organization Midterm Exam February 13, 2001 (Total Time = 120 minutes, Total Points = 100) Name: (please print) Wolfe - Solution In recognition of and in the spirit of the
CSEE W4824 Computer Architecture Fall 2012
CSEE W4824 Computer Architecture Fall 2012 Lecture 2 Performance Metrics and Quantitative Principles of Computer Design Luca Carloni Department of Computer Science Columbia University in the City of New
Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.
Objectives The Central Processing Unit: What Goes on Inside the Computer Chapter 4 Identify the components of the central processing unit and how they work together and interact with memory Describe how
Quantitative Computer Architecture
Performace Measuremet ad Aalysis i Computer Quatitative Computer Measuremet Model Iovatio Proposed How to measure, aalyze, ad specify computer system performace or My computer is faster tha your computer!
COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING
COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING 2013/2014 1 st Semester Sample Exam January 2014 Duration: 2h00 - No extra material allowed. This includes notes, scratch paper, calculator, etc.
EE361: Digital Computer Organization Course Syllabus
EE361: Digital Computer Organization Course Syllabus Dr. Mohammad H. Awedh Spring 2014 Course Objectives Simply, a computer is a set of components (Processor, Memory and Storage, Input/Output Devices)
Chapter 5 Instructor's Manual
The Essentials of Computer Organization and Architecture Linda Null and Julia Lobur Jones and Bartlett Publishers, 2003 Chapter 5 Instructor's Manual Chapter Objectives Chapter 5, A Closer Look at Instruction
Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1
Pipeline Hazards Structure hazard Data hazard Pipeline hazard: the major hurdle A hazard is a condition that prevents an instruction in the pipe from executing its next scheduled pipe stage Taxonomy of
The BBP Algorithm for Pi
The BBP Algorithm for Pi David H. Bailey September 17, 2006 1. Introduction The Bailey-Borwein-Plouffe (BBP) algorithm for π is based on the BBP formula for π, which was discovered in 1995 and published
Faster polynomial multiplication via multipoint Kronecker substitution
Faster polynomial multiplication via multipoint Kronecker substitution 5th February 2009 Kronecker substitution KS = an algorithm for multiplying polynomials in Z[x]. Example: f = 41x 3 + 49x 2 + 38x +
CISC, RISC, and DSP Microprocessors
CISC, RISC, and DSP Microprocessors Douglas L. Jones ECE 497 Spring 2000 4/6/00 CISC, RISC, and DSP D.L. Jones 1 Outline Microprocessors circa 1984 RISC vs. CISC Microprocessors circa 1999 Perspective:
Finding Rates and the Geometric Mean
Finding Rates and the Geometric Mean So far, most of the situations we ve covered have assumed a known interest rate. If you save a certain amount of money and it earns a fixed interest rate for a period
Chapter 2 Logic Gates and Introduction to Computer Architecture
Chapter 2 Logic Gates and Introduction to Computer Architecture 2.1 Introduction The basic components of an Integrated Circuit (IC) is logic gates which made of transistors, in digital system there are
CS 61C: Great Ideas in Computer Architecture Finite State Machines. Machine Interpreta4on
CS 61C: Great Ideas in Computer Architecture Finite State Machines Instructors: Krste Asanovic & Vladimir Stojanovic hbp://inst.eecs.berkeley.edu/~cs61c/sp15 1 Levels of RepresentaKon/ InterpretaKon High
PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0
PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0 15 th January 2014 Al Chrosny Director, Software Engineering TreeAge Software, Inc. [email protected] Andrew Munzer Director, Training and Customer
Problems and Measures Regarding Waste 1 Management and 3R Era of public health improvement Situation subsequent to the Meiji Restoration
Parallel Scalable Algorithms- Performance Parameters
www.bsc.es Parallel Scalable Algorithms- Performance Parameters Vassil Alexandrov, ICREA - Barcelona Supercomputing Center, Spain Overview Sources of Overhead in Parallel Programs Performance Metrics for
Pipelining Review and Its Limitations
Pipelining Review and Its Limitations Yuri Baida [email protected] [email protected] October 16, 2010 Moscow Institute of Physics and Technology Agenda Review Instruction set architecture Basic
Don t Slow Me Down with that Calculator Cliff Petrak (Teacher Emeritus) Brother Rice H.S. Chicago [email protected]
Don t Slow Me Down with that Calculator Cliff Petrak (Teacher Emeritus) Brother Rice H.S. Chicago [email protected] In any computation, we have four ideal objectives to meet: 1) Arriving at the correct
High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/2015. 2015 CAE Associates
High Performance Computing (HPC) CAEA elearning Series Jonathan G. Dudley, Ph.D. 06/09/2015 2015 CAE Associates Agenda Introduction HPC Background Why HPC SMP vs. DMP Licensing HPC Terminology Types of
The Methodology of Application Development for Hybrid Architectures
Computer Technology and Application 4 (2013) 543-547 D DAVID PUBLISHING The Methodology of Application Development for Hybrid Architectures Vladimir Orekhov, Alexander Bogdanov and Vladimir Gaiduchok Department
LS-DYNA Scalability on Cray Supercomputers. Tin-Ting Zhu, Cray Inc. Jason Wang, Livermore Software Technology Corp.
LS-DYNA Scalability on Cray Supercomputers Tin-Ting Zhu, Cray Inc. Jason Wang, Livermore Software Technology Corp. WP-LS-DYNA-12213 www.cray.com Table of Contents Abstract... 3 Introduction... 3 Scalability
Quiz for Chapter 6 Storage and Other I/O Topics 3.10
Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [6 points] Give a concise answer to each
Memory management basics (1) Requirements (1) Objectives. Operating Systems Part of E1.9 - Principles of Computers and Software Engineering
Memory management basics (1) Requirements (1) Operating Systems Part of E1.9 - Principles of Computers and Software Engineering Lecture 7: Memory Management I Memory management intends to satisfy the following
Communicating with devices
Introduction to I/O Where does the data for our CPU and memory come from or go to? Computers communicate with the outside world via I/O devices. Input devices supply computers with data to operate on.
LECTURE NOTES ON MACROECONOMIC PRINCIPLES
LECTURE NOTES ON MACROECONOMIC PRINCIPLES Peter Ireland Department of Economics Boston College [email protected] http://www2.bc.edu/peter-ireland/ec132.html Copyright (c) 2013 by Peter Ireland. Redistribution
Sequences. A sequence is a list of numbers, or a pattern, which obeys a rule.
Sequences A sequence is a list of numbers, or a pattern, which obeys a rule. Each number in a sequence is called a term. ie the fourth term of the sequence 2, 4, 6, 8, 10, 12... is 8, because it is the
6 3 4 9 = 6 10 + 3 10 + 4 10 + 9 10
Lesson The Binary Number System. Why Binary? The number system that you are familiar with, that you use every day, is the decimal number system, also commonly referred to as the base- system. When you
Lattice QCD Performance. on Multi core Linux Servers
Lattice QCD Performance on Multi core Linux Servers Yang Suli * Department of Physics, Peking University, Beijing, 100871 Abstract At the moment, lattice quantum chromodynamics (lattice QCD) is the most
Choosing a Computer for Running SLX, P3D, and P5
Choosing a Computer for Running SLX, P3D, and P5 This paper is based on my experience purchasing a new laptop in January, 2010. I ll lead you through my selection criteria and point you to some on-line
4/1/2017. PS. Sequences and Series FROM 9.2 AND 9.3 IN THE BOOK AS WELL AS FROM OTHER SOURCES. TODAY IS NATIONAL MANATEE APPRECIATION DAY
PS. Sequences and Series FROM 9.2 AND 9.3 IN THE BOOK AS WELL AS FROM OTHER SOURCES. TODAY IS NATIONAL MANATEE APPRECIATION DAY 1 Oh the things you should learn How to recognize and write arithmetic sequences
Course on Advanced Computer Architectures
Course on Advanced Computer Architectures Surname (Cognome) Name (Nome) POLIMI ID Number Signature (Firma) SOLUTION Politecnico di Milano, September 3rd, 2015 Prof. C. Silvano EX1A ( 2 points) EX1B ( 2
IT@Intel. Comparing Multi-Core Processors for Server Virtualization
White Paper Intel Information Technology Computer Manufacturing Server Virtualization Comparing Multi-Core Processors for Server Virtualization Intel IT tested servers based on select Intel multi-core
Comparative Performance Review of SHA-3 Candidates
Comparative Performance Review of the SHA-3 Second-Round Candidates Cryptolog International Second SHA-3 Candidate Conference Outline sphlib sphlib sphlib is an open-source implementation of many hash
The Central Processing Unit:
The Central Processing Unit: What Goes on Inside the Computer Chapter 4 Objectives Identify the components of the central processing unit and how they work together and interact with memory Describe how
HY345 Operating Systems
HY345 Operating Systems Recitation 2 - Memory Management Solutions Panagiotis Papadopoulos [email protected] Problem 7 Consider the following C program: int X[N]; int step = M; //M is some predefined constant
How To Understand The Design Of A Microprocessor
Computer Architecture R. Poss 1 What is computer architecture? 2 Your ideas and expectations What is part of computer architecture, what is not? Who are computer architects, what is their job? What is
Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to:
55 Topic 3 Computer Performance Contents 3.1 Introduction...................................... 56 3.2 Measuring performance............................... 56 3.2.1 Clock Speed.................................
Introduction to Operating Systems. Perspective of the Computer. System Software. Indiana University Chen Yu
Introduction to Operating Systems Indiana University Chen Yu Perspective of the Computer System Software A general piece of software with common functionalities that support many applications. Example:
Enhancing Cloud-based Servers by GPU/CPU Virtualization Management
Enhancing Cloud-based Servers by GPU/CPU Virtualiz Management Tin-Yu Wu 1, Wei-Tsong Lee 2, Chien-Yu Duan 2 Department of Computer Science and Inform Engineering, Nal Ilan University, Taiwan, ROC 1 Department
ELECTENG702 Advanced Embedded Systems. Improving AES128 software for Altera Nios II processor using custom instructions
Assignment ELECTENG702 Advanced Embedded Systems Improving AES128 software for Altera Nios II processor using custom instructions October 1. 2005 Professor Zoran Salcic by Kilian Foerster 10-8 Claybrook
Performance metrics for parallel systems
Performance metrics for parallel systems S.S. Kadam C-DAC, Pune [email protected] C-DAC/SECG/2006 1 Purpose To determine best parallel algorithm Evaluate hardware platforms Examine the benefits from parallelism
Figure 1: Graphical example of a mergesort 1.
CSE 30321 Computer Architecture I Fall 2011 Lab 02: Procedure Calls in MIPS Assembly Programming and Performance Total Points: 100 points due to its complexity, this lab will weight more heavily in your
Enabling Technologies for Distributed Computing
Enabling Technologies for Distributed Computing Dr. Sanjay P. Ahuja, Ph.D. Fidelity National Financial Distinguished Professor of CIS School of Computing, UNF Multi-core CPUs and Multithreading Technologies
CS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions
CS101 Lecture 26: Low Level Programming John Magee 30 July 2013 Some material copyright Jones and Bartlett 1 Overview/Questions What did we do last time? How can we control the computer s circuits? How
Conversions between percents, decimals, and fractions
Click on the links below to jump directly to the relevant section Conversions between percents, decimals and fractions Operations with percents Percentage of a number Percent change Conversions between
Icepak High-Performance Computing at Rockwell Automation: Benefits and Benchmarks
Icepak High-Performance Computing at Rockwell Automation: Benefits and Benchmarks Garron K. Morris Senior Project Thermal Engineer [email protected] Standard Drives Division Bruce W. Weiss Principal
ARM Microprocessor and ARM-Based Microcontrollers
ARM Microprocessor and ARM-Based Microcontrollers Nguatem William 24th May 2006 A Microcontroller-Based Embedded System Roadmap 1 Introduction ARM ARM Basics 2 ARM Extensions Thumb Jazelle NEON & DSP Enhancement
Binary Division. Decimal Division. Hardware for Binary Division. Simple 16-bit Divider Circuit
Decimal Division Remember 4th grade long division? 43 // quotient 12 521 // divisor dividend -480 41-36 5 // remainder Shift divisor left (multiply by 10) until MSB lines up with dividend s Repeat until
what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?
Inside the CPU how does the CPU work? what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored? some short, boring programs to illustrate the
$496. 80. Example If you can earn 6% interest, what lump sum must be deposited now so that its value will be $3500 after 9 months?
Simple Interest, Compound Interest, and Effective Yield Simple Interest The formula that gives the amount of simple interest (also known as add-on interest) owed on a Principal P (also known as present
In mathematics, it is often important to get a handle on the error term of an approximation. For instance, people will write
Big O notation (with a capital letter O, not a zero), also called Landau's symbol, is a symbolism used in complexity theory, computer science, and mathematics to describe the asymptotic behavior of functions.
64-Bit versus 32-Bit CPUs in Scientific Computing
64-Bit versus 32-Bit CPUs in Scientific Computing Axel Kohlmeyer Lehrstuhl für Theoretische Chemie Ruhr-Universität Bochum March 2004 1/25 Outline 64-Bit and 32-Bit CPU Examples
Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving
Section 7 Algebraic Manipulations and Solving Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving Before launching into the mathematics, let s take a moment to talk about the words
26 Integers: Multiplication, Division, and Order
26 Integers: Multiplication, Division, and Order Integer multiplication and division are extensions of whole number multiplication and division. In multiplying and dividing integers, the one new issue
Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance
What You Will Learn... Computers Are Your Future Chapter 6 Understand how computers represent data Understand the measurements used to describe data transfer rates and data storage capacity List the components
Financial Mathematics
Financial Mathematics For the next few weeks we will study the mathematics of finance. Apart from basic arithmetic, financial mathematics is probably the most practical math you will learn. practical in
Algebra I Notes Relations and Functions Unit 03a
OBJECTIVES: F.IF.A.1 Understand the concept of a function and use function notation. Understand that a function from one set (called the domain) to another set (called the range) assigns to each element
Python Programming: An Introduction to Computer Science
Python Programming: An Introduction to Computer Science Chapter 1 Computers and Programs 1 Objectives To understand the respective roles of hardware and software in a computing system. To learn what computer
The Mathematics 11 Competency Test Percent Increase or Decrease
The Mathematics 11 Competency Test Percent Increase or Decrease The language of percent is frequently used to indicate the relative degree to which some quantity changes. So, we often speak of percent
Computer Architecture TDTS10
why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers
A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications
1 A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications Simon McIntosh-Smith Director of Architecture 2 Multi-Threaded Array Processing Architecture
