Performance evaluation

Abraham Richards
10 years ago

Performance evaluation Arquitecturas Avanzadas de Computadores - 2547021

1 Performance evaluation Arquitecturas Avanzadas de Computadores Departamento de Ingeniería Electrónica y de Telecomunicaciones Facultad de Ingeniería

2 Bibliography and evaluation
Bibliography:
- Lecture slides
- Chapter 4: Computer Organization and Design: The Hardware/Software Interface, D. A. Patterson and J. L. Hennessy, Morgan Kaufmann Publishers, 3rd Edition
- Chapter 1: Computer Architecture: A Quantitative Approach, J. L. Hennessy and D. A. Patterson, Morgan Kaufmann, 5th Edition, 2011 (previous editions may be good too)
Evaluation:
- Test I (15%) covering units 1-2

3 How good is a computer?
We can think of many parameters:
- Processor's clock rate
- Power consumed by a program
- Execution time for a program
- Number of tasks done per second
- Reliability
- Aesthetic appearance
- Social repercussions, etc.
These are the metrics, the things we want to estimate or measure (though not all of them are easy to measure).
How should we compare two computer systems?

4 Performance: Latency vs. Throughput
Latency: time to finish a fixed task
Throughput: number of tasks per unit of time
They are different: parallelism can be exploited for throughput, not for latency
Usually a trade-off: latency vs. throughput
Choose the definition of performance that matches your goals
- Scientific program: latency; web server: throughput?
Example: transport people 10 km
- Car: capacity = 5, speed = 60 km/h
- Bus: capacity = 60, speed = 20 km/h
- Latency: car = 10 min, bus = 30 min
- Throughput: car = 15 people per hour (pph), counting the return trip; bus = 60 pph
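The car/bus figures can be reproduced with a minimal Python sketch. The function names are illustrative assumptions, and the throughput model assumes each vehicle drives back empty before the next load, as the slide's "count return trip" note suggests.

```python
def latency_minutes(distance_km, speed_kmh):
    """Time for one one-way trip, in minutes."""
    return distance_km * 60 / speed_kmh


def throughput_pph(capacity, distance_km, speed_kmh):
    """People delivered per hour, assuming an empty return trip after each load."""
    return capacity * speed_kmh / (2 * distance_km)


# Values taken from the slide: 10 km trip, car (5 seats, 60 km/h), bus (60 seats, 20 km/h)
car_latency = latency_minutes(10, 60)
bus_latency = latency_minutes(10, 20)
car_throughput = throughput_pph(5, 10, 60)
bus_throughput = throughput_pph(60, 10, 20)

print("latency (min):   car =", car_latency, " bus =", bus_latency)
print("throughput (pph): car =", car_throughput, " bus =", bus_throughput)
```

The car wins on latency and the bus wins on throughput, which is why the metric has to match the goal.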

5 Example: latency vs. throughput
Do the following changes to a computer system increase throughput, decrease response time, or both?
a) Replacing the processor with a faster version
b) Adding more processors to a system that uses multiple processors for separate tasks (e.g., a web server)
Answer:
a) Both
b) Throughput

6 Comparing Performance
System a is x times faster than b if
- $\text{latency}(a) = \text{latency}(b) / x$
- $\text{throughput}(a) = \text{throughput}(b) \times x$
System a is x% faster than b if
- $\text{latency}(a) = \text{latency}(b) / (1 + x/100)$
- $\text{throughput}(a) = \text{throughput}(b) \times (1 + x/100)$
Car/bus example
- Latency? Car is 3 times (and 200%) faster than bus
- Throughput? Bus is 4 times (and 300%) faster than car
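A small sketch of the two ways of stating "faster", applied to the car/bus numbers. The helper names are assumptions made for illustration only.

```python
def times_faster_by_latency(latency_a, latency_b):
    """x such that a is x times faster than b (lower latency is better)."""
    return latency_b / latency_a


def times_faster_by_throughput(throughput_a, throughput_b):
    """x such that a is x times faster than b (higher throughput is better)."""
    return throughput_a / throughput_b


def percent_faster(x_times):
    """Convert 'x times faster' into 'x% faster'."""
    return (x_times - 1) * 100


# Latency: car = 10 min, bus = 30 min (car compared against bus)
car_vs_bus = times_faster_by_latency(10, 30)
# Throughput: bus = 60 pph, car = 15 pph (bus compared against car)
bus_vs_car = times_faster_by_throughput(60, 15)

print(car_vs_bus, percent_faster(car_vs_bus))
print(bus_vs_car, percent_faster(bus_vs_car))
```

Note that "x times faster" and "x% faster" differ by the constant 1: 3 times faster is 200% faster, not 300%.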

7 Performance definitions
Let's define our final goal as minimizing the execution time of some application. Then we can define performance in terms of execution time as follows:
$\text{performance}(a) = 1 / \text{execution\_time}(a)$

8 Execution time
Execution time is affected by multiple factors in a computer system:
execution time = CPU time + disk access + memory access + I/O activities + OS overhead
We will focus on CPU time, since we'll study mostly the processor. However, some applications depend heavily on, e.g., disk access performance.

9 CPU time
We measure CPU time in seconds, but remember that computer hardware works synchronously, driven by a clock signal that has a period and a frequency.
[Figure: data → register → logic → register, with both registers driven by the clock]
How do we relate clock cycles to CPU time?

10 Clock cycles and CPU time
Just use one of the two simple formulas:
CPU time = clock cycles * cycle time
Or, using the clock rate:
CPU time = clock cycles / clock rate
Classic designer's trade-off: attempting to reduce the number of clock cycles may lead to reducing the clock rate too, and vice versa
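As a quick check that the two formulas agree, here is a sketch with hypothetical numbers (10 billion cycles on a 2 GHz clock, neither of which comes from the slides).

```python
import math


def cpu_time_from_cycle_time(clock_cycles, cycle_time_s):
    """CPU time = clock cycles * cycle time (seconds)."""
    return clock_cycles * cycle_time_s


def cpu_time_from_clock_rate(clock_cycles, clock_rate_hz):
    """CPU time = clock cycles / clock rate (seconds)."""
    return clock_cycles / clock_rate_hz


# Hypothetical program and machine, chosen only for illustration
clock_cycles = 10e9
clock_rate_hz = 2e9
cycle_time_s = 1 / clock_rate_hz  # cycle time is the reciprocal of the clock rate

t1 = cpu_time_from_cycle_time(clock_cycles, cycle_time_s)
t2 = cpu_time_from_clock_rate(clock_cycles, clock_rate_hz)
print(t1, t2, math.isclose(t1, t2))
```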

11 Book exercise

12 Answer

13 How about instructions?
Since a program executes instructions, they should also play a part in the CPU performance equations.
So far we had:
CPU time = clock cycles * cycle time
Now we will also say that:
clock cycles = instructions for a program * average clock cycles per instruction
IC: Instruction Count
- Static IC vs. dynamic IC
- What is needed to determine each?
CPI: Cycles Per Instruction
- Can be used to compare two implementations of the same ISA

15 The CPU performance equation
Finally, the classic formula that incorporates the three key factors that affect performance is:
CPU time = Instruction Count * CPI * cycle time
Or:
CPU time = Instruction Count * CPI / clock rate
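The equation translates directly into code. The sketch below uses a hypothetical program (one billion dynamic instructions, CPI of 2, 4 GHz clock); these values are assumptions for illustration and do not come from the slides.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = Instruction Count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical baseline
baseline = cpu_time(instruction_count=1e9, cpi=2.0, clock_rate_hz=4e9)

# Improving any single factor scales the CPU time proportionally
fewer_instructions = cpu_time(0.5e9, 2.0, 4e9)   # half the instruction count
lower_cpi = cpu_time(1e9, 1.0, 4e9)              # half the CPI
faster_clock = cpu_time(1e9, 2.0, 8e9)           # double the clock rate

print(baseline, fewer_instructions, lower_cpi, faster_clock)
```

Each of the three changes halves the CPU time here, but in a real design they are rarely independent: a change that helps one factor often hurts another.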

16 CPU Performance Equation
Factors affecting CPU execution time:

Factor              Inst. count   CPI   Clock rate
Program                  x        (x)       -
Compiler                 x        (x)       -
ISA                      x         x       (x)
Microarchitecture        -         x        x
Technology               -         -        x

CPU time = Instruction Count * CPI / clock rate

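17 Cycles per Instruction (CPI)
Depends on the instruction:
$CPI_i = \text{execution time of instruction } i \times \text{clock rate}$
Computing the total CPI: weight each instruction class by its share of the dynamic instruction count,
$CPI = \sum_i CPI_i \times IC_i / IC$
Example: program dependent!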

18 Another CPI Example
Assume a processor with the following instruction frequencies and costs:
- Integer ALU: 50%, 1 cycle
- Load: 20%, 5 cycles
- Store: 10%, 1 cycle
- Branch: 20%, 2 cycles
Which change would improve performance more?
a) Faster branch prediction to reduce branch cost to 1 cycle?
b) Better data cache to reduce load cost to 3 cycles?
Compute CPI:
- Base = 0.5*1 + 0.2*5 + 0.1*1 + 0.2*2 = 2
- A = 0.5*1 + 0.2*5 + 0.1*1 + 0.2*1 = 1.8
- B = 0.5*1 + 0.2*3 + 0.1*1 + 0.2*2 = 1.6 (winner)
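The same comparison as a short Python sketch. The instruction mix is the one given on the slide; the dictionary layout and function name are illustrative choices.

```python
def average_cpi(mix):
    """Weighted average CPI; mix maps class name -> (frequency, cycles)."""
    return sum(freq * cycles for freq, cycles in mix.values())


base = {
    "alu": (0.5, 1),
    "load": (0.2, 5),
    "store": (0.1, 1),
    "branch": (0.2, 2),
}
option_a = {**base, "branch": (0.2, 1)}  # a) branch cost reduced to 1 cycle
option_b = {**base, "load": (0.2, 3)}    # b) load cost reduced to 3 cycles

cpi_base = average_cpi(base)
cpi_a = average_cpi(option_a)
cpi_b = average_cpi(option_b)

print(f"base = {cpi_base:.1f}, A = {cpi_a:.1f}, B = {cpi_b:.1f}")
```

Option B wins even though loads and branches are equally frequent, because the load improvement saves two cycles per instruction and the branch improvement saves only one.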

19 Book example

20 Answer

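21 IPC, MIPS and GHz
The metrics you are most likely to see in marketing are IPC (instructions per cycle), MIPS (millions of instructions per second) and GHz.
How are they incomplete? Back to the CPU time formula, CPU time = Instruction Count * CPI * cycle time:
- IPC covers only the CPI term (CPI = 1/IPC)
- GHz covers only the cycle time term (cycle time = 1/clock rate)
- MIPS covers CPI * cycle time (time per instruction is proportional to 1/MIPS), but not the instruction count
Which processor would you buy?
- Processor A: CPI = 2, clock = 5 GHz
- Processor B: CPI = 1, clock = 3 GHz
- Probably A, but B is faster (assuming the same ISA and compiler): A needs 2/5 = 0.4 ns per instruction, B needs 1/3 ≈ 0.33 ns
Meta-point: danger of partial performance metrics!
GHz can be boosted artificially by design, at the expense of the other two terms; e.g., an 800 MHz Pentium III was faster than a 1 GHz Pentium 4!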
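The processor A vs. B comparison can be checked with a short sketch. It assumes, as the slide does, that both processors run the same instruction stream (same ISA and compiler), so time per instruction is enough to rank them; the function names are illustrative.

```python
def seconds_per_instruction(cpi, clock_hz):
    """Average time per instruction = CPI * cycle time = CPI / clock rate."""
    return cpi / clock_hz


def mips(cpi, clock_hz):
    """Millions of instructions per second = clock rate / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)


# (CPI, clock rate in Hz) from the slide
processors = {"A": (2.0, 5e9), "B": (1.0, 3e9)}

for name, (cpi, clock_hz) in processors.items():
    ns = seconds_per_instruction(cpi, clock_hz) * 1e9
    print(f"Processor {name}: {ns:.3f} ns/instruction, {mips(cpi, clock_hz):.0f} MIPS")

time_a = seconds_per_instruction(*processors["A"])
time_b = seconds_per_instruction(*processors["B"])
print("B is faster than A:", time_b < time_a)
```

Processor A has the higher clock rate, yet B finishes each instruction sooner. MIPS ranks these two correctly only because the instruction count is the same for both.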

22 Gene Amdahl
- American computer architect
- Born in 1922
- Worked for IBM until 1970
- Founded Amdahl Corporation to compete against IBM in the mainframe market
- Proposed what later became known as Amdahl's law at the 1967 Spring Joint Computer Conference

23 Amdahl's law
Suppose an enhancement speeds up a fraction f of a task by a factor of Sf.
If f is small, Sf doesn't matter.
Concentrate effort on improving frequently occurring events or frequently used
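In its standard form, Amdahl's law gives the overall speedup as 1 / ((1 - f) + f / Sf). The sketch below evaluates it for a hypothetical small fraction (f = 0.10, chosen only for illustration) to show why Sf stops mattering when f is small.

```python
def amdahl_speedup(f, s_f):
    """Overall speedup when a fraction f of the time is sped up by a factor s_f."""
    return 1.0 / ((1.0 - f) + f / s_f)


# Hypothetical case: only 10% of the task benefits from the enhancement
f = 0.10
for s_f in (2, 10, 1_000_000):
    print(f"Sf = {s_f:>9}: overall speedup = {amdahl_speedup(f, s_f):.3f}")

# Upper bound as Sf grows without limit: 1 / (1 - f)
limit = 1.0 / (1.0 - f)
print(f"limit for f = {f}: {limit:.3f}")
```

Even a million-fold improvement of that 10% cannot push the overall speedup past 1 / (1 - f), which is about 1.11 here.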

24 Practicing Amdahl's law
1. What is the percentage of time each instruction takes?
2. How much is the total time reduced if the time for FP instructions is reduced by 20%? How much is the total speedup?
3. How much is the total time reduced if the time for L/S instructions is reduced by 20%? How much is the total speedup?
4. Can the total time be reduced by 20% by reducing only the time for branch instructions?
5. What's the theoretical speedup limit from reducing the branch instruction time?
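The table this exercise refers to is not reproduced here, so the sketch below uses a made-up time breakdown purely to show how questions 2 to 5 can be approached. The shares in `time_share` are hypothetical and are not the exercise's data.

```python
# HYPOTHETICAL share of total execution time per instruction class
# (assumed values; the exercise's real table is not available here)
time_share = {"FP": 0.25, "INT": 0.15, "L/S": 0.45, "Branch": 0.15}


def new_total_time(share, reduction):
    """Total time (old total = 1.0) after cutting one class's time by `reduction`."""
    return 1.0 - share * reduction


def max_speedup(share):
    """Speedup limit when one class's time is reduced all the way to zero."""
    return 1.0 / (1.0 - share)


# Questions 2 and 3: reduce FP or L/S time by 20%
for name in ("FP", "L/S"):
    t = new_total_time(time_share[name], 0.20)
    print(f"{name} time -20%: total time = {t:.2f} of original, speedup = {1.0 / t:.3f}")

# Question 4: a 20% total reduction from branches alone needs a branch share >= 0.20
branch_can_give_20_percent = time_share["Branch"] >= 0.20
print("20% total reduction from branches alone possible:", branch_can_give_20_percent)

# Question 5: limit when branch time goes to zero
print(f"Branch speedup limit: {max_speedup(time_share['Branch']):.3f}")
```

With real data, question 1 supplies the `time_share` values, typically from each class's instruction count multiplied by its CPI and divided by the total cycle count.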

25 Another exercise
