Challenges for Worst-case Execution Time Analysis of Multi-core Architectures


Challenges for Worst-case Execution Time Analysis of Multi-core Architectures
Jan Reineke, Saarland University, Computer Science
Intel, Braunschweig, April 29, 2013

The Context: Hard Real-Time Systems
Safety-critical applications: avionics, automotive, train industries, manufacturing.
- Side airbag in car: reaction in < 10 msec
- Crankshaft-synchronous tasks: reaction in < 45 microsec
Embedded controllers must finish their tasks within given time bounds. Developers would like to know the Worst-Case Execution Time (WCET) to give a guarantee.

The Timing Analysis Problem
[Diagram: embedded software + timing requirements + microarchitecture, joined by a question mark: does the software meet its timing requirements on the given microarchitecture?]

What does the execution time depend on?
- The input, determining which path is taken through the program.
- The state of the hardware platform: due to caches, pipelining, speculation, etc.
- Interference from the environment: external interference, as seen from the analyzed task, on shared busses, caches, and memory.
[Diagram, built up over three slides: a simple CPU with memory; then a complex CPU (out-of-order execution, branch prediction, etc.) with an L1 cache and main memory; then multiple complex CPUs with private L1 caches, a shared L2 cache, and main memory.]

Example of Influence of Microarchitectural State
x = a + b;
LOAD r2, _a
LOAD r1, _b
ADD r3, r2, r1
(PowerPC 755)

Example of Influence of Co-running Tasks in Multicores
Radojkovic et al. (ACM TACO, 2012) on Intel Atom and Intel Core 2 Quad: up to 14x slow-down due to interference on shared L2 cache and memory controller.

Challenges
1. Modeling: How to construct sound timing models?
2. Analysis: How to precisely & efficiently bound the WCET?
3. Design: How to design microarchitectures that enable precise & efficient WCET analysis?

The Modeling Challenge
Microarchitecture → ? → Timing Model
Timing model = formal specification of the microarchitecture's timing.
Incorrect timing model → possibly incorrect WCET bound.

Current Process of Deriving Timing Model
Microarchitecture → ? → Timing Model
→ Time-consuming, and
→ error-prone.

1. Future Process of Deriving Timing Model
Microarchitecture → VHDL Model → Timing Model
Derive the timing model automatically from a formal specification of the microarchitecture.
→ Less manual effort, thus less time-consuming, and
→ provably correct.

2. Future Process of Deriving Timing Model
Microarchitecture → (perform measurements on hardware, infer model) → Timing Model
Derive the timing model automatically from measurements on the hardware, using ideas from automata learning.
→ No manual effort, and
→ (under certain assumptions) provably correct.
→ Also useful to validate assumptions about the microarchitecture.

Proof-of-concept: Automatic Modeling of the Cache Hierarchy
The cache model is an important part of the timing model. It can be characterized by a few parameters:
- ABC: associativity, block size, capacity
- Replacement policy
[Diagram: a set-associative cache with A ways (associativity), blocks of B bytes (block size), and N cache sets.]
chi [Abel and Reineke, RTAS 2013] derives all of these parameters fully automatically.

Example: Intel Core 2 Duo E6750, L1 Data Cache
[Plot: number of L1 misses versus size of the access pattern. The steps in the miss curve reveal a way size of 4 KB and a capacity of 32 KB.]
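The capacity measurement behind this kind of plot can be sketched in a few lines: sweep the working-set size against a simulated cache and look for the jump in misses once the set no longer fits. This is a toy fully-associative LRU simulation, not the chi tool itself; the 32 KB capacity and 64-byte block size are parameters of the simulation, not measured values.

```python
from collections import OrderedDict

def misses_for_size(size_bytes, capacity=32 * 1024, block=64):
    """Sequentially touch an array twice; count misses in a fully
    associative LRU cache (a simplification of a real set-associative L1)."""
    cache = OrderedDict()              # block address, in LRU order
    misses = 0
    for _ in range(2):                 # second pass hits only if the array fits
        for addr in range(0, size_bytes, block):
            if addr in cache:
                cache.move_to_end(addr)
            else:
                misses += 1
                cache[addr] = True
                if len(cache) > capacity // block:
                    cache.popitem(last=False)   # evict the LRU block
    return misses

# Miss counts stay at "cold misses only" up to the capacity, then jump:
# with LRU and a sequential pattern, exceeding capacity makes every access miss.
flat = misses_for_size(32 * 1024)
jump = misses_for_size(33 * 1024)
```

Plotting `misses_for_size` over a range of sizes reproduces the knee at the capacity; detecting the way size additionally requires patterns that map to a single cache set.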

Replacement Policy
Approach inspired by methods to learn finite automata, heavily specialized to the problem domain.
Discovered a (to our knowledge) undocumented policy of the Intel Atom D525:
[Diagram: example state transitions of the inferred policy, e.g. state a b c d e f, on an access to d, becomes d c a b e f; then, on an access to x (a miss), becomes x e d c a b.]
More information: http://embedded.cs.uni-saarland.de/chi.php
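The core idea of telling replacement policies apart by measurement can be illustrated with a toy single-set simulator: craft an access sequence on which candidate policies disagree, and observe which blocks survive. (A sketch only; chi's actual inference is considerably more involved.)

```python
def simulate(policy, accesses, ways=4):
    """Return the hit/miss trace of one 4-way set under LRU or FIFO."""
    lines = []                         # front of the list = next victim
    trace = []
    for block in accesses:
        hit = block in lines
        trace.append(hit)
        if hit:
            if policy == "lru":        # promote on hit; FIFO leaves order alone
                lines.remove(block)
                lines.append(block)
        else:
            if len(lines) == ways:
                lines.pop(0)           # evict the victim at the front
            lines.append(block)
    return trace

# Probe: fill the set, re-touch 'a', force one eviction, then re-probe 'a'.
probe = ["a", "b", "c", "d", "a", "e", "a"]
# Under LRU the re-touch protects 'a' from eviction; under FIFO it does not:
assert simulate("lru", probe)[-1] is True
assert simulate("fifo", probe)[-1] is False
```

On real hardware the hit/miss trace is not directly visible; it is inferred from performance counters or access timings, which is where most of the engineering effort lies.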

Modeling Challenge: Future Work
Extend automation to other parts of the microarchitecture:
- Translation lookaside buffers, branch predictors
- Shared caches in multicores, including their coherency protocols
- Out-of-order pipelines?

The Analysis Challenge
Given a timing model, perform precise & efficient timing analysis:
- Consider all possible program inputs
- Consider all possible initial states of the hardware
WCET_H(P) := max over i ∈ Inputs, max over h ∈ States(H), of ET_H(P, i, h)

The Analysis Challenge
WCET_H(P) := max over i ∈ Inputs, max over h ∈ States(H), of ET_H(P, i, h)
Explicitly evaluating ET for all inputs and all hardware states is not feasible in practice: there are simply too many.
⇒ Need for abstraction, and thus approximation!
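The WCET definition can be evaluated by brute force only on toy models, which is exactly what makes abstraction necessary. A minimal sketch, with a hypothetical three-load program and an invented hit/miss timing model (1 cycle on hit, 10 on miss):

```python
from itertools import product

def program_path(inp):
    """Hypothetical program: the input decides the middle address loaded."""
    return [0, inp, 2]

def et(path, initial_cache):
    """Toy timing model (an assumption, not a real ISA): a load costs
    1 cycle on a cache hit and 10 cycles on a miss."""
    cached, cycles = set(initial_cache), 0
    for addr in path:
        cycles += 1 if addr in cached else 10
        cached.add(addr)
    return cycles

inputs = [1, 2, 3]
states = [frozenset(), frozenset({0}), frozenset({0, 2}), frozenset({0, 1, 2, 3})]

# WCET_H(P) := max over i in Inputs and h in States(H) of ET_H(P, i, h)
wcet = max(et(program_path(i), h) for i, h in product(inputs, states))
```

The enumeration is exponential in both the input space and the hardware-state space; abstract interpretation replaces the state enumeration with a single run over abstract states.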

The Analysis Challenge: State of the Art
- Caches, branch target buffers: precise & efficient abstractions for LRU [Ferdinand, 1999]; not-so-precise but efficient abstractions for FIFO, PLRU, MRU [Grund and Reineke, 2008-2011].
- Complex pipelines: precise but very inefficient; little abstraction. Major challenge: timing anomalies.
- Shared resources, e.g. busses, shared caches, DRAM: no realistic approaches yet. Major challenge: interference between hardware threads → execution time depends on co-running tasks.
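The LRU must-analysis cited here can be sketched compactly: the abstract state maps each block to an upper bound on its LRU age, an access ages only blocks younger than the accessed block's old bound, and the join at control-flow merges keeps blocks known in both predecessors with the maximal age. A simplified single-set version (the real analysis works per cache set and interacts with the pipeline analysis):

```python
def must_update(state, block, ways=4):
    """Abstract LRU update (must analysis): 'state' maps block -> upper
    bound on its age; a tracked block is guaranteed to be cached."""
    old = state.get(block, ways)       # 'ways' means: possibly not cached
    aged = {b: (a + 1 if a < old else a) for b, a in state.items()}
    new = {b: a for b, a in aged.items() if a < ways}
    new[block] = 0                     # the accessed block is now youngest
    return new

def must_join(s1, s2):
    """Join at control-flow merges: keep blocks known in BOTH predecessor
    states, with the maximal (pessimistic) age bound."""
    return {b: max(s1[b], s2[b]) for b in s1.keys() & s2.keys()}

s = {}
for b in ["a", "b", "a"]:
    s = must_update(s, b)
# After accessing a, b, a: 'a' definitely cached at age 0, 'b' at age 1.
assert s == {"a": 0, "b": 1}
```

Any block present in the must-state at a program point can soundly be classified as a guaranteed hit, which is what makes the abstraction useful for WCET bounds.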

Timing Anomalies
Timing anomaly = the local worst case does not imply the global worst case.
[Diagram: scheduling anomaly. Two schedules of tasks A-E on two shared resources; reordering B and C on resource 2 changes the overall completion time.]

Timing Anomalies
[Diagram: speculation anomaly. A cache hit on A allows a speculative prefetch of B, which causes a miss on C; a cache miss on A, the local worst case, suppresses the prefetch and leads to a shorter overall execution.]
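A speculation anomaly of this shape can be reproduced with entirely made-up numbers; the point is only the inequality, not the specific timings:

```python
def total_time(a_hits):
    """Toy speculation-anomaly model (all timings hypothetical)."""
    if a_hits:
        t_a = 1        # A hits: the locally better case...
        t_c = 25       # ...but the idle time was used to prefetch B,
                       # evicting C's data: C now misses badly
    else:
        t_a = 10       # A misses: locally the worst case
        t_c = 10       # speculation squashed; C's data undisturbed
    return t_a + t_c

# The local worst case (a miss on A) does NOT yield the global worst case:
assert total_time(True) > total_time(False)
```

This is why an analysis of a timing-anomalous pipeline cannot simply follow the locally worst transition; it must explore both successor states, which blows up the state space.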

The Design Challenge
Wanted: a multi-/many-core architecture with
- No timing anomalies → precise & efficient analysis of individual cores
- Temporal isolation between cores → independent/incremental development & analysis
- and high performance!

Approaches to the Design Challenge
At the level of individual cores:
- Simple in-order pipelines, with static or no branch prediction
- Scratchpad memories or LRU caches
- Software-controlled caches

Approaches to the Design Challenge
For resources shared among multiple cores:
Temporal partitioning, e.g.
- TDMA arbitration of buses and shared memories
- Thread-interleaved pipeline in PRET
Spatial partitioning, e.g.
- Partitioned shared caches
- Partitioned shared DRAM
→ Temporal isolation
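TDMA arbitration makes a core's worst-case bus latency independent of what the other cores do: it follows from the slot layout alone. A sketch, where the slot layout and single-cycle service time are assumptions for illustration:

```python
def access_latency(arrival, my_slot, slot_len, n_slots, service=1):
    """Latency of one bus access under TDMA arbitration. The access must
    fit entirely within one of this core's slots; otherwise it waits for
    the start of the next window."""
    period = slot_len * n_slots
    start = my_slot * slot_len
    t = arrival % period
    if start <= t <= start + slot_len - service:
        wait = 0                       # fits in the current window
    else:
        wait = (start - t) % period    # wait for the next window
    return wait + service

# Worst case: the request arrives just after its slot can no longer serve
# it, and waits (n_slots - 1) * slot_len cycles regardless of other cores.
worst = max(access_latency(a, my_slot=0, slot_len=4, n_slots=4) for a in range(16))
```

Because `worst` depends only on `slot_len`, `n_slots`, and `service`, the bound can be plugged directly into a per-core WCET analysis, which is the temporal-isolation argument in miniature.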

Design Challenge: Predictable Pipelining
[Figure from Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 2007.]

Pipelining: Hazards
[Figure from Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 2007.]

Forwarding helps, but not all the time
LD R1, 45(R2)
DADD R5, R1, R7
BE R5, R3, R0
ST R5, 48(R2)
[Pipeline diagrams (stages F D E M W): unpipelined execution; "the dream" of perfect overlap; and "the reality", with stalls caused by a memory hazard, a data hazard, and a branch hazard.]

Solution: PTARM Thread-interleaved Pipelines [Lickly et al., CASES 2008]
[Diagram: five hardware threads T1-T5; in any given cycle, each thread occupies a different stage (F D E M W) of the pipeline.]
Each thread occupies only one stage of the pipeline at a time:
→ No hazards; perfect utilization of the pipeline
→ Simple hardware implementation (no forwarding, etc.)
→ Each instruction takes the same amount of time
→ Temporal isolation between different hardware threads
Drawback: reduced single-thread performance
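The round-robin interleaving can be sketched as a schedule table: with as many threads as pipeline stages, each thread occupies exactly one stage per cycle, so no two instructions of the same thread are ever in flight together and hazards cannot arise. (A sketch of the scheduling idea, not of the PTARM implementation.)

```python
def interleaved_schedule(n_threads=5, n_stages=5, cycles=10):
    """Round-robin thread interleaving: cycle c fetches from thread
    c % n_threads; the instruction fetched at cycle c is in stage s
    at cycle c + s."""
    stages = [[None] * n_stages for _ in range(cycles)]
    for c in range(cycles):
        for s in range(n_stages):
            if c - s >= 0:
                stages[c][s] = (c - s) % n_threads   # thread in stage s at cycle c
    return stages

sched = interleaved_schedule()
# Once the pipeline is full, every stage holds a different thread:
assert sorted(sched[6]) == [0, 1, 2, 3, 4]
```

The same table also shows the drawback stated on the slide: each thread issues only every `n_threads` cycles, so single-thread performance drops by that factor.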

Design Challenge: DRAM Controller
Translates sequences of memory accesses by clients (CPUs and I/O) into legal sequences of DRAM commands:
- Needs to obey all timing constraints
- Needs to insert refresh commands sufficiently often
- Needs to translate physical memory addresses into row/column/bank tuples
[Diagram: CPUs and I/O devices connected through an interconnect with arbitration to the memory controller, which drives the DRAM module.]
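The physical-address translation mentioned on this slide might look as follows, assuming a simple row:bank:column bit layout; real controllers use varying, often undocumented mappings, so the field widths and order here are illustrative assumptions:

```python
def decode_address(phys_addr, col_bits=10, bank_bits=3, row_bits=14):
    """Split a physical address into (row, bank, column) fields under an
    assumed row:bank:column interleaving."""
    col = phys_addr & ((1 << col_bits) - 1)
    bank = (phys_addr >> col_bits) & ((1 << bank_bits) - 1)
    row = (phys_addr >> (col_bits + bank_bits)) & ((1 << row_bits) - 1)
    return row, bank, col

assert decode_address(0) == (0, 0, 0)
```

The choice of which address bits select the bank matters for predictability: placing bank bits in low positions spreads consecutive accesses over banks, while high bank bits keep a task's accesses within one bank.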

Dynamic RAM Timing Constraints
DRAM memory controllers have to conform to timing constraints that define minimal distances between consecutive DRAM commands. Almost all of these constraints are due to the sharing of resources at different levels of the hierarchy:
- Rows within a bank share the sense amplifiers
- Banks within a DRAM device share I/O gating and control logic
- Different ranks share the data/address/command busses
In addition, the controller needs to insert refresh commands sufficiently often.
[Diagram: internal structure of a DIMM, from the DRAM cell (transistor + capacitor) through banks with row/column decoders and row buffers, to x16 devices grouped into two ranks sharing the busses.]

General-Purpose DRAM Controllers
Schedule DRAM commands dynamically. Timing is hard to predict even for a single client:
- Timing of a request depends on past requests: request to the same or a different bank? Request to an open or a closed row within a bank? The controller might reorder requests to minimize latency.
- Controllers dynamically schedule refreshes.
No temporal isolation. Timing depends on the behavior of other clients:
- They influence the sequence of past requests
- Arbitration may or may not provide guarantees

General-Purpose DRAM Controllers
[Example: thread 1 issues Load B1.R3.C2, Load B2.R4.C3, Store B4.R3.C5; thread 2 issues Load B3.R3.C2, Load B3.R5.C3, Store B2.R3.C5. The arbiter interleaves the two streams, and the resulting merged order, and hence the timing, depends on the co-running thread.]

General-Purpose DRAM Controllers
[Example: the requests Load B1.R3.C2, Load B1.R4.C3, Load B1.R3.C5 translate into the command sequence RAS B1.R3, CAS B1.C2, RAS B1.R4, CAS B1.C3, RAS B1.R3, CAS B1.C5; the controller may instead issue RAS B1.R3, CAS B1.C2, CAS B1.C5, RAS B1.R4, CAS B1.C3, serving both accesses to row 3 while it is open.]

PRET DRAM Controller: Three Innovations [Reineke et al., CODES+ISSS 2011]
Expose the internal structure of DRAM devices:
- Expose individual banks within the DRAM device as multiple independent resources
Defer refreshes to the end of transactions:
- Allows the controller to hide refresh latency
Perform refreshes "manually":
- Replace the standard refresh command with multiple reads
[Diagram: CPUs and I/O connected through interconnect + arbitration to the PRET DRAM controller, which treats each DRAM bank as an independent resource.]

PRET DRAM Controller: Exploiting the Internal Structure of the DRAM Module
- The module consists of 4-8 banks in 1-2 ranks; they share only the command and data bus and are otherwise independent
- Partition the banks into four groups in alternating ranks
- Cycle through the groups in a time-triggered fashion
Successive accesses to the same group obey the timing constraints; reads/writes to different groups do not interfere.
→ Provides four independent and predictable resources
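The time-triggered cycling through bank groups can be sketched as a fixed periodic schedule; a request that just missed its group's slot waits at most one full round. This is a simplified view of the idea, not of the actual controller's command-level schedule:

```python
def pret_slot_owner(cycle, n_groups=4):
    """Time-triggered arbitration: bank group g may use the shared
    command/data bus only in cycles where cycle % n_groups == g, so the
    four groups never compete with each other."""
    return cycle % n_groups

def worst_case_wait(n_groups=4):
    """A request that just missed its group's slot waits n_groups - 1
    cycles for the next one, independent of the other groups' traffic."""
    return n_groups - 1

# Each group gets a bus slot exactly once per 4-cycle round:
assert [pret_slot_owner(c) for c in range(8)] == [0, 1, 2, 3, 0, 1, 2, 3]
```

The predictability argument is the same as for TDMA buses: the worst-case wait is a constant of the schedule, not a function of the co-running clients.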

Conventional DRAM Controller (DRAMSim2) vs. PRET DRAM Controller: Latency Evaluation
[Figure 9: latencies of the conventional and PRET memory controllers under varying interference (0-3 other threads) for fixed transfer sizes of 1024 B and 4096 B.]
[Figure 10: average latencies of both controllers for varying transfer sizes (up to 4096 bytes) at maximal interference.]
More information: http://chess.eecs.berkeley.edu/pret/

Emerging Challenge: Microarchitecture Selection & Configuration
Given embedded software with timing requirements and a family of microarchitectures (a platform), select a microarchitecture that
a) satisfies all timing requirements, and
b) minimizes cost/size/energy.
Choices:
- Processor frequency
- Sizes and latencies of local memories
- Latency and bandwidth of the interconnect
- Presence of a floating-point unit

Conclusions
Challenges in modeling, analysis, and design remain. Progress based on automation, abstraction, and partitioning has been made.
Thank you for your attention!