Program of the second part of the course
- Introduction: reliability, performance, risk
- Software Performance Engineering
- Layered Queueing Models
- Stochastic Petri Nets
- New trends in software modeling: metamodeling, MOF/QVT, UML 2

SOFTWARE PERFORMANCE ENGINEERING
Introduction
A systematic, quantitative approach to cost-effectively building systems that meet performance objectives. The focus is on both process and techniques, with modeling as part of the software lifecycle.

The Holy Grail: a software model that supports every lifecycle phase
- Requirements: performance bounds, scalability
- Design: architecture selection
- Implementation: performance understanding
- Execution: dynamic optimization
- Maintenance: capacity planning, static optimization
SPE Process
Assess performance risk, identify critical use cases, select key performance scenarios, and establish performance objectives. Construct the performance model, add resource requirements, and evaluate it. If the result is feasible and performance is acceptable, proceed; if infeasible, modify or create scenarios, modify the product concept, or revise the objectives. Verify and validate the models throughout.

Model Lifecycle
  Phase               | Software Model | Resource Requirements | System Model
  Early design        | Rough          | Rough estimates       | None
  Detailed design     | Detailed       | Estimates             | Yes
  Implementation      | Yes            | Revised & measured    | Yes
  Post-implementation | Yes            | Measured              | Yes
Software Execution Graphs
An execution graph has computation nodes (C1..C4), loop nodes (L1), and branch nodes (P1, P2). It supports mean-, best-, and worst-case scenario analysis by static analysis. L1, P1, and P2 are data dependencies, typically functions of the inputs, e.g. L1(# of employees), P1(# of women). The level of abstraction varies.

For the graph above (C1, followed by a loop of L1 iterations whose body executes C2 and then branches to C3 with probability P1 or to C4 with probability P2), the total time is:

  T_x = t_C1 + L1 * (t_C2 + P1 * t_C3 + P2 * t_C4)

Software Execution Graphs: Web Example
A checkout scenario spanning several tiers: the client request reaches the Web Server (WS.Checkout), which calls the Business Logic (BL.Checkout); the business logic calculates the order, updates the DB (DB Server), renders the page, sends an email (Mail Server), and replies to the web server.
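The execution-graph reduction above can be sketched in a few lines of Python. The node times, loop count, and branch probabilities below are hypothetical illustrative values, not taken from the slides:

```python
# Reduce an execution graph to a total time: C1 in sequence with a loop
# (L1 iterations) whose body is C2 followed by a branch between C3
# (probability P1) and C4 (probability P2).
def execution_graph_time(t, L1, P1, P2):
    return t["C1"] + L1 * (t["C2"] + P1 * t["C3"] + P2 * t["C4"])

# Hypothetical per-node demands in seconds
t = {"C1": 0.5, "C2": 0.1, "C3": 0.2, "C4": 0.05}
Tx = execution_graph_time(t, L1=10, P1=0.3, P2=0.7)  # -> 2.45
```

Best-, mean-, and worst-case analyses simply evaluate the same expression with the corresponding node demands.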
Resource Requirements
Each node of the execution graph is annotated with device requirements. For node C1 (best / mean / worst case):

  Device           | Best | Mean | Worst
  CPU (Kinstr.)    | 10   | 13   | 15
  Disk (I/Os)      | 3    | 4    | 6
  Network (N msgs) | 4    | 7    | 8

Evaluation Process
Each node C has a resource usage vector v_C (flops, language operations, instructions, memory references, etc.). A hardware model hm (statistical, analytical, or simulation) maps a resource usage vector to a time. The total time of the graph becomes:

  T_x = hm(v_C1) + L1 * (hm(v_C2) + P1 * hm(v_C3) + P2 * hm(v_C4))
Software Resource Requirements
Types of resources:
- CPU usage
- SQL operations
- File I/O
- Messages
- Authentications
- Middleware calls
- Inter-process communication
- User delays
- Work units

Define classes of operations and think about their effect on hardware: one software resource may use multiple hardware devices. Use a low number of resources in early design and increase it later. This requires the cooperation of experts.

Computer Resource Requirements: Overhead Matrix
The overhead matrix maps each software resource to the hardware devices it consumes:

  Device       | CPU   | Disk      | Delay | Network
  Quantity     | 2     | 2         | 1     | 1
  Units        | Sec   | Phys. I/O | Sec   | Msgs
  WorkUnit     | 0.01  |           |       |
  I/O          | 0.005 | 1         |       |
  Msgs         | 0.004 | 1         |       | 1
  Delay        |       |           | 1     |
  Service time | 1     | 0.002     | 1     | 0.05
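Mapping software resource requirements to device demands via the overhead matrix is essentially a weighted matrix-vector product. A minimal sketch, using the matrix above and hypothetical per-step software requirements:

```python
# Software resource requirements of one scenario step (hypothetical values)
software = {"WorkUnit": 5.0, "I/O": 2.0, "Msgs": 1.0, "Delay": 0.0}

# Overhead matrix: software resource -> device units consumed per request
overhead = {
    "WorkUnit": {"CPU": 0.01},
    "I/O":      {"CPU": 0.005, "Disk": 1.0},
    "Msgs":     {"CPU": 0.004, "Disk": 1.0, "Network": 1.0},
    "Delay":    {"Delay": 1.0},
}
# Service time per device unit (seconds)
service_time = {"CPU": 1.0, "Disk": 0.002, "Delay": 1.0, "Network": 0.05}

def device_demand(software, overhead, service_time):
    """Total service demand (seconds) placed on each device."""
    demand = {dev: 0.0 for dev in service_time}
    for res, amount in software.items():
        for dev, units in overhead[res].items():
            demand[dev] += amount * units * service_time[dev]
    return demand

demand = device_demand(software, overhead, service_time)
```

The resulting per-device demands are the service times fed into the system execution model.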
System Execution Models
The software model does not consider resource contention; the system execution model adds it. Its outputs are:
- resource contention metrics
- sensitivity analysis with respect to workload parameters
- bottleneck analysis
- interference of background loads
The system execution model is a Queueing Network Model (QNM) — you had better find a tool for this!

A QNM Example
Three processing steps of a scenario, annotated with software resource requirements (WorkUnits, DB calls, Msgs):
- check availability: 5 WorkUnits, 2 DB, 0 Msgs
- calculate sum: 3 WorkUnits, 1 DB, 0 Msgs
- return html: 6 WorkUnits, 2 DB, 1 Msg
The queueing network has CPU, Disk, and Network service centers. Total usage per transaction: CPU 4211 Kinstr, Disk 20 I/Os, Network 1 msg, yielding service demands of 0.0276 s (CPU), 0.4500 s (Disk), and 0.1203 s (Network). Outputs: throughput, utilization, residence time, queue lengths, response times.
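Assuming the per-transaction demands above and a hypothetical arrival rate, an open product-form QNM with M/M/1 service centers can be evaluated with the standard utilization and residence-time formulas. A minimal sketch:

```python
def open_qnm(demands, arrival_rate):
    """Per-device (utilization, residence time) and total response time
    for an open queueing network of M/M/1 service centers.
    `demands` maps device -> total service demand per job (seconds)."""
    metrics = {}
    response = 0.0
    for dev, D in demands.items():
        U = arrival_rate * D              # utilization law: U = lambda * D
        assert U < 1, f"device {dev} is saturated"
        R = D / (1 - U)                   # residence time including queueing
        metrics[dev] = (U, R)
        response += R
    return metrics, response

# Demands from the example; 1.5 transactions/sec is a hypothetical load
demands = {"CPU": 0.0276, "Disk": 0.4500, "Network": 0.1203}
metrics, R = open_qnm(demands, arrival_rate=1.5)
```

Varying `arrival_rate` gives exactly the sensitivity and bottleneck analyses listed above: the Disk, with the largest demand, saturates first.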
Measurement Usage in SPE
- System understanding: measurements of similar systems to understand the target system
- Model specification: workload data & resource usage
- Model validation & verification
- Model updates: represent software changes in the model
- Software performance evaluation

Types of Measurement Data
- Workload data (e.g. frequency of transactions)
- Data characteristics (e.g. records in DB)
- Execution characteristics: path characteristics (e.g. loop iterations), software resource usage (e.g. DB queries), processing overhead (DB queries to disk I/O)
- Computer system usage: scenario response time (e.g. transaction delay), scenario throughput (e.g. transactions served/sec), resource utilization, throughput, queue lengths
SPE·ED: Overview
- Performance engineering services
- Target systems: distributed, web-based, client-server, mainframes
- Users: performance engineers, software architects
- Paradigm: Software Performance Engineering (SPE)
- Technology: analytics and simulation (CSIM)
- Recently extended to support distributed object technology (process composition, component communication & coordination)

SPE·ED: Process
Initial estimates feed the software model, which feeds the system model; results go to performance visualization, CSIM, and export to an RDBMS.
SPE·ED: Example SEG
(Figure only.)

LAYERED QUEUEING MODELS
Introduction
Execution graphs and flat queueing models are not well suited to client-server mechanisms and layered architectures. Layered Queueing Models take into account contention for software servers as well as for devices.

Basic Idea
(Figure: a layered model with a Client task and software servers S1-S4 running on processors P1-P4, decomposed into three submodels. Tasks are annotated with multiplicity TM, per-phase service times ST, and think time Z; arcs carry visit counts V; processors use processor sharing, PS.)
LQN as an Extension of QN
An LQN extends QN models:
- it models both software tasks (rectangles) and hardware devices (circles)
- it represents nested services (a server is also a client to other servers)
- software components have entries corresponding to different services
- arcs represent service requests (synchronous and asynchronous)
- multi-servers model components with internal concurrency
LQN limitations: approximate solution; some architectural styles (e.g., peer-to-peer) are hard to model.
(Figure: a ClientT task on the Client CPU requesting service1 and service2 of an Appl task, which calls entries Query1 and Query2 of a DB task on the DB CPU, accessing Disk1 and Disk2.)

Types of LQN Requests
a) Synchronous: the client sends a message and blocks; the server executes phase 1 (s1), replies, then continues with phase 2 (s2), possibly calling included services.
b) Asynchronous: the client sends a message and does not block; the server executes phase 1 (s1) and phase 2 (s2), with no reply.
c) Forwarding: the client sends a synchronous message and waits; Server1 executes phase 1 (s1) and forwards the request to Server2, then continues with phase 2 (s2); Server2 executes phase 1 (t1), replies to the original client, then executes phase 2 (t2).
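The nested-service idea can be illustrated with a deliberately simplified two-layer approximation: the DB task's response time (including queueing) becomes part of the app task's service time, and each layer is treated as an M/M/1 queue. This is a sketch only; real LQN solvers use more refined iterative, MVA-based algorithms, and all parameter values below are hypothetical:

```python
def layered_response(arrival_rate, cpu_demand_app, cpu_demand_db, calls_to_db):
    """Bottom-up response-time approximation for a client -> app -> DB chain.
    Each layer is approximated as an M/M/1 queue; second phases and
    multi-threaded tasks are ignored."""
    def mm1_response(lam, s):
        rho = lam * s
        assert rho < 1, "layer is saturated"
        return s / (1 - rho)

    # The DB sees calls_to_db requests per app-level request
    r_db = mm1_response(arrival_rate * calls_to_db, cpu_demand_db)
    # The app task holds its server for its own CPU time plus the
    # (queueing-inflated) nested DB calls -- the essence of layering
    s_app = cpu_demand_app + calls_to_db * r_db
    return mm1_response(arrival_rate, s_app)

R = layered_response(arrival_rate=2.0, cpu_demand_app=0.05,
                     cpu_demand_db=0.02, calls_to_db=2)
```

Note how contention at the lower layer inflates the effective service time of the software server above it, which a flat QN cannot express.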
LQN Extensions: Activities, Fork/Join
(Figure: a web system with 1..m local client workstations and 1..n remote clients over the Internet, a Web Server task whose entry e4 is refined into an activity graph a1-a4 with an AND-fork (&), a secure e-commerce server, and a DB task with its processor and disks.)

STOCHASTIC PETRI NETS
Definition
Petri Nets (PN) are a graphical paradigm for the formal description of the logical interactions among parts, and of the flow of activities, in complex systems. PN are particularly suited to model:
- concurrency and conflict
- sequencing, conditional branching, and looping
- synchronization
- sharing of limited resources
- mutual exclusion

Petri Nets vs Time
The original PN did not convey any notion of time. For performance analysis it is necessary to introduce the duration of the events associated with PN transitions. Timed models were subsequently explored extensively, following two main lines:
- random durations: Stochastic PN (SPN)
- deterministic or interval durations: Timed PN (TPN)
Definitions
A Petri net (PN) is a bipartite directed graph consisting of two kinds of nodes: places and transitions. Places typically represent conditions within the system being modeled; transitions represent events occurring in the system that may change the condition of the system. Arcs connect places to transitions and transitions to places (never place to place or transition to transition).

Example of a PN
(Figure.) p1 = resource idle, p2 = resource busy; t1 = task arrives, t2 = task completes.
Example of a PN (continued)
(Figure.) The same model with an added place: p1 = resource idle, p2 = resource busy, p3 = user; t1 = task arrives, t2 = task completes.

Definition of PN
A PN is a 5-tuple (P, T, I, O, M):
- P: set of places
- T: set of transitions
- I: input arcs
- O: output arcs
- M: marking
Arcs
Input arcs are directed arcs drawn from places to transitions, representing the conditions that must be satisfied for the event to be activated. Output arcs are directed arcs drawn from transitions to places, representing the conditions resulting from the occurrence of the event.

Places
The input places of a transition are the places connected to it through input arcs; the output places of a transition are the places to which output arcs exist from it.
Tokens
Tokens are dots (or integers) associated with places; a place containing tokens indicates that the corresponding condition holds. The marking of a Petri net is a vector listing the number of tokens in each place of the net:

  m = (m_1, m_2, ..., m_|P|), where |P| is the number of places

Enabling and Firing of a Transition
When the input places of a transition contain the required number of tokens, the transition is enabled. An enabled transition may fire (the event happens), removing one token from each input place and depositing one token in each of its output places.
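The enabling and firing rules above can be captured in a few lines. A minimal sketch (the class and the encoding of transitions are illustrative, not any tool's format):

```python
class PetriNet:
    """Minimal place/transition net: the marking is a dict place -> tokens;
    each transition is a pair (input places, output places)."""
    def __init__(self, marking, transitions):
        self.m = dict(marking)
        self.t = transitions

    def enabled(self, name):
        inputs, _ = self.t[name]
        # every input place must hold at least one token
        return all(self.m[p] >= 1 for p in inputs)

    def fire(self, name):
        assert self.enabled(name)
        inputs, outputs = self.t[name]
        for p in inputs:           # remove one token from each input place
            self.m[p] -= 1
        for p in outputs:          # deposit one token in each output place
            self.m[p] += 1

# The idle/busy resource model from the earlier example:
# t1 (task arrives) moves the token from p1 (idle) to p2 (busy)
net = PetriNet({"p1": 1, "p2": 0},
               {"t1": (["p1"], ["p2"]), "t2": (["p2"], ["p1"])})
net.fire("t1")
```

After firing t1, only t2 (task completes) is enabled, matching the mutual exclusion between the idle and busy conditions.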
Enabling & Firing of Transitions
(Figure: a 2-processor failure/repair model. Tokens move from place "up" to place "down" as t_fail fires twice, and back as t_repair fires.)

Example of PN
(Figure only.)
Concurrency (or Parallelism)
(Figure only.)

Synchronization
(Figure only.)
Limited Resources
(Figure only.)

Producer/Consumer
(Figure only.)
Producer/Consumer with Limited Buffer
(Figure only.)

Mutual Exclusion
(Figure only.)
Extensions of PN Models
- arc multiplicity
- inhibitor arcs
- priority levels
- enabling functions (guards)

Arc Multiplicity
A cardinality (or multiplicity) may be associated with input and output arcs, changing the enabling and firing rules as follows:
- Each input place must contain at least as many tokens as the cardinality of the corresponding input arc.
- When the transition fires, it removes from each input place as many tokens as the cardinality of the corresponding input arc, and deposits in each output place as many tokens as the cardinality of the corresponding output arc.
Inhibitor Arc
(Figure: a transition tk with input place pi, output place pj, and an inhibitor place.) Inhibitor arcs are represented with a circle-headed arc. The transition can fire iff the inhibitor place contains no tokens.

Inhibitor Arc: Example
(Figure only.)
An Example: Before Firing
(Figure: the marking before the transition fires; the arc label gives the cardinality of the output arc.)

An Example: After Firing
(Figure: the marking after the transition fires.)
Priority Levels
A priority level can be attached to each PN transition. The standard execution rules are modified: among all the transitions enabled in a given marking, only those with the highest associated priority level are allowed to fire.

Enabling Functions
An enabling function (or guard) is a boolean expression composed of the PN primitives (places, transitions, tokens). The enabling rule is modified: besides the standard conditions, the enabling function must evaluate to true. Example, for a transition tk with input place pi and output place pj:

  guard(tk) = (#P1 < 2) and (#P2 = 0)
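The extended enabling rule (arc multiplicity, inhibitor arcs, and guards) combines three checks. A minimal sketch, using a hypothetical dict encoding of a transition:

```python
def enabled(marking, transition):
    """Extended enabling test: input-arc multiplicity, inhibitor arcs,
    and a guard. Priority levels would add a further filter: among all
    enabled transitions, only those of maximal priority may fire."""
    inputs = transition.get("in", {})          # place -> arc cardinality
    inhibitors = transition.get("inhibit", [])
    guard = transition.get("guard", lambda m: True)
    return (all(marking[p] >= k for p, k in inputs.items())   # multiplicity
            and all(marking[p] == 0 for p in inhibitors)      # inhibitors
            and guard(marking))                               # guard

m = {"P1": 1, "P2": 0, "P3": 2}
tk = {"in": {"P3": 2},                    # needs 2 tokens in P3
      "inhibit": ["P2"],                  # disabled while P2 is marked
      "guard": lambda m: m["P1"] < 2 and m["P2"] == 0}
```

With this marking, tk is enabled; putting a token in P2, or removing one from P3, disables it.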
Stochastic Petri Nets (SPN)
Petri nets are extended by associating time with the firing of transitions, resulting in timed Petri nets. A special case of timed Petri nets is the stochastic Petri net (SPN), where the firing times are random variables.

SPN: A Simple Example
Server failure/repair: place p1 (up) and place p2 (down); transition t1 fires with rate λ (failure) and transition t2 with rate µ (repair). The reachability graph has the two markings (1,0) and (0,1), connected by t1 and t2, and is isomorphic to a two-state CTMC with rates λ and µ.
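The two-state CTMC obtained from the failure/repair SPN has a closed-form steady state, from the balance equation pi_up * λ = pi_down * µ together with pi_up + pi_down = 1. A small sketch (rates are hypothetical):

```python
def failure_repair_steady_state(lam, mu):
    """Steady-state probabilities of the up/down CTMC derived from the
    failure/repair SPN: balance pi_up * lam = pi_down * mu,
    normalization pi_up + pi_down = 1."""
    pi_up = mu / (lam + mu)
    return pi_up, 1 - pi_up

# Hypothetical rates: one failure per 100 hours, one repair per hour
up, down = failure_repair_steady_state(lam=0.01, mu=1.0)
```

Here `up` is the steady-state availability of the server.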
SPN: Poisson Process
A Poisson process with rate λ is modeled by a single timed transition with rate λ depositing tokens into a place. The reachability graph is the infinite CTMC 0 → 1 → 2 → ... with rate λ on every arc.

SPN: M/M/1 Queue
An M/M/1 queue is modeled by an arrival transition with rate λ and a service transition with rate µ around a single place. The reachability graph is the birth-death CTMC 0 ⇄ 1 ⇄ 2 ⇄ ... with birth rate λ and death rate µ.
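The birth-death CTMC of the M/M/1 SPN has well-known closed-form steady-state measures, which make a handy sanity check for any numerical solver. A minimal sketch (rates are illustrative):

```python
def mm1_metrics(lam, mu):
    """Classic M/M/1 results for the birth-death chain above:
    utilization rho, mean number in system L, mean response time W
    (consistent with Little's law, L = lam * W)."""
    rho = lam / mu
    assert rho < 1, "queue is unstable"
    L = rho / (1 - rho)      # mean number in system
    W = 1 / (mu - lam)       # mean response time
    return rho, L, W

rho, L, W = mm1_metrics(lam=3.0, mu=5.0)  # rho=0.6, L=1.5, W=0.5
```

Truncating the chain at some maximum population and solving it numerically should converge to these values as the truncation level grows.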
Generalized SPN
Sometimes events take an extremely small time to occur, and it is useful to model them as instantaneous activities. SPN models were extended to allow this by letting some transitions, called immediate transitions, have zero firing time; the remaining transitions, called timed transitions, have exponentially distributed firing times.

Generalized SPN: Enabling Rules
The enabling rules are modified: if both an immediate and a timed transition are enabled in a marking, the immediate transition has priority. If more than one immediate transition is enabled in a marking, the conflict is resolved by assigning firing probabilities to the immediate transitions. (Figure: after timed transition T fires, immediate transition t is enabled; immediate transitions t1 and t2 then fire with probabilities p and 1-p.)
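In a simulation of a GSPN, the probabilistic conflict resolution among enabled immediate transitions is a simple weighted random choice. A minimal sketch (names and weights are hypothetical):

```python
import random

def resolve_immediate(enabled_immediate, rng):
    """Choose which immediate transition fires when several are enabled
    in the same marking, according to their firing probabilities.
    `enabled_immediate` is a list of (name, probability) pairs."""
    names, probs = zip(*enabled_immediate)
    return rng.choices(names, weights=probs, k=1)[0]

rng = random.Random(0)          # seeded for reproducibility
winner = resolve_immediate([("t1", 0.7), ("t2", 0.3)], rng)
```

Over many draws, t1 is chosen about 70% of the time, matching p = 0.7 in the figure.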
Measures of Reliability & Performance
Solving the model means evaluating the (transient or steady-state) probability vector over the state space (the markings). However, the modeler wants to interact only with the PN: the analytical procedure must be completely transparent to the analyst. There is therefore a need to define the output measures at the PN level, in terms of the PN primitives.

Output measures defined at the PN level:
- probability of a given condition on the PN
- time spent in a marking
- mean (first) passage time
- distribution of tokens in a place
- expected number of firings of a PN transition (throughput)
Solving Models with SPN
The use of SPN requires only the topology of the PN, the firing rates of the transitions, and the specification of the output measures. All the subsequent steps:
- generation of the reachability graph
- generation of the associated Markov chain
- transient and steady-state solution of the Markov chain
- evaluation of the relevant process measures
must be completely automated by a computer program, making the associated mathematics transparent to the user.

Example: Multiprocessor with Failure
- Number of processors: n
- A single repair facility is shared by all processors
- A reconfiguration is needed after a covered fault
- A reboot is required after an uncovered fault
Assumptions:
- The failure rate of each processor is γ
- The repair times are exponentially distributed with mean 1/τ
- A processor fault is covered with probability c
- The reconfiguration times and the reboot times are exponentially distributed with parameters δ and β, respectively

GSPN Model for Multiprocessor
(Figure: GSPN model of the multiprocessor.)
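The first automated step listed above, generating the reachability graph and the CTMC edges, can be sketched for a drastically simplified version of this example: two processors with failure and repair only, ignoring coverage, reconfiguration, and reboot, and with constant (not marking-dependent) rates. The encoding of transitions below is illustrative, not any tool's input format:

```python
def reachability_ctmc(initial, transitions):
    """Explore the reachability graph of an SPN breadth-first and collect
    the CTMC edges. `transitions` is a list of (inputs, outputs, rate)
    with inputs/outputs given as place -> cardinality dicts."""
    def is_enabled(m, ins):
        return all(m.get(p, 0) >= k for p, k in ins.items())

    def fire(m, ins, outs):
        m2 = dict(m)
        for p, k in ins.items():
            m2[p] -= k
        for p, k in outs.items():
            m2[p] = m2.get(p, 0) + k
        return m2

    frontier = [initial]
    states = {tuple(sorted(initial.items())): initial}
    edges = []                       # (from_marking, to_marking, rate)
    while frontier:
        m = frontier.pop()
        for ins, outs, rate in transitions:
            if is_enabled(m, ins):
                m2 = fire(m, ins, outs)
                key = tuple(sorted(m2.items()))
                if key not in states:
                    states[key] = m2
                    frontier.append(m2)
                edges.append((m, m2, rate))
    return list(states.values()), edges

gamma, tau = 0.001, 1.0              # hypothetical failure/repair rates
states, edges = reachability_ctmc(
    {"up": 2, "down": 0},
    [({"up": 1}, {"down": 1}, gamma),    # a processor fails
     ({"down": 1}, {"up": 1}, tau)])     # the repair facility fixes one
```

The three markings (2 up, 1 up, 0 up) and four edges define the CTMC generator to be solved for transient or steady-state measures; a refinement would make the failure rate marking-dependent (#up · γ) and add the immediate coverage choice.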