Performance and Capacity Management Best Practices

Similar documents
Performance Management for

IBM z13 Software Pricing Announcements

System z Batch Network Analyzer Tool (zbna) - Because Batch is Back!

Can You Teach a New Capacity & Performance Specialist Old Tricks? Hindsight and Insight

Understanding Linux on z/vm Steal Time

Capacity Planning: Where the Mistakes Are Session: 11598

CA MICS Resource Management r12.6

Unifying IT How Dell Is Using BMC

The Consolidation Process

Hot Tips from Cheryl & Frank

We will discuss Capping Capacity in the traditional sense as it is done today.

Monitoring IBM HMC Server. eg Enterprise v6

zpcr Capacity Sizing Lab Part 2 Hands-on Lab

IBM Tivoli Storage Productivity Center (TPC)

CA MICS Resource Management r12.7

/ Operating Systems I. Process Scheduling. Warren R. Carithers Rob Duncan

WHITE PAPER. SAS IT Intelligence. Balancing enterprise strategy, business objectives, IT enablement and costs

Capacity Analysis Techniques Applied to VMware VMs (aka When is a Server not really a Server?)

High performance ETL Benchmark

Real world experiences for CMDB Success

CA Insight Database Performance Monitor for DB2 for z/os

CICS Transactions Measurement with no Pain

Challenges of Capacity Management in Large Mixed Organizations

IBM Sterling Transportation Management System

IBM SmartCloud Workload Automation

Practical Performance Understanding the Performance of Your Application

How to Build a Service Management Hub for Digital Service Innovation

Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving

IBM System z9 and ^ zseries Software Pricing Reference Guide

Performance rule violations usually result in increased CPU or I/O, time to fix the mistake, and ultimately, a cost to the business unit.

Predictive Analytics And IT Service Management

BROCADE PERFORMANCE MANAGEMENT SOLUTIONS

Capacity Management Analytics V2.1

CA SYSVIEW Performance Management r13.0

Condusiv s V-locity Server Boosts Performance of SQL Server 2012 by 55%

zenterprise The Ideal Platform For Smarter Computing Developing Hybrid Applications For zenterprise

Technology Update White Paper. High Speed RAID 6. Powered by Custom ASIC Parity Chips

Identifying & Implementing Quick Wins

SHARE in Pittsburgh Session 15591

FAQ: HPA-SQL FOR DB2 MAY

The Total Cost of Ownership (TCO) of migrating to SUSE Linux Enterprise Server for System z

An Oracle White Paper November Oracle Real Application Clusters One Node: The Always On Single-Instance Database

An Introduction to PRINCE2

The Truth Behind IBM AIX LPAR Performance

Welcome to today's webinar: How to Transform RMF & SMF into Availability Intelligence

IBM Rational Performance Tester for z/os simplifies test creation, load generation, and analysis processes

ITIL Intermediate Qualification: Service Offerings and Agreement. Webinar presentation 8 May 2009

Introduction to Analytical Modeling

Performance Optimization Guide

Disk Storage Shortfall

Benefits and Potential Drawbacks to Implementing SAP as a Hosted Solution; Run SAP Like a Factory Factory Controls

Handouts for teachers

Product Brief SysTrack VMP

Datasheet FUJITSU Cloud Monitoring Service

Performance Best Practices Guide for SAP NetWeaver Portal 7.3

Redbooks Paper. Local versus Remote Database Access: A Performance Test. Victor Chao Leticia Cruz Nin Lei

Why Relative Share Does Not Work

Event-Driven and Dynamic Process Automation. Enabling the Real Time Enterprise with Redwood Software

Integrated Application and Data Protection. NEC ExpressCluster White Paper

SuSE Linux High Availability Extensions Hands-on Workshop

Hard Partitioning and Virtualization with Oracle Virtual Machine. An approach toward cost saving with Oracle Database licenses

<Insert Picture Here> Oracle Database Support for Server Virtualization Updated December 7, 2009

z/vm Capacity Planning Overview

Server Consolidation with SQL Server 2008

x64 Servers: Do you want 64 or 32 bit apps with that server?

Virtualization and the U2 Databases

IBM System z Software Pricing Guide Share Europe

ASF: Standards-based Systems Management. Providing remote access and manageability in OS-absent environments

MS SQL Performance (Tuning) Best Practices:

Quiz for Chapter 6 Storage and Other I/O Topics 3.10

Value-4IT. GSE UK Conference 2010: Session GC z/os Software Cost Optimization - Several Options for TCO Reduction

Workflow extended notifications

IBM i Version 7.2. Security Service Tools

Federation and a CMDB

Users are Complaining that the System is Slow What Should I Do Now? Part 1

Performance Isolation of a Misbehaving Virtual Machine with Xen, VMware and Solaris Containers

Assessing Your Information Technology Organization

IT SERVICES MANAGEMENT Service Level Agreements

OLAP Services. MicroStrategy Products. MicroStrategy OLAP Services Delivers Economic Savings, Analytical Insight, and up to 50x Faster Performance

Manage your IT Resources with IBM Capacity Management Analytics (CMA)

Berlin Mainframe Summit. Java on z/os IBM Corporation

Load Testing and Monitoring Web Applications in a Windows Environment

WHAT YOU SHOULD KNOW ABOUT MERGING YOUR PRACTICE

EECS 678: Introduction to Operating Systems

Measures of WFM Team Success

About Advent One. Contents. 02 What we do. 03 Infrastructure Services. 04 Cloud and Managed Services. 07 Hosting Desktop. 08 Phone.

Q & A From Hitachi Data Systems WebTech Presentation:

Transcription:

SHARE Boston BOF @Lunch & Learn August 2, 2010 Performance and Capacity Management Best Practices Ivan Gelb, GIS Corp. Phone:732-303-1333 E-mail: ivan@gelbis.com

TRADEMARKS The following are trade or service marks of the IBM Corporation: CICS, CICS TS, CICSPlex, DB2, IBM, MVS, OS/390, z/os, Sysplex, Parallel Sysplex. Any omissions are purely unintended. 2002-10 Systems Corp. URL: www.gelbis.com Phone: 732-303-1333 No part of this material can be reproduced by any means without prior written permission from the author and with proper attribution displayed. 2010 Systems Corp. - www.gelbis.com 2

MOTHER OF ALL DISCLAIMERS (MOAD ) All of the information in this document is tried and true. However, this fact alone cannot guarantee that you can get the same results at your place and with your skills. In fact, some of this advice can be hurtful if it is misused and misunderstood. As with all kinds of analysis, anything you may hear or read can be understood and misunderstood in many ways that may seem contradictory to you. Gelb Information Systems Corporation, Ivan Gelb and any one found anywhere assume no responsibility for this information s accuracy, completeness or suitability for any purpose. Anyone attempting to adapt these techniques to their own environments anywhere do so completely at their own risk. 2010 Systems Corp. - www.gelbis.com 3

Agenda Best Practices Critical Success Factors Selected Hot Topics Q & A Anytime Ask your questions at any time for best results today. 2010 Systems Corp. - www.gelbis.com 4

What Did I Bring For You? 2010 Systems Corp. - www.gelbis.com 5

Best Practices / Recommendations - 1 Always communicate with management in common business terms: orders, deliveries, customers served, and effect on total cost. Always avoid technical jargon and acronyms when communicating with business people. 2010 Systems Corp. - www.gelbis.com 6

Good For ALL Report Example Best: Capacity of our configuration increased from 100 to 123 orders/second without any increase in spending. Very Good: The order entry application can handle 23% more volume without any increase in spending. 150 100 50 0 Project Results Before After Orders/Sec 2010 Systems Corp. - www.gelbis.com 7

Best Practices / Recommendations - 2 Service Level Agreements (SLA) are the foundation for effective Capacity Planning (CP) and Performance Management (PM). Say no more. Say no more. Wink, wink. *One of the Monthy Python Characters The SLA s three key statements: Service level, Stated business activity level, and Consequences if SLA not met 2010 Systems Corp. - www.gelbis.com 8

Best Practices / Recommendations - 3 Effective Capacity Planning (CP) and Performance Management (PM) will yield required service level quantity and quality for least total cost. Design WLM service policy to mirror business activity this enables the most effective CP & PM activities. You will know when you need to do something, and what you may need to pay closer attention to. 2010 Systems Corp. - www.gelbis.com 9

Best Practices / Recommendations - 4 Given less time and more data to analyze, choose your tools and techniques so YOUR effectiveness is improved Practice routinely scheduled z/os Workload Manager (WLM) service policy health checks. Ask: Is it still working as intended? 2010 Systems Corp. - www.gelbis.com 10

Best Practices / Recommendations - 5 Choose tools and techniques that enable analysis of each workload independently and in combination with present and future workloads For capacity planning studies, insure that you isolate workloads not just along business importance but also based on key attributes that affect scalability: physical disk I/O intensity, virtual storage needs, use of z/os services, total CPU time in applications code, network services, etc...etc 2010 Systems Corp. - www.gelbis.com 11

Free (i.e. zpcr) My Favorite Tools Recycled from Borrowed for free Tested, tried and true 2010 Systems Corp. - www.gelbis.com 12

Critical Success Factors (CSF) The enablers Business services mapped to all IT components Business data Technical data Service Level Agreements (SLA) Known IT Total Cost of Ownership (TCO) Established value of IT services ITIL? Six Sigma? Or something like them 2010 Systems Corp. - www.gelbis.com 13

The Enablers Leadership (and the politics of it all) People with skills Governance Tactical plans Strategic plans Resources Tools and techniques 2010 Systems Corp. - www.gelbis.com 14

Technical Enablers Complete synergy between all IT Service Management (ITSM) components Symbiotic relationships among: Quality assurance / Change management Load /Stress testing Performance management Capacity planning Finance Boundaries are powerful disablers! 2010 Systems Corp. - www.gelbis.com 15

Business and Technical Data Current and planned business activity IT services metrics / business unit of work collected during critical periods of activity: Processor Disk I/O All other I/O Required SLA Required Service Level Objectives (SLO) to meet the SLA 2010 Systems Corp. - www.gelbis.com 16

Example of Costs Presentation Source: July 22, 2010, zenterprise Launch, John Shedletsky, IBM Corp. 2010 Systems Corp. - www.gelbis.com 17

Selected Hot Topics Variable Workload License (VWLC) Charging z/os Workload Manager (WLM) PR/SM Considerations The 80/20 of Disk I/O Analysis 2010 Systems Corp. - www.gelbis.com 18

Variable Workload License (VWLC) Charging Method for Software Audience Poll: 1. Sub-capacity licensed now with IBM? 2. Sub-capacity licensed now with other z- Software Vendors? 2010 Systems Corp. - www.gelbis.com 19

VWLC Overview - 1 Variable workload license (VWLC) charging method available in USA since March 2001 for selected IBM software products. Examples: z/os, COBOL, CICS, DB2, CICS, IMS, MQSeries plus over 25 more Started sub-capacity software licensing trend Software license capacity can be dramatically lower than installed hardware capacity. Concept moving ahead very slowly, or not at all, in the independent software vendors (ISV) world. 2010 Systems Corp. - www.gelbis.com 20

VWLC Overview - 2 Basis for sub-capacity of VWLC products is LPAR utilization Monthly charge based on highest rolling 4 hour average by product summed for LPARs w. software present in them Product isolation into LPARs for software capacity planning is a potentially cost saving activity 5 15% monthly software cost savings are possible LPAR s total capacity may be capped via PR/SM to comply with software license agreement 2010 Systems Corp. - www.gelbis.com 21

VWLS and zaap Example Illustration Source: z890 and z990 zaap What it Can Do for You, By Kathy Walsh, IBM Corp. 2010 Systems Corp. - www.gelbis.com 22

Workload Manager (WLM) Audience Poll: 1. Does WLM Goal mode deliver the service levels you hoped for, and protect the most important work? 2. Do you schedule WLM service policy checkups at regular interval, or you wait until..you must! 2010 Systems Corp. - www.gelbis.com 23

WLM Advice - 1 Recommendation: Create service classes aligned with the business importance of the work within them. Example: A service class can be, and we recommend that it should be, a single critical CICS or IMS transaction. Etc 2010 Systems Corp. - www.gelbis.com 24

WLM Advice - 2 Recommendation: Create resource groups for any workload you wish to control regardless of processor utilization level. Example: Service class can be limited to maximum of 1 service unit / second rate (this is the sleeper hold of WLM) 2010 Systems Corp. - www.gelbis.com 25

WLM Role in CP and PM WLM is single most critical success factor (CSF) for CP and PM Insure that critical business workloads are captured in service policy so they are easy to observe and analyze. WLM exercises control on following: WHAT? CPU access priority CPU time limits I/O performance Enclaves for DDF, stored procedures, etc Dynamic batch initiators Storage paging HOW? Task dispatch priority guided by importance and service level goal Defined via resource groups Workload Priority propagation to the I/O controller & PAV-s (parallel access vol.) Coded min/max service level definitions Goal and resource driven controls Isolation to protect working set size 2010 Systems Corp. - www.gelbis.com 26

WLM Definitions Do-s and Don t-s Service Definition Coefficients Percentile Response Time Average Response Time Velocity Goals 2010 Systems Corp. - www.gelbis.com 27

Service Definition Coefficients Recommended service definition coefficients: MSO = 0.0 CPU = 1.0 SRB = 1.0 IOC = 1.0 or less by orders of 10 (0.1 or 0.01; IBM recommends 0.5) Note 1: Potential impact on chargeback algorithms if they use MSO service units in their calculations Note 2: Non-zero MSO value will cause unstable performance under most conditions and regardless of key factors such as CPU or I/O activity 2010 Systems Corp. - www.gelbis.com 28

Percentile Response Time The recommended way to manage the business critical CICS, DB2, and IMS production work Stated as: 90% of transactions with < 1 sec. Resp. If properly defined, prevents service problems caused by long running or never ending transactions Region level goals may be lower WLM CPU overhead than CICS Transaction level goals, but??? Just what has to be managed? What are the SLA terms? 2010 Systems Corp. - www.gelbis.com 29

Average Response Time Can work if workload is homogeneous (this is rare indeed!) different units of work require very similar amounts of computer resources and similar service goals Stated as: ALL transactions < 1 sec. AVG. Resp. Problem: Fooled by long running transactions ending in the interval 2010 Systems Corp. - www.gelbis.com 30

Velocity Goals Execution velocity is an abstract mathematical description with no objectively measurable metric. --John Arwe, WLM Developer at IBM Velocity goals do not determine the actual CPU dispatching priority Application systems velocities fluctuate severely due to factors like work mix, total utilization, service policy, virtual storage management activities, non-zero WLM MSO service definition coefficient, etc 2010 Systems Corp. - www.gelbis.com 31

Velocity Goals 2 Recommendation: Use for nontransactional, or seldom- never-ending work Recommendation: Use for work that needs a limiter Recommendation: Consider use of resource group with velocity goals to impose an absolute limit if needed for vwlc Velocity goals may require lower WLM CPU to manage, but response time goals provide better overall CP and PM tools 2010 Systems Corp. - www.gelbis.com 32

Case Study: The Butler Did It Capacity plan blamed for very unstable performance If CPU utilization increased to over 95% during any 30 minute period, DB2 response time would begin to wildly fluctuate. CICS, DB2 involved Significant DB2 activity generated directly from Internet as well as CICS regions 2010 Systems Corp. - www.gelbis.com 33

Case Study: The Butler Did It 2 Some of the evidence: CPU activity reports from various sources Showed that utilization was at 100% a lot of the time Degradation analysis reports from various incidents of degradation showed virtually every task within the system as THE suspect cause of the problem IO activity reports did not show any unusual activity between the good v. the bad response time periods 2010 Systems Corp. - www.gelbis.com 34

Case Study: The Butler Did It 3 So who done it? The Butler of course! In plain view! WLM Service policy did it! All service classes regardless of importance, had velocity goals, and The sum of velocity goals of the active service class periods exceeded processor capacity The fix: Introduced response time goals for some service classes Used CPU Critical attribute for importance 1 work service classes Reduced velocity goals of lower importance work 2010 Systems Corp. - www.gelbis.com 35

PR/SM Considerations If PR/SM overhead greater than 1.5% or so, try to figure out what is causing it and is it worth it. Recommendation: Minimize number of LPARS The ISSUE! Some important LPAR s performance suffers or some lower importance LPAR customers complain whenever PR/SM enforces the specified LPAR weights (a. k. a. CPU share) When does it happen? What to do? 2010 Systems Corp. - www.gelbis.com 36

The 80/20 of Disk I/O Analysis Our experience shows IO activity tuning is 95/05 rather than an 80/20 proposition Most benefits achieved within 5% of the selection list candidates from the many files Examples: Data & Index tables with highest total time in use Volumes at or near practical capacity limits Transaction with highest total disk I/O time/unit-of- Work (UOW) 2010 Systems Corp. - www.gelbis.com 37

Case Study: Analyze This I/O Plan Name Total Elapsed Total CPU CPU / Elapsed Total Run Total I/O Time @25% I/O Time Saved P09GI0032 120 18 15% 100 12,000-2,550 P09GI0003 240 44 18% 100 24,000-4,900 P09GI0009 80 48 60% 1000 80,000-8,000 P09GI0018 310 53 17% 100 31,000-6,425 What order would you focus your I/O tuning efforts? Why? How many votes for the sequence of 18, 3, 32, 9? How many votes for the sequence of 9, 18, 3, 32? 2010 Systems Corp. - www.gelbis.com 38

Need / Want to Know More Start at: www.ibm.com/servers/eserver/zseries/ Large Systems Performance Reference: http://www-1.ibm.com/servers/ eserver/zseries/lspr/ HOT TOPICS a z/os newsletter: www.ibm.com/servers/s390/os390/bkserv/ hot_topics.html Computer Measurement Group (CMG): www.cmg.org SHARE: www.share.org 2010 Systems Corp. - www.gelbis.com 39

Any More Time for? Lunch! 2010 Systems Corp. - www.gelbis.com 40