Response Time Analysis



Similar documents
Response Time Analysis

Response Time Analysis

SolarWinds Database Performance Analyzer (DPA) or OEM?

The Missed Opportunity for Improved Application Performance

Quick Start Guide. Ignite for SQL Server. Confio Software 4772 Walnut Street, Suite 100 Boulder, CO CONFIO.

SQL Server Performance Intelligence

Databases Going Virtual? Identifying the Best Database Servers for Virtualization

SQL Server Query Tuning

WAIT-TIME ANALYSIS METHOD: NEW BEST PRACTICE FOR APPLICATION PERFORMANCE MANAGEMENT

Wait-Time Analysis Method: New Best Practice for Performance Management

A Comparison of Oracle Performance on Physical and VMware Servers

Five Trouble Spots When Moving Databases to VMware

A Comparison of Oracle Performance on Physical and VMware Servers

Five Trouble Spots When Moving Databases to VMware

The Complete Performance Solution for Microsoft SQL Server

One of the database administrators

Enhancing SQL Server Performance

Performance Monitoring with Dynamic Management Views

Proactive Performance Monitoring Using Metric Extensions and SPA

Monitoring Databases on VMware

Performance Counters. Microsoft SQL. Technical Data Sheet. Overview:

PERFORMANCE TUNING FOR PEOPLESOFT APPLICATIONS

Programa de Actualización Profesional ACTI Oracle Database 11g: SQL Tuning Workshop

Business Usage Monitoring for Teradata

Why Alerts Suck and Monitoring Solutions need to become Smarter

Product Review: James F. Koopmann Pine Horse, Inc. Quest Software s Foglight Performance Analysis for Oracle

Copyright 1

Perform-Tools. Powering your performance

Oracle Enterprise Manager 13c Cloud Control

Keep It Simple - Common, Overlooked Performance Tuning Tips. Paul Jackson Hotsos

TUTORIAL WHITE PAPER. Application Performance Management. Investigating Oracle Wait Events With VERITAS Instance Watch

Oracle Database 12c: Performance Management and Tuning NEW

Introduction. AppDynamics for Databases Version Page 1

MONITORING A WEBCENTER CONTENT DEPLOYMENT WITH ENTERPRISE MANAGER

Optimizing Your Database Performance the Easy Way

Oracle Database 11g: SQL Tuning Workshop

Oracle RAC Tuning Tips

Analyzing IBM i Performance Metrics

Oracle Database 11g: SQL Tuning Workshop Release 2

Advanced Performance Forensics

Throwing Hardware at SQL Server Performance problems?

The Evolution of Load Testing. Why Gomez 360 o Web Load Testing Is a

Users are Complaining that the System is Slow What Should I Do Now? Part 1

PATROL From a Database Administrator s Perspective

Best Practices for Monitoring Databases on VMware. Dean Richards Senior DBA, Confio Software

Oracle Enterprise Manager 12c New Capabilities for the DBA. Charlie Garry, Director, Product Management Oracle Server Technologies

About Me: Brent Ozar. Perfmon and Profiler 101

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Maximizing Performance for Oracle Database 12c using Oracle Enterprise Manager

Top 10 reasons your ecommerce site will fail during peak periods

A Modern Approach to Monitoring Performance in Production

End Your Data Center Logging Chaos with VMware vcenter Log Insight

SQL Server Performance Tuning for DBAs

2013 OTM SIG CONFERENCE Performance Tuning/Monitoring

Monitor and Manage Your MicroStrategy BI Environment Using Enterprise Manager and Health Center

Proactive database performance management

Performance Tuning and Optimizing SQL Databases 2016

PRODUCT OVERVIEW SUITE DEALS. Combine our award-winning products for complete performance monitoring and optimization, and cost effective solutions.

#9011 GeoMedia WebMap Performance Analysis and Tuning (a quick guide to improving system performance)

Oracle Rdb Performance Management Guide

Transaction Performance Maximizer InterMax

PERFORMANCE TUNING IN MICROSOFT SQL SERVER DBMS

What Is Specific in Load Testing?

VirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5

Oracle Database Performance Management Best Practices Workshop. AIOUG Product Management Team Database Manageability

A V a l u e C a s e S t u d y

Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager. Kai Yu, Orlando Gallegos Dell Oracle Solutions Engineering

Oracle Database 11 g Performance Tuning. Recipes. Sam R. Alapati Darl Kuhn Bill Padfield. Apress*

Oracle Database 11g: Performance Tuning DBA Release 2

Monitoring Best Practices for COMMERCE

MONyog White Paper. Webyog

Transaction Monitoring Version for AIX, Linux, and Windows. Reference IBM

Database Maintenance Essentials

PostgreSQL Concurrency Issues

Monitoring and Diagnosing Production Applications Using Oracle Application Diagnostics for Java. An Oracle White Paper December 2007

Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager

Oracle Database 11g: New Features for Administrators 15-1

HP Storage Essentials Storage Resource Management Software end-to-end SAN Performance monitoring and analysis

Top Down Performance Management with OEM Grid Control Or How I learned to stop worrying and love OEM Grid Control John Darrah, DBAK

DBMS Performance Monitoring

White Paper. The Ten Features Your Web Application Monitoring Software Must Have. Executive Summary

Improve SQL Performance with BMC Software

MS SQL Performance (Tuning) Best Practices:

LEVERAGE YOUR INVESTMENT IN DATABASE PERFORMANCE ANALYZER (CONFIO IGNITE) OCT 2015

Application Performance Management. Java EE.Net, Databases Message Queue Transaction, Web Servers End User Experience

Using SQL Monitor at Interactive Intelligence

Identifying Problematic SQL in Sybase ASE. Abstract. Introduction

Using New Relic to Monitor Your Servers

EMC Documentum Performance Tips

Introduction. Part I: Finding Bottlenecks when Something s Wrong. Chapter 1: Performance Tuning 3

End-to-end Service Level Monitoring with Synthetic Transactions

Disk Storage Shortfall

Informix Performance Tuning using: SQLTrace, Remote DBA Monitoring and Yellowfin BI by Lester Knutsen and Mike Walker! Webcast on July 2, 2013!

1. This lesson introduces the Performance Tuning course objectives and agenda

The Top 20 VMware Performance Metrics You Should Care About

Once the product is installed, you'll have access to our complete User Guide from the client.

Cognos Performance Troubleshooting

Installation and User Guide

Server & Application Monitor

Solving the Top 5 Virtualized Application and Infrastructure Problems

Transcription:

Response Time Analysis A Pragmatic Approach for Tuning and Optimizing Database Performance By Dean Richards Confio Software 4772 Walnut Street, Suite 100 Boulder, CO 80301 866.CONFIO.1 www.confio.com

Introduction For database administrators the most essential performance question is: how well is my database running? Traditionally, the answer has come from analysis of system counters and overall server health metrics. You might, for example, look at how many physical reads there have been within a specific period of time, the number of writers or lock statistics. Or you might look at dashboards of information about the CPU, disk I/O, memory, storage and the network. Yet, because the primary purpose of a database is to provide end users with a service, none of these counters or metrics provides a relevant and actionable picture of performance. That is why, in recent years, an increasing number of performance experts have recommended a shift from measuring resources consumed to measuring application performance. To accurately assess database performance from the perspective of service provided, the question must become: how much time do end users wait on database response? To answer this question, you need a way to assess what s happening inside the database that can be related to end users. Response time analysis (RTA) is an approach that enables DBAs to measure the time spent on query execution and, therefore, measure impact to end users. Introduced in 1999 as a database performance perspective by Cary Millsap i, and articulated again in 2001 by Craig Shallahamer ii, RTA is a proven, comprehensive way to monitor, manage and tune database performance that is based on the optimal outcome for the end user, whether that user is an application owner, batch processes or end-user consumer. RTA does for DBAs what application performance management (APM) does for IT identify and measure an end-to-end process, starting with a query request from the end user and ending with a query response to the end user, including the time spent at each discrete step in between. Using RTA, you can identify bottlenecks, pinpoint their root causes and prioritize your actions based on the impact to end users. In short, RTA helps you provide better service, resolve trouble tickets faster and free up time to focus your efforts where they have the most impact. This paper describes RTA in detail, and provides an overview of the tools available to perform RTA so that you can monitor and manage database performance by focusing on its impact to end users. Why Aren t Counters and System Health Metrics Enough? When a database isn t performing well, or when web pages are loading too slowly, or price lookups are so slow that customers abandon the page, or batch processes don t complete in time, often the first response is to look at counters and system health metrics to find the answer. Unfortunately, this information is incomplete, sometimes misleading, and rarely easy to correlate to end user experience. 2

On the counter side, you might look at how many physical reads there have been within a specific period of time, the number of writes or lock statistics. This view of performance is, however, limited. Let s say you ve been told an application is running slow. Upon a closer look, you see that the cache hit ratio is only 60%. What does this tell you about the root cause? Is the buffer cache too small? Is there a large full table scan occurring or some other inefficient SQL statement executing? If so, which ones? What impact does this have to the end user? None of these questions is answered by looking at cache hit ratio by itself. On the system health metrics side, the amount of actionable information is equally lacking. Let s say CPU utilization shoots up to 90% on one of your servers, and then stays at that level for a day. Is it something that happened in the server? Is it something that happened in the SQL? If all you look at is that spike in CPU utilization, you might conclude that demand has exceeded server capacity and add a new server. What if that doesn t change the CPU utilization? Did it impact the end user in any way? It wouldn t be the first time that expensive hardware additions didn t solve the problem. You simply can t answer these questions armed with just server health information. A database performance approach focused on measuring impact to end users clearly requires different information. It requires information that exposes what is happening inside the database and quantifies the impact to users. Response Time Analysis is All about Waits, Queues and Time At the core of RTA is the wait event or wait type. When a server process or thread has to wait for an event to complete or resource to become available before being able to continue processing the SQL query, it is called a wait event. For example: moving data to a buffer, writing to disk, waiting on a lock, or writing a log file. Typically, hundreds of wait events must be passed between a query request and response. If an SQL query is waiting on a specific wait event more than is usual, how can you find out? How do you know what is normal? How can you find out why it s waiting? And how do you fix it? That s where RTA comes in. RTA measures time for every wait event for each query, and provides a way to look at this data in context with other queries, and in context of time, so that you can prioritize efforts that have the most impact on the end user. In order for RTA to work, it must do the following: 3

1. Identify the specific wait event that causes a delay. You must measure every wait event individually to be able to isolate potential problems and bottlenecks. 2. Identify the specific query that got delayed. If you know only that the whole instance is delayed, you will not be able to isolate the specific query that is causing a delay or bottleneck. It s important to know that some performance monitoring tools and dashboards present an averaged assessment across multiple SQL statements. If you are looking at averaged data, how can you tell which specific SQL statements are slowing performance? In order to have an RTA approach that works, you must collect performance statistics at the individual SQL statement level. 3. Measure the time impact of the delay. You must measure the wait event time the time to complete the task and pass the work on to the next process, for each wait event. Without this information, you will not be able to completely understand what is causing the bottleneck. Armed with all this data, you can account for each SQL and each resource utilized by an SQL. And then you can accurately pinpoint the root cause of delays, inside or outside the database. Where does the data required for Response Time Analysis come from? There are a number of ways to obtain the performance data required for RTA, including Oracle Enterprise Manager (OEM, sometimes also referred to as Grid Control), Trace files, and Oracle s V$SESSION tables. Each has strengths and also important limitations. For example, most are timeintensive, limited to point-in-time views, and accessible only to highly experienced DBAs. Additionally, continuous performance monitoring tools complement existing database management tools and can greatly simplify and speed RTA. 4

Oracle OEM with AWR and Performance/Tuning Packs: Pros and Cons As an Oracle DBA, the most obvious tool for RTA-based performance management might be Oracle OEM with Automatic Workload Repository (AWR) and Performance/Tuning Packs. Expert OEM users can find the three key elements of RTA query, wait and time but it can take considerable time, expertise and detailed analysis. Developers or managers, for example, likely won t be able to successfully use OEM, and will rely on the DBAs to access and interpret the data for them. Limiting its usefulness even further is the fact that OEM s local collection of performance data places a load on production servers and requires users to have access to the production database. This makes OEM use very restrictive, and not desirable for real-time RTA. Furthermore, OEM isn t even available for Oracle Standard Edition (SE) implementations. Continuous monitoring, via trace Trace produces rich precise information that might not be available in any other way, and does capture all the SQL statements. However, the drawbacks of trace are many. To begin with, tracing is turned on selectively, and it produces no historical/trending information. You must know in advance when and where a problem is going to occur, turn on tracing for that session, and watch to see what happened. This means that trace isn t useful for 24/7 monitoring, for complete performance analysis of production systems, or for proactive performance management. Trace also produces high overhead. If you re using it for problem-solving, it will add to overhead at the worst possible time. In fact, the load that tools like trace put on a database server are one of the reasons some DBAs don t want to do RTA; they mistakenly believe that trace is the only way to get the information needed to do effective RTA. Another sometimes overlooked problem with trace is that it produces almost too much information. You need a skilled DBA and lots of time to accurately sort through it and assess its meaning. This makes it especially inappropriate for developers and others who aren t highly experienced DBAs. Finally, trace can skew your assessment of root cause of performance problems. For example, consider a situation in which you see 95 of 100 sessions are running well, but five sessions have spent 99% of their time waiting for locked rows. If you trace one of the 95 good sessions, you may think you don t have any locking issues, and you will likely spend time trying to tune other SQL queries that may not be so critical. Yet, if you happen to trace one of the five bad sessions, you might think you could fix the locking problem and so reduce the wait time by 99%. Neither assessment is entirely accurate, and any effort you make based on that assessment will fail to address the real cause of the issue. That can quickly lead to a great deal of wasted time and resources. 5

Oracle s response time tables: V$SESSION views Oracle s V$SESSION views also provide wait event information. Some of these tables are point of time views, while others are historical. The more useful of these views are described below. As with OEM and trace, however, there are limitations. None of these views provides continuous information. None provides a consolidated view of performance data. None are easy to use to achieve actionable results. View Description Example V$SESSION Use V$SESSION to find out what sessions are active, what SQLs they are running (SQL_ID), whether that SQL is waiting (actively waiting for a resource), or waited (it did wait but is active now somewhere else, reading data from memory or actually processing on the CPU). V$SESSION_WAIT Note that V$SESSION_WAIT was added to V$SESSION in Oracle 10g and higher. This view includes all the wait events, and shows what the session or query is currently waiting on. V$SQL Using SQL_ID from V$SESSION, this view can give full SQL text, executions, disk reads, etc. V$SQL_PLAN Use PLAN_HASH_VALUE from V$SQL to understand the plan used by Oracle. DBA_OBJECTS Join with ROW_WAIT_OBJ# from V$SESSION. This view tells you what specific objects or tables the SQL is accessing. 6

Continuous performance monitoring tools Continuous performance monitoring tools mark the next evolution of tools available for DBAs implementing RTA. These third-party tools, which typically complement existing database management tools like OEM, aim to simplify performance management by consolidating performance information in one place. However, even these tools may have limitations, and you should carefully evaluate a tool before making an investment. The best continuous performance monitoring tools for RTA will meet the following criteria: Identifies and measures the three key elements needed for RTA if the tool doesn t provide all of these, it just won t work: o Identifies the specific SQL query that got delayed o Identifies the specific bottleneck (wait event) that causes a delay o Shows the time impact of the identified bottleneck It s important to pay particular attention to how the tool collects data, and where it collects data from. Provides continuous 24x7 monitoring with access to historical trend data Doesn t put a load on the system (some of these tools rely on trace to gather performance data, which can create a load on the monitored server) Can be used by non-dbas, like developers and managers, to assess performance (so DBAs can spend less time explaining and more time doing). Be sure to ask these questions: o Is it easy enough to use and understand that a non-dba can use it? o Does it require a user to have access to your production database? Unless the tool meets all these requirements, it will be of limited use in performing RTA. It s Time for Response Time Analysis RTA provides a systematic way to discover and measure performance as it relates to user experience. RTA measures a query end to end, from request to response, all the waits encountered in between, and all the resources used by a query during processing. RTA puts this information into context with other queries, and provides historical trend information to make it possible to identify and fix the bottlenecks with the greatest impact to end users. RTA can transform database tuning from a reactive process done in crisis mode late at night or on the weekend to a proactive practice tied to the end users it serves. 7

About Confio Software Confio Software pioneered a new breed of techniques and technologies required to make RTA work in heavy production environments. Confio Ignite is unique among continuous performance monitoring tools in bringing together in one place all the information needed to accurately identify the wait event, specific query and exact time spent for each bottleneck. In just four clicks, and in visual charts simple enough for developers or managers to understand, Ignite clearly identifies root cause of performance issues, without placing a load on the monitored server. Ignite makes possible proactive RTA that is focused on the end user, even in the most demanding environments. For more information, please visit: http://www.confio.com/performance/oracle/ignite/ Confio Software Boulder, Colorado, USA (303) 938-8282 info@confio.com www.confio.com i Performance Management Myths and Facts, by Cary Millsap. Copy available courtesy of Oracle Corp. at: http://method-r.com/downloads/cat_view/38-papers. ii Response Time Analysis for Oracle Based Systems, by Craig Shallahamer. http://resources.orapub.com/product_p/rta.htm 8