Reducing Maintenance Downtime by 85%: Oracle's Internal Patch Automation and Process Improvements in a Heterogeneous Enterprise Application Deployment Including E-Business Suite




An Oracle Technical White Paper, May 2012

Contents

Abstract
New Business, Higher Volume, More Patching
Downtime Before Improvements
Identifying Component Causes of Downtime
Reducing Downtime Component Causes
Results
Future Direction
Conclusion

Abstract

Oracle internally runs a variety of standard and custom applications. Over time, the maintenance required to keep these applications up to date had become both time-consuming and labor-intensive. Oracle's acquisition of Sun Microsystems exacerbated the problem, bringing a 50% increase in system volume and additional requirements to support the hardware business. This additional load put pressure on a patching window that Oracle IT already regarded as unacceptably long. Beginning in 2009, Oracle IT began an initiative to reduce downtime by automating regular system maintenance and software patching processes for both E-Business Suite and non-E-Business Suite applications. The changes reduced downtime related to software patching by 85%. Most notably, Oracle was able to upgrade its Global Single Instance of E-Business Suite from 12.1.1 to 12.1.3 with only 9 hours of downtime. This white paper describes Oracle's current internal methods for patch automation and patch optimization in a heterogeneous software environment, quantifies the resulting time savings, and describes the tools used to streamline the internal change control processes. It also recommends best practices that customers can use to improve their own patching processes.

New Business, Higher Volume, More Patching

Before acquiring Sun Microsystems in 2009, Oracle was an enterprise software company with little experience managing a hardware business. Buying Sun dropped Oracle head-first into the deep end of that pool. An influx of new users increased demand on Oracle's internal systems by over 50 percent, and an entirely new set of requirements arose from the need to support the hardware line of business. Sun came with over 1,000 internal legacy applications, many focused on manufacturing and the hardware supply chain, which could be consolidated only as Oracle's internal solutions were extended to take their place.
New systems had to be implemented within Oracle to support the distinct requirements of the hardware business, and systems already in place had to be upgraded to support the increased load. As a result, integrating Sun placed additional strain on a patching window that was already too long.

Downtime Before Improvements

Before Oracle IT began the effort to reduce downtime, general maintenance took over 100 hours every quarter, as shown in Figure 1. Major upgrades, such as from one version of E-Business Suite to the next, could alone take more than 48 hours. Although a significant amount of time and money was spent on patching, the financial impact on the rest of the business was much worse than this direct cost. Oracle is a 24/7 global company, and any downtime impeded business across the enterprise.

Figure 1: Oracle Global Single Instance Downtimes per Quarter, Prior to Downtime Reduction Initiative (downtime in hours, Q3 2008 through Q2 2010)

This impact became much more material with the addition of the Sun hardware business. When systems went down, even for routine maintenance, manufacturing was severely impacted. Likewise, field service could not operate without visibility into the supply chain. Long patching windows predated Sun, but the additional load and increased consequences made it clear to Oracle IT that changes needed to be implemented quickly to reduce downtime. Oracle IT was mandated to bring the system maintenance window down from fifteen hours to three hours per week.

Identifying Component Causes of Downtime

Because maintenance downtime has multiple causes, Oracle IT began by identifying the factors that contributed the most to downtime and resource consumption. Table 1 below summarizes these major contributing factors.

Factor: Cold Patching
Description: Environments had to be shut down to apply the overwhelming majority of patches.

Factor: Pre- and Post-Patching Steps
Description: Shutting down and starting up databases and mid tiers in preparation for patching took over 30 minutes of the maintenance window. Post-patching steps were performed sequentially.

Factor: Script Performance
Description: There were no official guidelines for patch developers on how quickly their patching scripts needed to execute. In addition, the E25K database server hardware of the time was inadequate for the required capacity.

Factor: Large, Infrequent Patch Bundles
Description: Patches were primarily applied in a quarterly bundle containing over 300 patches.

Factor: Custom Patches
Description: Custom patch application required custom scripts and manual steps.

Factor: Patch Management
Description: The approval process was complicated, and the patch management tool used to track patches was inefficient.

Table 1: Top Contributors to Patching Downtime at Oracle

Once the factors contributing to downtime had been identified, Oracle IT began process improvements to reduce the downtime caused by each factor. The following sections provide details.

Reducing Downtime Component Causes

Reducing Cold Patching

The largest single contributor to patching downtime was cold patching. Until 2009, more than 99% of patches were applied only after shutting down the servers running and supporting the application being patched. But not all of the more than 700 patches applied each quarter required cold patching. If more patches could be applied hot, while the affected systems continued to run, required downtime would drop directly. To help classify a patch as hot or cold, guidelines were developed based on the impact that patches had on the applications and on supporting systems. For example, a patch that simply delivered new reports could be applied hot, while a patch that updated a database table structure or a critical PL/SQL package needed to go in cold. Based on these guidelines, the bulk of the hot patches could be packaged separately from the cold ones and applied on a weekly basis while systems continued to run. The percentage of cold patches dropped from over 99% in 2009 to less than 60% in 2010 and 2011. Since every patch applied hot rather than cold reduced downtime, downtime declined proportionately.

Figure 2: Percentage of Patches Applied Hot vs. Cold in 2009 and 2011

Speeding up Pre- and Post-Patching Steps

During a patching window, supporting systems - the Concurrent Manager (CM), database instances, and application servers - had to be shut down before patching and started up afterward, adding to downtime.
These systems were being shut down in sequence, each shutdown step waiting for the previous one to complete. For example, there are multiple database instances, and each instance could take 10-15 minutes to shut down. This meant that 30 minutes or more of the maintenance window were eaten up just shutting down the database instances. The same issue occurred in reverse when starting up and readying supporting systems. In addition, other pre- and post-patching steps had to be performed, including disabling and re-enabling security triggers, removing temporary database tables, and recompiling invalid schema objects.

To speed up pre- and post-patching steps, the team identified steps that could either be shortened or performed in parallel instead of sequentially. For example, CM waited for all running processes to finish before it would shut down, so to speed shutdown, the team added a script that terminated all running processes after waiting a few minutes. The 50-plus application servers were also shut down in parallel instead of sequentially, bringing total shutdown time for these servers to less than 10 minutes. Oracle IT also began doing a shutdown abort of database instances to speed up their shutdown: each instance was forced to shut down within one minute. Whereas in the past each instance was taken down only after the previous one had finished shutting down, multiple database instances were now shut down in parallel, typically within 30 seconds of each other. Similarly, restarts of supporting systems and the other steps needed to make them operational were shifted to run in parallel. As a result of these changes, the time taken to complete the pre- and post-patching steps dropped from over 20 hours a quarter in 2010 to less than 10 hours a quarter in 2011, a reduction that went directly to the bottom line of overall downtime. Table 2 provides a detailed breakdown.

Area                                    Q2 2010  Q3 2010  Q4 2010  Q1 2011  Q2 2011  Q3 2011  Q4 2011
Pre-patching and shutdown steps (hrs)      -        -        -       4        3        3        2
Post-patching and startup steps (hrs)      -        -        -      12.5      6.5      7.5      7
Combined pre and post times (hrs)         27       23       10      16.5      9.5     10.5      9

Note: only combined pre and post statistics are available for Q2-Q4 2010.

Table 2: Downtime Caused by Pre- and Post-Patching Steps by Quarter

Increasing Patching Frequency

Patching efficiency was also improved by the rather counterintuitive process change of patching more often. The older maintenance process had centered on a quarterly release bundle, in which over 300 patches were applied during a single patching window each quarter. Despite the apparent advantages of one large patching event per quarter, this process had serious unintended consequences. First, applying such a large bundle requires a large patching window, which can be more disruptive to business operations such as manufacturing and field services than several smaller ones. Second, a single quarterly window reduces business agility and does not allow for incremental changes. Third, patches were sometimes rushed into the bundle before they were fully ready, to avoid missing these infrequent quarterly windows. If a rushed patch caused problems in the system, emergency patches then had to be applied to correct the problems, causing additional downtime. Finally, applying so many patches simultaneously reduced accountability and made problems difficult to trace when they did occur.

Because the quarterly bundles had so many unintended consequences, the team found that they could actually achieve less downtime by patching more often. In the current process, patches are applied during regular weekly patching windows spread over the quarter. Figure 3 shows the more even spread of patches in 2011 compared to 2010. To accommodate the increased patching frequency, the testing process had to be made more robust. This was achieved by introducing automated testing of critical application flows using the Oracle Application Testing Suite (OATS). OATS enables definition and management of the application testing process and validates application functionality.

Figure 3: Number of Patches Applied by Week, 2010-2011

Improving Patching Script Performance

Downtime also resulted from the poor performance of patch application scripts, which, in the absence of official tuning guidelines, often ran for over 30 minutes each. As part of the downtime reduction initiative, guidelines were put into place requiring patching scripts to be tuned so that every job within a submitted patch ran in under 10 minutes. This mandate did consume some additional labor for script tuning, but the team considered this a reasonable tradeoff since it affected only direct IT costs, while downtime imposed much more significant costs across the company. With scripts tuned to run faster, the actual patching component of downtime was reduced. In addition, tuning of standard scripts benefited customers who applied them later.

It should be noted that some of the improvement in script processing speed came not from script tuning but from faster hardware. During the period of the downtime reduction initiative, the servers that run Oracle's Global Single Instance (GSI) of E-Business Suite were upgraded from a four-node Real Application Cluster (RAC) running Sun Fire E25Ks to a three-node RAC running Sun SPARC Enterprise M9000s. The new M9000 servers provided a significant performance boost over the previous E25K servers. The main drivers for this upgrade were the ability to handle increased load from the Sun acquisition and to improve GSI performance in normal operation. As a side benefit, however, the M9000s did indeed process patching scripts much faster.
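The shift from sequential to parallel shutdowns described earlier can be sketched with ordinary threads. This is an illustrative model only: the server names, the simulated shutdown delay, and the use of Python are assumptions; real shutdowns would invoke platform scripts and database SHUTDOWN commands, not Python functions.

```python
# Hedged sketch: why parallel shutdown of ~50 app servers beats sequential.
from concurrent.futures import ThreadPoolExecutor
import time

def shutdown_server(name):
    """Stand-in for a real shutdown command; returns (name, seconds taken)."""
    start = time.monotonic()
    time.sleep(0.01)  # simulated shutdown work
    return name, time.monotonic() - start

servers = [f"appserver{i:02d}" for i in range(50)]  # ~50 app servers, per the text

start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(servers)) as pool:
    results = list(pool.map(shutdown_server, servers))
elapsed = time.monotonic() - start

# In parallel, total wall time approaches the slowest single shutdown,
# not the sum of all shutdowns (the sequential cost).
sequential_estimate = sum(t for _, t in results)
print(f"parallel: {elapsed:.2f}s vs sequential estimate: {sequential_estimate:.2f}s")
```

The same fan-out/fan-in shape applies to the startup side, which the team also parallelized.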
Script tuning and faster hardware combined to dramatically reduce the time taken for the actual patch application steps. In 2010, patch application steps consumed over 50 hours per quarter. In 2011, this dropped to 4 hours per quarter.

Automating Custom Patching

Like most large enterprise software deployments, Oracle's own implementation of E-Business Suite contains custom code and application customizations, which in turn require custom patches. A significant number of the EBS patches applied at Oracle were custom. Furthermore, Oracle's internal footprint also includes a number of non-EBS applications such as Siebel, Agile, and Oracle Application Express (APEX). The manual application process for custom patches to EBS, and for any patches to non-EBS applications, was both time-consuming and labor-intensive. Standard patches to EBS had always been applied using an automated tool called AutoPatch. AutoPatch applied all bug fixes in a patch, managed version checking, tracked changes, and allowed restart capability. But no such capabilities were in place for custom EBS or non-EBS patches, which had to be hand-executed or hand-placed into directories by patching personnel. Aside from using up resource hours, this added a layer of complexity and contributed to errors and quality issues. The team started building custom patches with the same process used for standard patches, so that custom patches could be applied using AutoPatch. They also developed AutoPatch-like functionality in a shell script to automate application of custom non-EBS patches. These two tools allowed Oracle IT to apply custom patches in much the same way as standard ones.

Automation of Patch Management

Oracle IT also improved the change control process that led up to patching. Oracle had used a tool called Automated Release Updates (ARU) for many years to automate the process of defining, building, packaging, and distributing patches. For patch management, a tool called the Common Patch Request Tool (CPRT) had been used prior to the downtime reduction initiative. CPRT offered limited functionality to track patches and record manual deployment instructions, and it included a cumbersome approval process involving manual steps. In addition, approvers were designated for each of the 200 applications supported by Oracle IT, and a patch containing updates to several applications required approvals from the designated approvers for each of the included applications.
This process occurred before patch application and therefore did not contribute directly to downtime. However, it did consume time and resources, reduce accountability, and delay the rollout of fixes and new functionality to users. To better manage patching-related processes, Oracle IT built a custom internal tool called the Patch Approval Submission System (PASS), which simplifies and automates patch tracking, approval, downtime management, and reporting. The switch from CPRT to PASS started in November 2010 and was completed in December 2011.

Patch type               Tools used in 2008               Tools used in 2011
EBS custom patch         ARU -> CPRT -> Manual patching   ARU -> PASS -> AutoPatch
EBS standard patch       ARU -> CPRT -> AutoPatch         ARU -> PASS -> AutoPatch
Non-EBS custom patch     ARU -> CPRT -> Manual patching   ARU -> PASS -> Automated tool that mimics AutoPatch behavior
Non-EBS standard patch   ARU -> CPRT -> Manual patching   ARU -> PASS -> Zip file with instructions for patching team

Table 3: Types of Patches and Patching Tools Used

PASS automatically manages the workflow required to move a request to approval and then to patching. It allows developers to request target environments and patching windows for each ARU patch. At every step of the patching process, from identification of the issue through approval to actual implementation, PASS provides accountability and tracks who was doing what, and when. Table 3 shows the tools used to automate the patching processes for EBS and non-EBS patches.
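The request-to-patching workflow that PASS manages can be modeled as a simple state machine with an audit trail. The sketch below is illustrative only: the step names follow the workflow in Figure 4, but the class, the actor roles, and the patch id are invented; PASS's actual data model is internal to Oracle.

```python
# Hedged sketch: the PASS workflow as a state machine that records
# who performed each step, giving the accountability the paper describes.
WORKFLOW = (
    "Create patch request",
    "Submit patch for approval",
    "Approval",
    "Patching Team picks up request",
    "Patching Team applies patch",
    "Requester tests patch",
)

class PatchRequest:
    def __init__(self, patch_id):
        self.patch_id = patch_id
        self.step = 0        # index of the next workflow step to perform
        self.history = []    # (step, actor) pairs: the audit trail

    def advance(self, actor):
        """Record the actor completing the current step, then move on."""
        if self.step >= len(WORKFLOW):
            raise ValueError(f"{self.patch_id}: workflow already complete")
        self.history.append((WORKFLOW[self.step], actor))
        self.step += 1

    @property
    def complete(self):
        return self.step == len(WORKFLOW)

# Walk one hypothetical patch through all six steps.
req = PatchRequest("ARU-0001")
for actor in ("developer", "submitter", "approver", "patcher", "patcher", "developer"):
    req.advance(actor)
print(req.complete, len(req.history))
```

The separate "submitter" actor mirrors the process change described below, in which a small designated group, rather than every developer, submits patches.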

Figure 4: Patching Workflow Steps Tracked in PASS (Create patch request -> Submit patch for approval -> Approval -> Patching Team picks up request -> Patching Team applies patch -> Requester tests patch)

PASS has also streamlined the approval process. Previously, each of the 200 supported applications had a designated approver, and a patch containing files impacting several applications required approval from the designees for each of those applications, making the process cumbersome. With PASS, the number of approvers for any given patch is reduced to a handful. This reduction in approvers is possible because of another process change: a radical reduction in the number of people authorized to submit patches. In the old process, the developers who requested patches also submitted them. In the new PASS process, a separate layer of submitters has been designated to ensure patch quality and performance before submission. This adds a layer of accountability and eliminates the need for a large number of approvers. This smaller group of submitters is also required to provide more information when submitting patches. Table 4 below lists the questions submitters must answer when entering a patch into PASS.

1. Describe the issues being addressed by this patch
2. Identify risks associated with the patch application
3. Indicate the tracking number of the bug being fixed
4. Indicate a target date for patching of the production environment
5. Enter the name of the developer or IT reviewer
6. Identify the files that are changed or impacted
7. Briefly describe the code changes for the files
8. Confirm that the patch has been tested either manually or using PASS
9. Indicate when the patch was last tested in a test environment
10. Note the patch execution times in each previous environment

Table 4: Information Required at Patch Submission

The new ARU/PASS process also provides efficient merging of patches so that they can be applied in a single package.
As part of its Multi Language Support (MLS), EBS supports eleven languages, and patches often need to be built for each language. By merging these into one package, common steps such as maintaining file versions and updating history tables can be performed once rather than multiple times.
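The benefit of merging the per-language patches can be illustrated with a small sketch. The step names and the common/language-specific split below are invented for illustration; the real merge is performed by the ARU/PASS tooling, not this code.

```python
# Hedged sketch: merging eleven per-language patches so that shared steps
# (e.g. history-table updates) run once instead of once per language.
def merge_patches(per_language_steps):
    """Collapse duplicate common steps across language patches, keep the rest."""
    merged, seen_common = [], set()
    for steps in per_language_steps:
        for step in steps:
            if step["common"]:
                if step["name"] in seen_common:
                    continue  # common step already scheduled once
                seen_common.add(step["name"])
            merged.append(step)
    return merged

# Eleven hypothetical language patches, each with one shared step
# and one language-specific step.
patches = [
    [{"name": "update_history_tables", "common": True},
     {"name": f"load_translations_{lang}", "common": False}]
    for lang in ["en", "fr", "de", "ja", "es", "it", "pt", "ko", "zh", "nl", "ar"]
]
merged = merge_patches(patches)
print(len(merged))  # 12: one shared step plus eleven language-specific steps
```

Without merging, the shared step would run eleven times; merged, it runs once.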

Figure 5: Screen shot of Oracle's Patch Approval Submission System (PASS)

ARU and PASS are custom tools that Oracle IT continues to use and extend because of its long experience with them. PASS has been customized to work well with the internal patch generation and source control tools used by Oracle's Applications Product Development group and its extension, the Applications IT group. However, Oracle customers can use many of the same capabilities in the form of the Application Change Management Pack (ACMP) and the Application Management Pack (AMP), both included in Enterprise Manager. ACMP is an end-to-end change management solution that works with a variety of source control applications and allows automated deployment of standard and custom patches across different environments. AMP has patch management functionality similar to PASS. It is a system management solution for centralized management of multiple environments that allows proactive monitoring of requests and workflows, reporting on history and trends, and automation of repetitive database administration (DBA) tasks.

Figure 6: Screen Shot of Patch Manager in Application Change Management Pack (ACMP)

Results

Oracle IT's downtime reduction initiative reduced maintenance downtime by 85%: from 104.5 hours in Q2 2010 to less than 15 hours in each of the three consecutive quarters from Q2 2011 through Q4 2011. This reduction occurred although the number of patches remained essentially the same for most quarters, and even increased in Q2 2011. Figure 7 below shows the trend of downtime reduction along with the number of patches applied each quarter.

Figure 7: GSI Downtimes by Quarter (downtime in hours, broken into shutdown/startup time, patching time, and total, plotted against the number of patches applied, Q2 2010 through Q4 2011)

Table 5 below provides the underlying data and some additional detail. It should be noted that detailed tracking of the time consumed in pre- and post-patching steps was initiated as part of the downtime reduction initiative, so a breakdown of hours into pre and post is not available for all quarters.

Area                                    Q1 2010  Q2 2010  Q3 2010  Q4 2010  Q1 2011  Q2 2011  Q3 2011  Q4 2011
Patching (hrs)                             -      77.5     32.5     20       18       4.5      4        4
Pre-patching and shutdown steps (hrs)      -       -        -        -        4        3        3        2
Post-patching and startup steps (hrs)      -       -        -        -       12.5      6.5      7.5      7
Combined pre and post times (hrs)          -      27       23       10       16.5      9.5     10.5      9
Total downtime (hrs)                     172.5   104.5     55.5     30       34.5     14       14.5     13

Table 5: Breakdown of GSI Downtime by Quarter (detail data not available for some quarters)

It is not simple to allocate percentages or hours of downtime reduction to every factor addressed in the initiative. Reductions in factors such as hot patching and the pre- and post-patching steps can be quantified to the minute, with no dependencies to cloud the issue. As shown in Figure 8, the increase in hot patching contributed a 66% decrease in patching downtime, and the improvements in pre- and post-patching steps a further 19% reduction. Improvements in other factors contributed another 15% decrease in downtime, but these are much more interdependent, and their benefits are harder to allocate accurately. For example, both script performance tuning and the upgrade of Oracle's Global Single Instance to faster hardware reduced the downtime associated with patching script execution. However, since the hardware upgrade was initiated to improve overall GSI performance and was scheduled independently of the downtime reduction initiative, the team could not accurately separate the effects of the hardware upgrade from those of script tuning. A pure research organization would have made one change at a time and quantified exactly how many hours of downtime reduction could be attributed to script tuning versus upgraded hardware. Since Oracle IT's primary mission is to support the business, multiple improvements must sometimes be made simultaneously despite the inevitable confounds this produces. The exact impact of process improvements, such as increasing patch frequency, is similarly difficult to break out. Doing so would require putting exact numbers to the downtime caused specifically by patches that were rushed to make the cutoff, and by the reduction in accountability caused by very large bundles; not a straightforward calculation.

Figure 8: Contribution of Factor Improvements to Downtime Reduction (pie chart: hot patching 66%; pre- and post-patching steps 19%; patch tuning, hardware upgrades, and other factors 15%)

Despite the difficulty of exactly allocating a portion of the downtime reduction to each factor, it is clear that all of the factors cited in Table 1 contributed substantially to Oracle's downtime, and improvements in each factor contributed to the overall 85% reduction. Table 6 below revisits the factors that contributed to Oracle's previously high downtime and recaps the actions taken to improve them.
Factor: Action taken and result

Cold patching: The percentage of patches applied hot increased from less than 1% in 2009 to over 40% in 2011.

Pre- and post-patching steps: Systems are now shut down and started back up in parallel. As a result, pre- and post-patching times went down from over 20 hours per quarter in 2010 to less than 10 hours per quarter in 2011.

Script performance: Patching scripts are now tuned to run in under 10 minutes. Database server hardware upgrades also helped speed up script execution times.

Patch frequency: Smaller patch sets are applied weekly instead of in large quarterly bundles. This has improved patch quality and reduced the amount of follow-on patching required to correct bad patches.

Custom patches: Custom patching, both EBS and non-EBS, has been automated to reduce resource requirements and inefficiencies.

Patch management: PASS provides improved patch management, from initial request through approval to patching. ARU and PASS have enabled efficient merging of patches and greater accountability in the patch approval process.

Table 6: Actions and Results of Oracle's Downtime Reduction Initiative, by Downtime Cause

Future Product Enhancements to Further Reduce Downtime

In addition to the practices and tools described in this paper, help is also on the way from Product Development that will extend the concept of hot patching substantially. EBS 12.2, to be released in the near future, is expected to reduce downtime further by performing most patching activities while the system remains online and available to users. For example, a user will be able to continue entering an expense report while the Payables module is being patched. Online patching will be achieved through an editioning feature that creates a patchable copy of the production system, applies patches to that copy, and then switches users to it. Patches will be applied to a secondary file system, and a separate copy of all database code objects affected by the patches will be maintained. Once patches have been applied successfully, users will be moved over to the patched editions of the file system and the database. Patching downtime will then result solely from restarting the middle-tier services, and is expected to be measured in minutes rather than hours. Oracle IT will report on its results with these new capabilities once they are adopted.

Conclusion

The improvements made by Oracle to its patching processes reduced quarterly system maintenance downtime by 85%, from over 100 hours during the second quarter of 2010 to less than 15 hours in the last quarter of 2011.
In addition, these improvements enabled Oracle to upgrade its Global Single Instance of E-Business Suite from 12.1.1 to 12.1.3 with only 9 hours of downtime. Oracle IT recommends that customers with sizable deployments and a need to reduce scheduled downtime consider adopting the process changes and solution patterns that enabled these results. Oracle IT also recommends that customers begin evaluating the downtime reduction capabilities planned for E-Business Suite release 12.2.

Reducing Maintenance Downtime by 85%: Oracle's Internal Patch Automation and Process Improvements in a Heterogeneous Enterprise Application Deployment Including E-Business Suite

May 2012

Authors: Kishan Agrawal, Operation Excellence Manager; Vinay Dwivedi, Principal Product Manager; Jeffrey Pease, Vice President; Dave Stephens, Group Vice President

Oracle Corporation, World Headquarters, 500 Oracle Parkway, Redwood Shores, CA 94065, U.S.A.
Worldwide Inquiries: Phone +1.650.506.7000, Fax +1.650.506.7200, oracle.com

Copyright 2012, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark licensed through X/Open Company, Ltd.