Brochure Reduce DRP errors through IT process automation HP Operations Orchestration
Overview
Compliance requirements and basic business continuity considerations have made it imperative for organizations like yours to develop, implement, and continuously test disaster recovery procedures. These procedures must also be tested every time a major change is implemented on an application, which escalates the number of tests that must be done on each application. Testing disaster recovery procedures is a resource-intensive task that involves multiple subject matter experts (SMEs) from many different, disconnected organizations within IT. A typical test for a large organization can involve dozens of people on multiple conference calls for up to a full day, and can result in rapidly escalating costs.
In addition to automating the testing of disaster recovery procedures, HP Operations Orchestration (HP OO) software can reduce errors in disaster recovery planning (DRP) by performing a number of repetitive and tedious tasks. Institutionalizing the disaster recovery procedure in an HP OO workflow not only helps communicate and document the procedure but also decreases dependencies on specific individuals or groups. Finally, by automating a large number of disaster recovery tasks, HP OO can help drive down the cost of performing DRP. For example, one midsize HP OO customer reduced the failover time for an internal application from 45 minutes to less than 9 minutes. [1]
[1] This customer example was collected by HP in February 2011.
What is DRP?
DRP is the ability of an organization to offer continuity of service and support for its customers and to maintain its viability before, during, and after a business disruption event. DRP is becoming even more critical for IT: organizations increasingly differentiate themselves through technology, which raises operational risk, and as today's technology grows more complex, more things can go wrong. IT Service Continuity Planning, Business Continuity Planning, and Business Process Contingency Planning are other common terms used interchangeably with DRP.
Current state of the industry
In the initial disaster recovery survey, conducted in October 2007, Forrester Research and the Disaster Recovery Journal (DRJ) jointly surveyed 250 disaster recovery decision-makers and influencers at global businesses. The study found that almost 54 percent of the respondents expected a recovery time of more than 10 hours for mission-critical applications and data in the event of a primary data center site failure. For business-critical and noncritical applications, 72 percent and 77 percent respectively expected the recovery time to be greater than 10 hours. Another notable statistic: more than 47 percent of the respondents to the DRJ/Forrester survey rated themselves between somewhat prepared and not prepared to recover in the event of a site failure. [2]
That first study provided a baseline for business continuity preparedness that can now be compared with the yearly studies that followed, to see how business continuity maturity and preparedness are trending over time. In the 2011 study, Forrester Research found that companies had made tremendous progress in their business continuity planning: 72 percent reported that they had established programs in place. [3] Despite technology advancements making disaster recovery less expensive and easier to implement, the survey found that reported recovery times had actually lengthened, although the amount of data lost during disasters and other major outages went down slightly.
Forrester's survey also found that companies are concerned about their increasing reliance on technology. When asked whether they felt the overall level of risk was increasing and, if so, what was driving the increase, respondents replied that the number one driver was reliance on technology (48 percent). The increasing complexity of business processes, coupled with a reliance on third parties, further complicates the ability to cleanly recover an end-to-end business process.
[2] Source: What your business can learn about disaster recovery from financial institutions, by Stephanie Balaouras with Simon Yates, Rachel Batiancila, and Rachel A. Dines, Forrester, July 2008.
[3] Source: The state of business continuity preparedness, by Stephanie Balaouras, Forrester, January 2012.
Why do businesses care about disaster recovery?
Disasters can strike anywhere in your business environment. Natural disasters such as earthquakes, tsunamis, hurricanes, fires, and floods can happen at any minute, and human error and hardware and software failures are just as unpredictable. Proactive planning is the best way to prepare. Every business, regardless of size, needs disaster recovery protection. When disaster strikes, your data must be accessible from an offsite location, and for an unknown length of time. Staff may be limited and time will be of the essence, so you need to know you're protected. And with every minute your system is down, the financial implications grow.
The opportunity for IT organizations is to enhance the success rate and effectiveness of the disaster recovery processes by automating as much of the process as possible. Automate for quality and reap the productivity benefits.
Pain points in DRP
To stay disaster ready, an organization needs to keep its disaster recovery procedures updated and in tune with the changes in its IT infrastructure and business environment. In addition, the plans need to be tested regularly to verify that things work as planned.
IT complexity: testing disaster recovery procedures for a medium to large organization is an extremely resource-intensive exercise. This exercise must be performed not just once at the inception of the disaster recovery plan, but also every time a major upgrade is applied to a critical system. As the number of systems managed by IT increases, so does the number of upgrades and the need to test disaster recovery after each of them.
Resources: every disaster recovery exercise (real or simulated) requires representation from every system being tested. This includes architects, administrators, network personnel, developers, and data center operators. The exercise requires all aspects of IT systems, processes, and people to be tested for preparedness. Communication mechanisms, both local and remote, need to be set up; participants need to follow established procedures so that everyone understands their role; and all systems and contingencies need to be planned and defined.
Service-level agreement (SLA): IT organizations have recovery time SLAs with internal customer organizations in the event of a catastrophe. Considering the complexity involved in coordinating systems, people, and processes for a successful failover or recovery, even a single misstep can lead to noncompliance with the internal SLAs. This can lead to business disruption, loss of revenue and productivity, or added costs for complex business processes.
The opportunity for IT organizations is to enhance the success rate and reduce the resources expended on DRP by automating as much of the process as possible.
Components of DRP
There are many standards, methodologies, and compliance regulations around DRP (such as ISO 22301:2012 and ISO 27031:2011), and most of them cover the following essential phases:
Business impact analysis: identify the financial impact over time resulting from loss of a business process; includes identification and ranking of plausible events that could disrupt business processes.
Plan development: define a plan that includes the people, processes, and systems that will be involved in a disaster recovery event.
Training, testing, and exercising: train all participants in the disaster recovery procedures, and then test plans on an ongoing basis through simulated exercises.
Continuous improvement: review and update the disaster recovery plan based on testing and exercising.
While IT process automation tools can play an important role in all phases of DRP, they can add most value in the testing and prevention phases.
How can HP OO help in DRP?
HP OO helps automate standard IT tasks and integrates critical management systems in the data center. By executing and reporting on HP OO's ready-to-use workflows, you can standardize IT processes, improve quality, reduce errors, and cut costs.
Any disaster recovery or failover exercise requires a number of tasks that need to be performed in a very specific sequence. However, these tasks can span a number of different IT domains (server, network, storage, and others) and can require a number of different SMEs (network engineers, database administrators, server administrators, and others). The success of the disaster recovery exercise relies on successful coordination and handoffs between these multiple SMEs, which is inherently risky.
HP OO helps in the DR process by creating workflows that tie together diverse tools, processes, and domains so the risk of failure is significantly reduced. In addition, because the process information is captured in the workflows, any risks arising from the unavailability of key personnel or groups are reduced.
The following example lists the failover steps required for a catastrophic failure on an email system. Figure 1 shows an HP OO workflow, along with a set of subflows, that implements these recommended steps:
1. The DR event is declared (real or test).
2. Verify that the change requests in service desk systems (such as HP Service Manager) are approved.
3. Verify that the network is operational.
4. Validate the health of the destination systems, including server and storage.
5. Verify that the configuration of the destination system is the same as that of the source system, including databases (SQL Server), application servers (Exchange), and Web servers.
6. Clone the destination server if the source and destination are not the same.
7. Disable monitoring and clustering on the primary systems.
8. Perform failover tasks:
   a. Disconnect users and disable new connections.
   b. Open connections into destination systems.
   c. Reroute Domain Name System (DNS) entries to point to destination systems.
   d. Deactivate primary systems.
9. Validate the availability of service for the new system.
10. Update the change request ticket in the service desk system.
11. Update the configuration management database (CMDB) with current status, and view reports to verify that failover completed successfully.
12. Re-enable monitoring and clustering.
13. Notify users and stakeholders.
14. Declare the DR event complete.
The steps listed in the previous section can be represented in an HP OO workflow, as shown in figure 1. The workflow may be triggered when a change ticket declaring the DRP event is approved.
Figure 1: Implementation of a disaster recovery process using HP OO
[Figure: main workflow diagram, with callouts 1 to 14 matching the numbered steps. Step labels: Take ownership of DR event; Check trouble ticket approval; Email approver; Waiting for approval; Network health check; Notify on-call; Destination Exchange server health check; Compare source and destination server configs; Clone destination server; Remove from Exchange cluster; Disable monitoring; Failover from source to destination; Validate destination availability; Update CMDB; Update ticket; Add to Exchange cluster; Enable monitoring; Notify SA group; Acknowledge DR event. Transition labels: not approved; successful; servers match; servers do not match; network issues; aborted. End states: No action taken: Pending approval; No action taken: Escalated; Diagnosed: Network issue; Error; Resolved.]
Figure 1 (continued): subflow for step 6, Clone destination server
[Figure: subflow diagram. Step labels: Copy Exchange application files; Configure logging; Register Exchange services; Notify Exchange SA; Failover from source to destination. End states: Error; Resolved.]
Figure 1 (continued): subflow for step 8, Failover from source to destination
[Figure: subflow diagram. Step labels: Refuse new Outlook connections; Wait for all transactions to bleed off server; Enable connections to destination server; Re-route DNS to destination; Take source server offline; Notify network admin; Notify SA group; Manual SA decision (Continue with failover or Abort failover); Notify SA. End states: Diagnosed: Network issue; Error: Failover aborted; Resolved.]
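The ordering and branching in the workflow can be illustrated with a short Python sketch. This is not HP OO code or its API; the `Step` class, `run_workflow`, and `build_demo` are hypothetical names, and every check is a stand-in that simply returns a fixed value. It only shows the general idea that each step must succeed before the next one runs, that a configuration mismatch can be remedied (here, by a placeholder for cloning the destination server), and that any other failure aborts the run while leaving an audit log.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Check = Callable[[], bool]  # a step action returns True on success


@dataclass
class Step:
    name: str
    action: Check
    remedy: Optional[Check] = None  # optional fix-up tried before aborting


def run_workflow(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; abort at the first failure that cannot be remedied."""
    log: List[str] = []
    for step in steps:
        if step.action():
            log.append(f"OK: {step.name}")
            continue
        if step.remedy is not None and step.remedy():
            log.append(f"REMEDIED: {step.name}")
            continue
        log.append(f"ABORTED: {step.name}")
        return False, log
    return True, log


def build_demo(network_up: bool, configs_match: bool) -> List[Step]:
    """Hypothetical stand-ins for the email failover steps listed above."""
    return [
        Step("Check change ticket approval", lambda: True),
        Step("Network health check", lambda: network_up),
        Step("Destination server health check", lambda: True),
        Step(
            "Compare source and destination configs",
            lambda: configs_match,
            remedy=lambda: True,  # placeholder for "Clone destination server"
        ),
        Step("Disable monitoring and clustering", lambda: True),
        Step("Fail over from source to destination", lambda: True),
        Step("Validate destination availability", lambda: True),
        Step("Update ticket and CMDB", lambda: True),
        Step("Re-enable monitoring and clustering", lambda: True),
        Step("Notify users and stakeholders", lambda: True),
    ]


if __name__ == "__main__":
    ok, log = run_workflow(build_demo(network_up=True, configs_match=False))
    print("\n".join(log))
    print("failover complete" if ok else "failover aborted")
```

In a real deployment each lambda would be replaced by a call into the relevant service desk, monitoring, or server tooling; the returned log plays the role of the audit trail.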
Benefits of using HP OO for DRP
Significant cost savings: HP OO helps significantly reduce the amount of time it takes to perform disaster recovery. One large customer has reduced the time to execute a single business-critical application failover from 215 staff hours to 45 staff hours, a saving of about 79 percent. At $100 USD per hour, that translates to a saving of $17,000 USD in pure staff hours. This does not yet include the costs of the errors from manual execution. [4] Another midsize HP OO customer cut the failover time of its database infrastructure from over 45 minutes to less than 9 minutes. [5] This customer also realized annual savings of over $100,000 USD per flow that was running in the environment.
Improved customer satisfaction: quicker failover and less downtime reduce the impact of a catastrophe on customers, resulting in more satisfied customers.
System of record for IT process automation: use of HP OO forces the experts to build their knowledge into the workflow, thereby creating a central repository for knowledge related to disaster recovery procedures. This reduces the risk posed by an absent employee when disaster strikes.
Enhanced process quality: use of HP OO forces the experts to change their mindset from working on something until the job gets done, no matter what, to defining and agreeing on a single, predictable, and repeatable process that can be automated.
[4], [5] These customer examples were collected by HP in February 2011.
Reduced human errors: after knowledge has been translated into workflows inside HP OO, errors are less likely to occur when a real-life failover needs to be performed. Human errors related to stress, distractions, coffee breaks, lost post-it notes, and the like are largely eliminated by institutionalizing the information in HP OO.
Increased frequency of testing: after knowledge has been codified into HP OO workflows, the frequency of disaster recovery testing can be increased to keep pace with increasing system upgrades, regulatory requirements, or internal compliance mandates.
Operations-led testing: HP OO allows system credentials to be embedded in flows, so that specified workflows can be executed securely without passwords being traded. This means that the workflows can be run by operations personnel without requiring application specialists or database administrators, further reducing the staff required for testing and lowering operational risk.
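The arithmetic behind the customer figures quoted under cost savings can be checked with a few lines of Python. The function names are illustrative only; the inputs are the brochure's own numbers (215 and 45 staff hours, $100 USD per hour, 45 and 9 minutes). Because the failover times are quoted as "over 45" and "less than 9" minutes, the second percentage is a lower bound rather than an exact value.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


def staff_cost_saving(before_hours: int, after_hours: int, hourly_rate: int) -> int:
    """Cost of the staff hours no longer needed."""
    return (before_hours - after_hours) * hourly_rate


# Large customer: failover effort in staff hours, at $100 USD per hour.
hours_pct = percent_reduction(215, 45)
dollars = staff_cost_saving(215, 45, 100)
print(f"Staff hours cut by {hours_pct:.0f}%, saving ${dollars:,} USD")

# Midsize customer: failover time in minutes.
minutes_pct = percent_reduction(45, 9)
print(f"Failover time cut by at least {minutes_pct:.0f}%")
```

The staff-hour reduction works out to roughly 79 percent and $17,000 USD, and the failover-time reduction to at least 80 percent.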
Content and features that support DRP
HP OO includes a comprehensive set of features that has enabled customers to implement disaster recovery workflows in their environment.
Features: HP OO provides a powerful visual workflow authoring tool with an enhanced debugger for creating brand new workflows or modifying the ready-to-use workflows. HP OO can also leverage existing scripts written in Perl, JavaScript, and VBScript to create workflows. Multi-authoring capabilities allow multiple users to collaborate while writing and testing flows. The HP OO visual workflow also benefits organizations during the DRP process by allowing nondevelopers to review and understand the entire end-to-end automated process.
HP OO allows flows to be executed completely automatically, to be scheduled, or to be run in a guided mode with input prompts if necessary. This provides flexibility while planning and writing workflows.
HP OO also includes automatic audit trails of workflows executed in the environment. You can use the out-of-the-box dashboard reports or create new ones.
Table 1 provides a list of the typical tasks performed using HP OO and the features that support these tasks.
Table 1. HP Operations Orchestration tasks
Author: Drag-and-drop workflow design tool; Ready-to-use flow templates; Ready-to-use integration adapters; Built-in debugger; Direct script import
Deploy: Publish and deploy; Workflow sharing import and export; Document generator; Enterprise security model; Single sign-on integration
Run: Visually guided mode; Fully automated mode; Scheduled mode; Gated transitions; Browse and search in browser user interface
Report: Automatic audit trails; Ready-to-use dashboard reports; Mean time to repair (MTTR) trending reports; Built-in return on investment (ROI) calculator; Dynamic drilldown; Ready-to-use ITIL reports; Custom reporting
Content: Finally, HP OO includes over 4,000 ready-to-use operations, workflows, and integration adapters. The included operations and workflows let you run flows on many different platforms and products. The comprehensive coverage of integration adapters for management products offers the freedom to use existing products without major tweaks or reprogramming.
Table 2 provides a list of the important Accelerator Packs and Integrations in HP OO.
Table 2. Accelerator packs (workflow templates) and integrations
Ready-to-use workflow templates
Cloud: OpenStack, Amazon EC2, HP Cloud, Savvis
Virtualization: VMware server, VMware vSphere, XenServer, KVM, Windows Hyper-V, Microsoft VMM, Microsoft Cluster
Operating systems: Microsoft Windows, Red Hat Linux, SUSE Linux, FreeBSD, Solaris, AIX
Application servers: BEA WebLogic, Citrix Presentation Server, JBoss, Apache Tomcat, IBM WebSphere
Network: Cisco
Databases: Oracle, Microsoft SQL Server, Sybase
Others: Microsoft Exchange, F5, Active Directory, Internet Information Services
Ready-to-use integration adapters
Service desk: HP Service Desk, HP Service Manager, BMC Remedy, HP Peregrine Service Center, CA Service Desk, ServiceNow
Monitoring: HP OpenView Operations, HP Operations Manager, HP Network Node Manager, BMC Patrol, IBM Netcool, CA Network and Systems Management, IBM Tivoli, Microsoft Operations Manager, Microsoft System Center Operations Manager
Configuration and change: HP Server Automation, HP Network Automation, HP Client Automation, HP Storage Essentials, Microsoft SMS, Symantec Altiris
CMDB: HP Universal CMDB, BMC Atrium, CA CMDB, ServiceNow
Summary
HP Operations Orchestration helps you automate the testing of disaster recovery procedures and can also reduce errors in DRP by performing a number of repetitive and tedious tasks. By automating a large number of disaster recovery tasks, HP OO can drive down the cost of performing DRP. These cost savings and error reductions are possible because HP OO includes a comprehensive set of features that has enabled customers to implement disaster recovery workflows in their environment.
Global citizenship at HP
At HP, global citizenship is our commitment to hold ourselves to high standards of integrity, contribution, and accountability in balancing our business goals with our impact on society and the planet. To learn more, visit hp.com/hpinfo/globalcitizenship, and for information about HP environmental programs, go to hp.com/environment.
Learn more
Automate the tasks and processes in your private, hybrid, and public cloud and traditional IT environments by using HP Operations Orchestration software. Visit hp.com/go/oo
HP Services
Get the most from your software investment. We know that your support challenges may vary according to the size and business-critical needs of your organization. HP provides technical software support services that address all aspects of your software lifecycle. This gives you the flexibility of choosing the appropriate support level to meet your specific IT and business needs. Use HP cost-effective software support to free up IT resources, so you can focus on other business priorities and innovation.
HP Software Support Services gives you:
One stop for all your software and hardware services, saving you time with one call, 24x7, 365 days a year, and offering you support for VMware, Microsoft, Red Hat, and SUSE Linux as well as HP Insight Software
Fast answers, giving you technical expertise and remote tools to access fast answers, reactive problem resolution, and proactive problem prevention
Global reach and a consistent service experience, giving you global technical expertise locally
For more information, go to hp.com/services/softwaresupport.
For more information
To learn more about HP Software Customer Connection, a one-stop information and learning portal for software products and services, visit hp.com/go/swcustomerconnection.
Get connected
hp.com/go/getconnected
Get the insider view on tech trends, support alerts, and HP solutions
Copyright 2008, 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. Oracle is a registered trademark of Oracle and/or its affiliates.
4AA2-2314ENW, Created November 2008; Updated August 2012, Rev. 1
This is an HP Indigo digital print.