White Paper. Business Service Management Solution




Eric Stinson, September 2005

Executive Summary

With services and Service Level Agreements (SLAs) being direct sources of revenue (or penalties) for service providers and the basis of business processes in enterprises, they must be managed effectively to ensure quality services are delivered and SLAs are met. It is no longer acceptable to provide component-level management of the network, applications, systems, or security in separate silos. In order to manage effectively, the IT staff needs real-time alarms on services as well as SLAs that are in danger of violation. These alarms must provide the root cause and the impact analysis, allowing them to be addressed quickly and in order of priority. Finally, organizations must have the ability to report on both service performance and SLA compliance.

CA's Network Service Manager application works in conjunction with CA's Network Fault Management (SPECTRUM) software platform to provide Business Service Intelligence: real-time and historical management of services, SLAs, and customers. The Service Manager application logically manages the IT infrastructure, tying network components, servers, and applications together into logical services such as email and other hosted applications, Internet access, WAN connectivity, and more. It also ties customers and SLAs to the services so faults can be prioritized based on the importance of the services, customers, and SLAs that are affected. Real-time alarms are generated, warning of service outages and impending SLA violations, including the root cause, allowing them to be addressed quickly before the business is severely impacted. The Service Manager application also provides historical reports showing past performance and details of degradations and outages, allowing the business to find new ways to improve the services over time.

The first step of any service-centric management initiative should be the discovery, definition and documentation of offered services by the provider of those services.
An example would be an IT department publishing a menu of available services such as email, remote access, CRM, ERP, etc. It is important to note, however, that only one service should be picked first to go through this process. Arguably that first service should be the one with the highest cost, or the greatest source of profitability. For many enterprises, the highest cost service is monthly recurring WAN services. For a service provider, the highest source of profitability might be pay-per-view video service. The key point is to pick one service or business process; model the relationships and dependencies; understand what is normal behavior; and then notify upon exception or deviation from that normal state. Document and communicate success on managing this first service before moving on to the next one.

Organizations that have been successful with Service Level Management projects have started small, shown success and then expanded in a controlled manner. The end goal is a cost-effective method to measure service levels in business terms, set objectives, predict results, publish reports and offer service-level guarantees which link IT infrastructure status and performance with business-oriented IT services.

Service Management Challenges

There are many issues facing service providers and enterprises today when it comes to managing the services they deliver and ensuring compliance with the SLAs that guarantee them. As service providers and enterprise businesses become increasingly reliant on the services their IT infrastructures provide, it is becoming important for them to manage the services, not just the individual infrastructure components such as routers, switches, servers, and applications. It is no longer good enough to know a router is down; businesses must know which services are down and which customers are impacted in order to prioritize fixes and keep the business running efficiently.
Every minute of downtime or degraded service costs money, so the issues must be addressed quickly and effectively.

Service providers are being forced to offer SLAs on the services they sell, either by competitive pressure in the marketplace or by the customers themselves. The SLAs from service providers typically include penalty clauses for violating the service guarantees. In order to remain competitive and profitable, it is important to continuously comply with the SLAs.

In addition to SLAs between external service providers and enterprises, the high reliance on IT services is pushing many enterprises to request or require SLAs from their internal IT department, along with reports to upper management showing the service levels and how they relate to the SLAs. In most environments, the IT department has no means to measure or report on these SLAs and is being forced to build customized applications in an attempt to deliver this. Custom development within a corporation is very expensive up front and even more expensive to maintain as things change quickly. To effectively offer and meet SLAs in both environments, Service Level Management must be implemented.

Service providers are also moving toward a model of proactive customer notification for outages. The problem is that when a router, switch, server, or application is degraded or fails, it is difficult to know exactly who is impacted. Service providers need to know the impact of each outage. Even in situations where proactive notification isn't taking place, call centers are swamped with phone calls when problems occur and need a way to quickly answer these calls or set up an automated message stating that the problem is known and giving the estimated time to correct it. Service Management is required to provide a dashboard view of services and customers with their status to allow call centers to be more efficient in these situations.

Finally, enterprises purchase many services from their service providers today but don't have an effective way to validate the service provider's SLAs. Many SLAs require the customer to notify the service provider of a breached SLA in order to be compensated for the violation. Without a Service Level Management system, it is difficult to ascertain when service levels are not met and to prove the claim.

Service Management Functionality

CA's Service Manager application meets today's challenges in proactively managing services and SLAs. It provides a service dashboard, service and SLA reports, fault impact analysis, and a flexible configuration interface.

Service Correlation

In line with the heavy reliance on IT services, the Service Manager application allows management in terms of the services and value the infrastructure provides, not just the components. This allows operators to understand problems with services, as well as devices, eliminating the tedious task of trying to figure out which services are affected. CA's Service Manager application provides real-time alarms on these services when they are down or degraded, allowing operators to respond quickly and enabling the entire business to run smoothly. The solution also measures the condition, statistics, events, and performance information for underlying service components, allowing flexibility to measure the things most important to your customers and business. This management information is correlated to a single logical service where the health and availability of that service is managed.
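As an illustration only, the idea of correlating component information into the health of a single logical service can be sketched in a few lines of Python. This is not the Service Manager implementation; the `Condition` levels, the `critical` flag, and the roll-up rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Condition(IntEnum):
    UP = 0
    DEGRADED = 1
    DOWN = 2


@dataclass
class Component:
    name: str
    condition: Condition
    # Hypothetical flag: does an outage of this component take the service down?
    critical: bool = True


def service_condition(components):
    """Roll component conditions up into one condition for the logical service."""
    worst = Condition.UP
    for component in components:
        if component.condition is Condition.DOWN and component.critical:
            return Condition.DOWN
        if component.condition is not Condition.UP:
            # Any impaired component (or a non-critical outage) degrades the service.
            worst = Condition.DEGRADED
    return worst


email_service = [
    Component("mail-server", Condition.UP),
    Component("wan-link", Condition.DEGRADED),
    Component("backup-relay", Condition.DOWN, critical=False),
]
print(service_condition(email_service).name)  # prints DEGRADED
```

A real roll-up would weigh statistics, events, and performance data rather than a three-level condition, but the shape is the same: many component states in, one service state out.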
By understanding all the components that make up the service, the events that affect the service, and the interrelationships, the Service Manager application is able to pinpoint the root cause of any service problems for faster resolution. To accomplish this, we have developed an advanced event correlation engine that correlates events across multiple silos (network, application, security, etc.) to detect root cause problems, even in cases where the root cause may not be directly monitored by the software. We first patented cross-silo correlation in 1998.

The service correlation is built on top of the SPECTRUM platform's correlation engine, allowing services to be correlated to faults, performance problems and events that affect the service even if they occur on components not manually added to the service definition. This is done by extending root cause and impact analysis to include things that impact any service component. For example, if a service is defined to include a server, remote performance tests, and a WAN link, and then a router goes down that causes the server to go down, CA's Service Manager application will report the router as the root cause of the outage. Further, the router alarm will indicate that it is impacting the service and that service's customers, even though the router was not manually defined as being part of the service.

Since all services must be maintained, the Service Manager application provides configurable maintenance windows for managed services. These may be determined ahead of time and scheduled during specific times of specific days, or can be scheduled ad-hoc to allow maintenance of a service. This maintenance window is also factored into SLA calculations where the service's performance during maintenance is not counted against the SLA.
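The effect of a maintenance window on SLA calculations can be shown with a small Python sketch. It is a hedged illustration, not the product's algorithm: the function names are hypothetical, and it assumes maintenance windows do not overlap one another.

```python
from datetime import datetime, timedelta


def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two intervals (zero if they are disjoint)."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return max(end - start, timedelta(0))


def countable_downtime(outages, maintenance_windows):
    """Outage time that falls outside every maintenance window.

    Assumes the maintenance windows do not overlap one another.
    """
    total = timedelta(0)
    for o_start, o_end in outages:
        excused = sum(
            (overlap(o_start, o_end, m_start, m_end)
             for m_start, m_end in maintenance_windows),
            timedelta(0),
        )
        total += (o_end - o_start) - excused
    return total


day = datetime(2005, 9, 1)
outages = [
    (day + timedelta(hours=1, minutes=30), day + timedelta(hours=3)),
    (day + timedelta(hours=10), day + timedelta(hours=10, minutes=20)),
]
maintenance_windows = [(day + timedelta(hours=2), day + timedelta(hours=4))]

print(countable_downtime(outages, maintenance_windows))  # prints 0:50:00
```

Of the first 90-minute outage, the hour that falls inside the 02:00 to 04:00 window is excused, so only 30 minutes of it, plus the 20-minute second outage, count against the SLA.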
CA's Service Manager application takes maintenance windows a step further by allowing the user to examine the service at its component level to determine the allowed maintenance window for each component, so that a maintenance window can be found that affects the fewest number of services for the least amount of time.

Proactive Service Level Agreement Management

As most service providers are currently offering SLAs, and enterprises are moving toward offering SLAs to their internal customers, an SLA management tool is needed to ensure SLAs are met, eliminating the need to pay penalties for noncompliance. CA's Service Manager application addresses this need by offering proactive SLA management. It monitors the SLAs that are guaranteeing the managed services and generates early warning alarms and trending to indicate pending SLA violations. This allows particular attention to be paid to that SLA to ensure it is not violated.

Further, the Service Manager application offers web-based reports for services and SLAs. Service providers and enterprises may use these reports to verify and validate the service performance to their customers. They may also be used by an enterprise to understand if the SLAs with their service providers are violated and if so, provide the information required to validate the claim.

CA's solution allows SLAs to be made up of many different guarantees such as availability, performance, Mean Time Between Failures (MTBF), Mean Time To Repair (MTTR), and other metrics based on statistics gathered by network, systems, and application managers. This provides the flexibility required to monitor the real contract between the provider and the user.
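To make the idea of an early warning concrete, the following Python sketch checks a single availability guarantee part-way through an SLA period. It is an assumption-laden illustration: the function name, the three state labels, and the straight-line extrapolation of downtime are choices made for this example, not a description of how the Service Manager application computes its trends.

```python
def sla_outlook(period_hours, elapsed_hours, downtime_hours, availability_target):
    """Compare downtime so far with the downtime an availability guarantee allows.

    The projection assumes downtime keeps accruing at the rate seen so far
    (a simple linear extrapolation, chosen only for illustration).
    """
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    allowed = period_hours * (1.0 - availability_target)
    remaining = allowed - downtime_hours
    projected = downtime_hours * period_hours / elapsed_hours
    if remaining < 0:
        state = "violated"
    elif projected > allowed:
        state = "at risk"
    else:
        state = "on track"
    return {
        "allowed_hours": allowed,
        "remaining_hours": remaining,
        "projected_hours": projected,
        "state": state,
    }


# A 30-day period (720 hours) with a 99.5% availability guarantee,
# checked after 10 days with 1.5 hours of countable downtime.
outlook = sla_outlook(period_hours=720, elapsed_hours=240,
                      downtime_hours=1.5, availability_target=0.995)
print(outlook["state"])  # prints at risk
```

Here the guarantee allows roughly 3.6 hours of downtime for the period. Only 1.5 hours have been used, so the SLA is not yet violated, but at the current rate the period would end with about 4.5 hours, which is what an early warning is meant to surface.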

Figure 1: SPECTRUM OneClick Service Dashboard

Service Dashboard

The Service Manager application includes a Service Dashboard that provides the appropriate information for use by executives, business managers, call centers, and customers. This dashboard is built on the SPECTRUM OneClick architecture for easy access, distribution, and administration. The dashboard provides at-a-glance information for services, customers, and SLAs and is configurable through preferences and favorites to allow users to view the information most important to them.

In addition to real-time status, the dashboard provides summary information for services and customers, including MTBF, MTTR, time since last outage, and more. When a service or customer is in a problem state, details are provided to show the assigned troubleshooter, duration of outage, affected users, and more. The current state of the SLA is given, with details available such as total down or degraded time counted against the SLA and the time left before violation. The dashboard also provides SLA trending, showing if and when an SLA is in danger of violation before the end of the period based on the performance and availability trends to date for the SLA period. From the Service Dashboard, users have direct links to detailed historical reports, and OneClick Console users have direct links into root cause analysis details and troubleshooting information.

Historical Service and SLA Reports

In addition to real-time Service Level Management, the Service Manager application provides historical reporting on both services and SLAs to meet the needs of the service provider and enterprise organizations. These reports include summary data and allow drilling into detailed information. The Service Manager application provides role-based reports for customers, many levels of management, and IT.

Service Summary reports include the total up, down, or degraded time (defined by health index) for that service as well as the MTBF and MTTR. Drilling into the details will show each outage or degraded performance state, as well as the root cause. These can be used by both upper management and administrators to track services, gauge their health, and look for methods to improve service performance.

Figure 2: Customer SLA Detail Report

Service Component Outage reports show all infrastructure outages that have impacted services and SLAs. Traditional outage reporting shows all the components that have had outages over the report timeframe, and can sort them based on the number of outages. However, not all these outages necessarily impacted business services. By correlating the outages to the impact on services and SLAs, the Service Component Outage reports indicate where to focus infrastructure improvements to provide the highest impact on improving business services.

SLA Summary reports show the high-level status of SLAs on a customer-by-customer basis to show if the SLA is being met or violated. Drilling into these SLA reports shows the details of each guarantee associated with the SLA such as availability, response time, or other statistics. Further drilling also shows the specific events that have affected the SLA and their root cause. The SLA reports allow the administrator to mark specific events as SLA-exempt when that event should not count against the SLA. An example of this is where a service provider guarantees an Internet connection with a stipulation that the customer provides power to the CPE device and that power will not be lost at the customer site. An outage caused by a loss of power at the customer site is obviously not the fault of the service provider and should not count against the SLA.

Figure 3: Service Health Report

Inventory reports show the inventory of services and SLAs. For services, the inventory includes the service components and their relationships to the service. SLA Inventory reports include a list of SLAs along with the details of the guarantees that make them up.
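The summary figures in these reports follow from a list of outage records. The Python sketch below is one common way to derive them, stated as an assumption rather than as the product's method: MTTR is taken as total downtime divided by the number of outages, and MTBF as total uptime divided by the number of outages. The function name and the record layout are hypothetical.

```python
from datetime import datetime, timedelta


def service_summary(period_start, period_end, outages):
    """Summarize a reporting period from non-overlapping (start, end) outages.

    Assumes every outage lies inside the period and that outages do not overlap.
    """
    down = sum((end - start for start, end in outages), timedelta(0))
    up = (period_end - period_start) - down
    count = len(outages)
    return {
        "up": up,
        "down": down,
        # With no outages in the period, neither mean is defined.
        "mttr": down / count if count else None,
        "mtbf": up / count if count else None,
    }


week_start = datetime(2005, 9, 1)
week_end = week_start + timedelta(days=7)
outages = [
    (week_start + timedelta(days=1, hours=2), week_start + timedelta(days=1, hours=3)),
    (week_start + timedelta(days=4, hours=10), week_start + timedelta(days=4, hours=13)),
]

summary = service_summary(week_start, week_end, outages)
print(summary["mttr"], summary["mtbf"])
```

For this week, the two outages total 4 hours of downtime, which gives an MTTR of 2 hours and an MTBF of 82 hours. Events marked SLA-exempt could be filtered out of `outages` before calling the function when the same arithmetic is applied to an SLA rather than to a service.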

Fault Impact Analysis

To facilitate proactive notification as well as understand the impact of faults in the infrastructure, we built complete fault impact analysis into the Service Manager application. The impact analysis is done on all faults that affect a service, including performance, response time, network device outages, application problems, and more, by showing exactly what services, SLAs, and customers are affected by each fault. The Service Manager application also stores the customer contact information for each customer, allowing the Service Desk to proactively notify customers of problems with their services.

For troubleshooters and technicians, the Service Manager application goes a step further with impact analysis and calculates the impact severity of infrastructure resource problems based on the priority assigned to the services, customers, and SLAs affected by the resource. This allows businesses to address all issues in a prioritized manner, ensuring they don't spend valuable time on an issue that will cause little pain at the expense of the more important issues. It also enables users to quickly and efficiently access information such as root cause, symptomatic events, correlated events, real-time performance graphs and more, allowing them to quickly fix the problem and get the service back to normal condition.

Flexible Configuration

We designed the Service Manager application for use in the large, dynamic environments typical of service providers and large enterprises. The solution has flexible configuration options including Service and SLA templates, automatic updates and monitoring as new devices are added to a service, and an Operations Support Systems (OSS) integration gateway.

Service and SLA templates are available to aid in up-front configuration as well as ease the burden of frequent adds/moves/changes. These templates are fully customizable for any environment and allow the most consistent information to be stored and used to create services and SLAs by simply adding the unique information to that service or SLA. In order to simplify changes, the Service Manager application allows changes to the templates to be globally applied to all services and SLAs created from them. For example, if the Gold SLA is changed to provide a 20ms response time vs. the original 25ms response time, the change can be made in one location and then globally applied to all SLAs created from the Gold SLA template.

Figure 4: SPECTRUM OneClick Console

In addition to templates, the Service Manager application provides the ability to automatically update services and SLA metrics as devices, systems, or applications are added, deleted, or changed within a service. It uses dynamic service policies to automatically add or delete objects from the service and monitor them appropriately based on object attributes as the objects within the service are changed. This eliminates the need to manually reconfigure the services that depend on a specific object as the object is deleted or changed, or to manually reconfigure them to use a new object that is added to the infrastructure.

Finally, for service providers with OSS systems currently in place, we provide a gateway to import information from the OSS into the Service Manager application to create both services and SLAs. This removes the need to recreate information or manually transfer it from one system to another.

Conclusion

The management paradigm is shifting from managing components within silos to managing the services provided by IT infrastructures, and the customers they serve. Without making this shift, it is hard for companies to live up to the business needs that rely on the IT infrastructure. CA's Service Manager application seamlessly manages services, customers, and SLAs in order to provide the required quality of service and ensure success.
About the Author

Eric Stinson is a Principal Product Manager for CA's Enterprise Systems Management business unit.

Copyright 2006 CA. All rights reserved. All trademarks, trade names, service marks and logos referenced herein belong to their respective companies. This document is for your informational purposes only. To the extent permitted by applicable law, CA provides this document AS IS without warranty of any kind, including, without limitation, any implied warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event will CA be liable for any loss or damage, direct or indirect, from the use of this document, including, without limitation, lost profits, business interruption, goodwill or lost data, even if CA is expressly advised of such damages. MP28912