Incident Management Best Practices Chris Pope. Global Service Delivery Manager Global Managed Services Column Technologies.


Incident Management Best Practices
Chris Pope, Global Service Delivery Manager, Global Managed Services, Column Technologies
February 2009

Agenda & Objectives
1. Incident Management Overview
2. Changes in Incident Management & Service Operation
3. Real World Incident Management and what really happens
4. Improvement
5. Use Case
6. Metrics
7. Documentation Examples
8. Tips
9. Questions

Incident Management: What does ITIL say?
Definition
An unplanned interruption to an IT service or reduction in the quality of an IT service. Failure of a configuration item that has not yet impacted service is also an incident, for example failure of one disk from a mirror set.
Incident management is the process for dealing with all incidents. This can include failures, questions or queries reported by the users (usually via a telephone call to the service desk), by technical staff, or automatically detected and reported by event monitoring tools.
Goal / Objective
The primary goal of the Incident Management process is to restore normal service operation as quickly as possible and minimize the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained.

ITIL V3 and Service Operation
New terms and definitions added that define inputs and triggers to Incident Management.
Event Management
An event can be defined as any detectable or discernible occurrence that has significance for the management of the IT infrastructure or the delivery of IT services and the evaluation of the impact a deviation might cause to the services. Events are typically notifications created by an IT service, Configuration Item (CI) or monitoring tool.
Types of Events
- Informational: an event that does not require any action and does not represent an exception (a user logs on successfully, a batch job completes, a device comes online).
- Warning: an event that is generated when a service or device is approaching a threshold (memory utilization, network collision rate is x).
- Exception: an event that means a service or device is currently operating abnormally (however that has been defined). Typically this means an SLA/OLA has been breached and the business is being impacted.
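The three event types above can be sketched as a simple threshold check. This is only an illustration: the names `EventType`, `Thresholds` and `classify_event`, and the 80% / 95% memory figures, are assumptions made for the example and do not come from ITIL or from the slides. Real thresholds would be derived from the SLA/OLA agreed for each service.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    INFORMATIONAL = "informational"  # no action required
    WARNING = "warning"              # approaching a threshold
    EXCEPTION = "exception"          # operating abnormally


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits for one metric on one service or device.
    warning: float
    exception: float


def classify_event(value: float, limits: Thresholds) -> EventType:
    """Map a metric reading (e.g. memory utilization in percent) to an event type."""
    if value >= limits.exception:
        return EventType.EXCEPTION
    if value >= limits.warning:
        return EventType.WARNING
    return EventType.INFORMATIONAL


# Assumed example limits for memory utilization, in percent.
memory_limits = Thresholds(warning=80.0, exception=95.0)

for reading in (42.0, 85.0, 97.5):
    print(reading, classify_event(reading, memory_limits).value)
```

In a setup like this, only the exception case would normally be a candidate for raising an incident; warnings give the team time to act before service is affected.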

ITIL V3 and Service Operation
Request Fulfillment
The term Service Request is used as a generic description for many varying types of demands that are placed upon the IT department by the users. Many are actually small changes (low risk, frequently occurring, low cost, etc.), such as a password reset or installing additional software, or maybe just a question requesting information.
Goal / Objective
- To provide a channel for users to request and receive standard services for which a pre-defined approval and qualification process exists.
- To provide information to users and customers about the availability of services and the procedure for obtaining them.
- To source and deliver the components of requested standard services.
- To assist with general information, complaints and comments.

Day to Day Incident Management
- Structured, documented, repeatable process
- Clear guidelines on what to do and when
- Everybody knows what they are doing and their role
- Everybody has been trained on the latest version of the tools
- The boss is in his office waiting for updates
- The business understands what is happening and that IT is working to restore service as soon as possible

Real Day to Day Incident Management
- Emails, phone calls, network events, desktop issues
- Day to day pressures of projects, tasks, changes, new initiatives
- Events, incidents, ticket? Phone calls?
- Unable to identify impact
- Multiple events
- Time of day?
- Difficult to focus on service restoration
- Poorly integrated tools consume time and resources
- Data quality challenges hinder impact and risk analysis
- Poor communication processes consume time and resources
- Increased MTTR and service restoration time

How can we improve the process?
Training
- Major focus for process improvement
- Roles and responsibilities clearly defined
- People know what to do and when
- Continuous improvement
Communication
- Clearly define communication tools
- Escalation mechanism
- Ensuring your customer is aware at all times
- Full contact, full exposure
Tools / Documentation
- Ease of use
- Focused and relevant
- Fit for use and purpose
- Simple, reliable, readily available
- Up to date
- Has clear ownership
- Quality & consistency
Integrations
- Drive decisions more intelligently
- Single pane of glass
- What can help you reduce MTTR and restore service quickly
Data Quality
- Is the data accurate?
- Clear ownership
- Refresh cycle and maintenance
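Since reducing MTTR is the stated aim, it helps to be explicit about how the figure is computed. The sketch below reads MTTR as mean time to restore service, which is an assumption, since the slides do not expand the acronym. The function name `mean_time_to_restore` and the sample timestamps are hypothetical and exist only to show the arithmetic.

```python
from datetime import datetime, timedelta
from typing import Optional


def mean_time_to_restore(incidents) -> Optional[timedelta]:
    """Average of (restored - opened) over restored incidents.

    `incidents` is an iterable of (opened, restored) datetime pairs, where
    `restored` is None while the incident is still open. Open incidents are
    excluded. Returns None if no incident has been restored yet.
    """
    durations = [
        restored - opened
        for opened, restored in incidents
        if restored is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Made-up sample data: two restored incidents and one still open.
sample = [
    (datetime(2009, 2, 2, 9, 0), datetime(2009, 2, 2, 9, 30)),    # 30 minutes
    (datetime(2009, 2, 3, 14, 0), datetime(2009, 2, 3, 15, 30)),  # 90 minutes
    (datetime(2009, 2, 4, 8, 0), None),                           # still open
]

print(mean_time_to_restore(sample))  # 1:00:00
```

The result is only as good as the open and restore timestamps on the tickets, which is one reason the data quality questions above matter.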

Use Case
Background
- Global financial company
- 25000+ employees
- 3500+ applications
- 20000+ servers
Specifics
- IT division with 1100+ developers
- 300+ applications support the trading floor
- 24x7 globally operational, trading on all markets
- Focused on raising the bar; new functionality supersedes existing bugs/issues
- Little to no incident management structure in place
Action Taken
1. Identified key stakeholders by function for both the business and IT
2. Empowered stakeholders to be able to make decisions and resolve conflicts
3. Instigated a global training plan, mandatory attendance
4. Agreed on what constitutes a Low, Medium, High, Critical incident
5. Integrated tools focusing on usability/simplicity
6. Established a robust program to Manage IT by Metrics
7. Defined clear escalation paths
8. Instigated a culture change: it's OK to have a High/Critical Incident

The Metrics

The Metrics (cont)

How can I do this?
- This is not the first time it's been done!
- Utilize ITIL where it makes sense; if it doesn't, don't use it!
- Training is the big ticket: establish buy-in and value, and you then have accountability
- Concentrate on 3-4 key measures or metrics and focus on them, driving the increase in incidents being recorded; it's OK to have a High/Critical Incident
- Be a wingman
- Establish accountability and manage it
- Celebrate the success, learn from the errors, don't criticize if someone gets it wrong
- Distribute guides, wallet cards, desk reminders and screensavers to enforce the message and the how
- Don't hesitate, escalate

Escalation Criteria

Incident Timeline

Start the Process Early
- Don't wait for services or CIs to break in production before figuring out what you need to know or do with them
- Start early in the lifecycle, before services are in a Production status
- Establish standards for monitoring, platform, management and process
- Permit To Operate (PTO)
- Encourage escalation
- Empower people to make a decision; if it's the wrong one, review and follow up

Questions