CRITICAL INFRASTRUCTURE: Strategies for Effective NOC Management for Business Success



Similar documents
Report of Independent Auditors

Tel: Fax: ey.com. Report of Independent Auditors

Blackboard Managed Hosting SM Disaster Recovery Planning Document

Ongoing Help Desk Management Plan

END TO END DATA CENTRE SOLUTIONS COMPANY PROFILE

Empowering the Enterprise Through Unified Communications & Managed Services Solutions

Asia Pacific Managed Services Operational Excellence. Anywhere

Evaluating Internal and Outsourced Models for Network Monitoring

Managed Security Services for Data

Network Management and Monitoring Software

Managed Services. Business Intelligence Solutions

CA Service Desk Manager

Pacnet Premium Dedicated Internet Access Dedicated Internet Access for Web-Centric Enterprises

2014 Defense Health Information Technology Symposium Service Desk Strategy for the Defense Health Agency

Global IP Network Overview

MPLS WAN Explorer. Enterprise Network Management Visibility through the MPLS VPN Cloud

Ensuring IP Telephony Performance Through Remote Monitoring

Injazat s Managed Services Portfolio

Enterprise IT is complex. Today, IT infrastructure spans the physical, the virtual and applications, and crosses public, private and hybrid clouds.

Improving. Summary. gathered from. research, and. Burnout of. Whitepaper

Monitoring, Managing, Remediating

Best Practices for Building a Security Operations Center

Best Practices for Eliminating Risk from Routing Changes

R3: Windows Server 2008 Administration. Course Overview. Course Outline. Course Length: 4 Day

State of Wisconsin Initial Incident Triage Service Service Offering Definition (SOD)

Lumos Parallel Network Operations Centers: Protected Network Monitoring

VERISIGN DDoS PROTECTION SERVICES CUSTOMER HANDBOOK

Vistara Lifecycle Management

Managing and Maintaining Windows Server 2008 Servers

A FAULT MANAGEMENT WHITEPAPER

FatPipe Networks

Managed Security Services SLA Document. Response and Resolution Times

Network Performance + Security Monitoring

SERVICE LEVEL AGREEMENT

Enterprise UNIX Services - Systems Support - Extended

BridgeConnex Statement of Work Managed Network Services (MNS) & Network Monitoring Services (NMS)

Security Policy JUNE 1, SalesNOW. Security Policy v v

Network Management System (NMS) FAQ

Network-Wide Change Management Visibility with Route Analytics

TOP 10 WAYS TO ADDRESS PCI DSS COMPLIANCE. ebook Series

Drive Down IT Operations Cost with Multi-Level Automation

ASX CLEAR (FUTURES) OPERATING RULES Guidance Note 10

THE VX 9000: THE WORLD S FIRST SCALABLE, VIRTUALIZED WLAN CONTROLLER BRINGS A NEW LEVEL OF SCALABILITY, COST-EFFICIENCY AND RELIABILITY TO THE WLAN

The remedies set forth in this SLA are your sole and exclusive remedies for any failure of the service.

PC Proactive Solutions Technical View

orrelog Ping Monitor Adapter Software Users Manual

N-Wave Networking Services Service Catalog

Ensuring your DR plan does not Lead to a Disaster

Data Center Colocation - SLA

Real-Time Traffic Engineering Management With Route Analytics

Network Security Policy: Best Practices White Paper

ASX SETTLEMENT OPERATING RULES Guidance Note 10

Kaseya Traverse. Kaseya Product Brief. Predictive SLA Management and Monitoring. Kaseya Traverse. Service Containers and Views

Der Weg, wie die Verantwortung getragen werden kann!

Your Infrastructure. Our Responsibility.

Monitoring Your IP Telephony Network

Troubleshooting and Maintaining Cisco IP Networks Volume 1

Managed Hosting Evaluating Blackboard Managed Hosting Vs. Self Hosting

AIRDEFENSE SOLUTIONS PROTECT YOUR WIRELESS NETWORK AND YOUR CRITICAL DATA SECURITY AND COMPLIANCE

NETWORK AND SERVER MANAGEMENT

Achieving Service Quality and Availability Using Cisco Unified Communications Management Suite

Application Performance Management

Capacity Management Plan

Systems Support - Standard

Get what s right for your business. Technologies.

Cisco Disaster Recovery: Best Practices White Paper

Fully Managed IT Support. Proactive Maintenance. Disaster Recovery. Remote Support. Service Desk. Call Centre. Fully Managed Services Guide July 2007

Regaining MPLS VPN WAN Visibility with Route Analytics. Seeing through the MPLS VPN Cloud

MEDIAROOM. Products Hosting Infrastructure Documentation. Introduction. Hosting Facility Overview

Network-Wide Class of Service (CoS) Management with Route Analytics. Integrated Traffic and Routing Visibility for Effective CoS Delivery

How to Evaluate DDoS Mitigation Providers:

Cisco Change Management: Best Practices White Paper

CHAPTER 4 : CASE STUDY WEB APPLICATION DDOS ATTACK GLOBAL THREAT INTELLIGENCE REPORT 2015 :: COPYRIGHT 2015 NTT INNOVATION INSTITUTE 1 LLC

Network Computing Architects Inc. (NCA) Network Operations Center (NOC) Services

DDL Systems, Inc. ACO MONITOR : Managing your IBM i (or AS/400) using wireless devices. Technical White Paper. April 2014

Command Center Handbook

Project Management and ITIL Transitions

W H I T E P A P E R T h e I m p a c t o f A u t o S u p p o r t: Leveraging Advanced Remote and Automated Support

10 Tips to Better Manage Your Service Team

Oracle Maps Cloud Service Enterprise Hosting and Delivery Policies Effective Date: October 1, 2015 Version 1.0

Making the Business and IT Case for Dedicated Hosting

Intrado Inc. (as successor in interest to Connexon Telecom Inc) Support Policy

Enhancing Network Monitoring with Route Analytics

1. Thwart attacks on your network.

Backup Your Data and Keep it in Canada

Voyageur Internet Inc. 323 Edwin Street Winnipeg MB R3B 0Y7 Main: (204) Fax: (204)

RSA envision. Platform. Real-time Actionable Security Information, Streamlined Incident Handling, Effective Security Measures. RSA Solution Brief

CISM Certified Information Security Manager

GMI CLOUD SERVICES. GMI Business Services To Be Migrated: Deployment, Migration, Security, Management

Transcription:

CRITICAL INFRASTRUCTURE: Strategies for Effective NOC Management for Business Success NTT America AN NTT COMMUNICATIONS WHITE PAPER OCTOBER 2011

Introduction Today, with the rise of streaming content, demands on the Internet have never been higher. Given the ever-evolving nature of the Internet itself, these expectations will only continue to grow exponentially. Content networks and service providers worldwide need a cost-effective way to offer a wide range of services to their customers but are faced with security and performance challenges every day that could cripple their networks. For the cutting-edge businesses that are driving the IP applications of tomorrow, speed, reliability, redundancy and security are crucial. In most cases, these requirements fall to their transit provider and, specifically, their transit provider s Network Operations Center (NOC). If the NOC overlooks a critical issue or bottleneck, it most likely will not be found until a disgruntled customer (or hundreds of customers) calls in reporting significantly impacted network performance which often results in an equally as significant impact on the customer s bottom line. Given all of these expectations, running an effective and efficient NOC is no small order, and expert NOC management needs to be at the top of any business s priority list when considering Tier-1 IP transit providers. The following white paper presents an overview of NOC capabilities, best practices for running a NOC as well as key considerations for choosing a Tier-1 transit provider with a best-in-class NOC. The NOC s Critical Role The NOC, as well as being the first line of defense, is also the last line of defense the backstop. Playing baseball without a backstop (or a backstop with holes in it) is dangerous; Sooner or later the team will get burned. It is critical that a NOC has the right tools and personnel to do the job. A NOC and its uses vary from provider to provider, but most provide a number of services to both customers and non-customers alike. First and foremost, a NOC s job is to monitor an entire network 24x7, enabling its engineers to quickly fix network issues before they have a major impact on customers. Standard daily processes include: Monitoring operations of all backbone links and network devices; Ensuring continuous operation of servers and services; Providing quality support for network users; Troubleshooting of all network and system related problems; and Opening tickets to track and document resolution of problems. NOCs are also typically the front line for customer support on a wide range of issues, including emergencies such as Denial-of-Service (DDoS) attacks, loss of connectivity and security issues. When choosing a Tier-1 IP transit provider, potential customers should consider the provider s NOC strength and capabilities as a crucial factor. Below are three considerations for effective NOC management. NTT America Page 2

1. Technology and Configuration The Network Monitor System (NMS) is the eyes and ears of the NOC and should include ICMP, SNMP Polling and TRAP receiving. For large networks, it should have intelligence of network dependencies and Root Cause Analysis (RCA) to avoid the dreaded sea of red, which occurs when an NMS interface is flooded by critical alerts, making it difficult to find the root cause of an issue. Additionally, it should communicate with internal databases for Event Enrichment purposes, auto-populating alerts with such information as router, IP address, customer POC, etc., so the NOC engineers will not have to replicate data or open multiple screens when addressing a problem. Speed and accuracy in addition to a skilled staff is the key to quick resolution. In addition to the NMS, configuration tools used to generate router configurations can enforce standardization and reduce human error, as well as offer uniform installation, operation and decommissioning of devices. Configuration tools provide an extra benefit of being an automatic backup of all device configurations. The NOC, as well as being the first line of defense, is also the last line of defense the backstop. Playing baseball without a backstop (or a backstop with holes in it) is dangerous; Sooner or later the team will get burned. 2. Staffing In order to ensure that issues are being handled most efficiently, the NOC should be managed by skilled engineers 24x7. This reduces the need for problem escalation and gives customers the assurance that their problems are being handled by a competent support staff. There should be enough overlap between shifts to provide a full handoff of ongoing issues, creating a seamless service for customers. 3. Redundancy and Scale It is equally important to plan for Disaster Recovery (DR). Systems should be redundant with the backup systems residing at a different physical location than the primary infrastructure. In addition to having system level redundancy, NOC site-level redundancy should also be considered. A NOC could be temporarily decommissioned due to an act of nature, fire or some other sort of catastrophe. Having a contingency plan for such an event, however unlikely, should be part of the DR plan. A Study in Best Practices NTT America knows that when its customers need support for their IP transit, security, content delivery or other services, they want it right away. They don t have time to weave their way through IVRUs, phone trees, overseas transfers and layer-upon-layer of call center escalations. Because of this, NTT America has created a best-in-class NOC to improve the performance, reliability and costeffectiveness its network. Support Unlike some providers, support calls for NTT America Global IP Network (GIN) ring directly into its Dallas-based NOC, which is staffed by engineers who can most often quickly triage and resolve issues themselves. Nearly half of the NOC staff have 10+ years experience in the industry. As a result, customers receive a better customer service because they are not calling into a phone queue, a triage center or being handed off to additional support groups during the process. Nothing is more aggravating than having to re-explain your problem because of unqualified people initially answering the phone. NOC engineers are available 24x7x365 to provide direct customer support, in addition to around-the-clock monitoring and network support. Since there are no phone hand offs, lower Mean Time to Resolution (MTTR) of issues are achieved. NTT America Page 3

Maintenance The maintenance policy for an IP transit provider s customer base should be clearly defined, with a set number of days for advance notice of maintenances or a designated maintenance day. There should be a standard time frame for maintenances to take place. An emergency situation, which threatens a customer-impacting event unless steps are taken immediately, is an exception to this policy. NTT America s GIN maintenance policy adds reliability and stability for its customers. Maintenance work is scheduled with five days advance notice to all affected customers. Urgent maintenance procedures may not allow for as much advance notice, but they will still occur within the regularly scheduled maintenance window of 2:00 a.m. to 5:00 a.m. local time. A notification tool ensures all customers are automatically notified based on the device selected for maintenance. This allows customers to prepare in advance and ensures they are not blind sided by work being performed. Before a maintenance begins, procedures are discussed with everyone involved. Maintenance procedures, being the product of years of best practices and past lessons learned, are clearly defined and cover pre-maintenance, the maintenance itself and post-maintenance steps. The procedures have been optimized to ensure the least amount of disruption while at the same time ensure nothing gets missed. Additionally, in-house tools are leveraged to improve oversight. For example, when reloading a router, part of the NOC s procedure is to take a NOCs are also typically the front line for customer support on a wide range of issues, including emergencies such as DDoS attacks, loss of connectivity and security issues. When choosing a Tier-1 IP transit provider, potential customers should consider the provider s NOC strength and capabilities as a crucial factor. digital snapshot of the state of interfaces, protocols and connections. Then, once the maintenance is complete, a snapshot is taken again and automatically compared to make sure everything has been restored. Finally, NTT America network engineers verify traffic levels via a traffic grapher to ensure everything has restored to pre-maintenance status. NTT America Page 4

Redundancy In addition to NTT America GIN s NOC in Dallas, the company also maintains a Regional GIN NOC in Malaysia. The Regional GIN NOC has been modeled after the Dallas GIN NOC, right down to the same desktops. A satellite NOC such as this ensures that if the Dallas GIN NOC were temporarily decommissioned due to natural disaster, the Malaysia NOC could continue to monitor the network until the Dallas site was back up and running. Ticketing The NTT NOC ticketing system is easy and quick to use. Tickets are updated automatically from the team s communication (email) so no information is lost. The system is tied to the NMS, the configuration tool and the customer database so tickets can be automatically populated with key information. This ensures all correspondence and information regarding an incident is collected should follow up or a post mortem review be necessary. Post mortem reviews allow the NOC to confirm procedures for accuracy or improvement, whether more training is necessary, or some other issue needs addressing. Crucial to the ticketing process is ticket handling. One of the worst things that can happen during a critical time is for a ticket to fall through the cracks during a shift change. This is where seasoned NOC management team really makes a difference, ensuring the entire team consistently communicates with each other, seamlessly transitions between shifts and that process and procedures are closely followed. Finally, the ticketing system should provide a way for easy reporting of key metrics, useful for proactive planning in the operations of a large scale network. Configuration The NTT America NOC s configuration tools enforce standardization, providing for uniform installation, operation and decommissioning of devices. Having standard device configurations network-wide ensures that the team can efficiently support a customer regardless if that customer connects in Hong Kong, Amsterdam or Atlanta, Georgia. The tools have built-in safeguards against configuration errors. Besides accuracy, they also provide efficiency. For instance, part of the tool set automatically updates customers Border Gateway Protocol (BGP) prefix lists on a nightly basis no human intervention required. This allows NTT America to provide security to all its customers from rogue BGP announcements while not requiring additional human resources so they can focus on more demanding things while at the same time removing the possibility of human error typing in all those prefixes. Additionally, the tools provide an automatic backup of device configurations and a central repository for quick retrieval of information on a global scale. NTT America Page 5

Summary The Internet has long since become an integral part of how we live and do business. In many cases, a reliable network is mission critical to enterprises today. As such, it s crucial that a company s IP transit service be backed by a strong and technically sophisticated NOC. When considering an IP transit provider, businesses should strongly consider the provider s NOC management capabilities as a key element of the decision-making process to ensure the best service 24x7x365. NTT America NTT America is North America s natural gateway to the Asia-Pacific region, with strong capabilities in the U.S. market. NTT America is the U.S. subsidiary of NTT Communications Corporation, the global data and IP services arm of a Fortune Global 500 telecom leader: Nippon Telegraph & Telephone Corporation (NTT). NTT America provides world-class Enterprise Hosting, managed network, and IP networking services for enterprise customers and service providers worldwide. For additional information on NTT America, visit us on the Web at www.us.ntt.com. U.S. product information regarding the NTT Communications Global IP Network and its award winning IPv6 transit services may be found at http://www.us.ntt.net/, by calling 877-8NTT-NET (868-8638), or by emailing sales@us.ntt.net. NTT America Page 6